

Tableau for the Enterprise: An Overview for IT
Authors: Marc Rueter, Senior Director, Strategic Solutions; Ellie Fields, Senior Director, Product Marketing
May 2012

Introduction

A new generation of business intelligence software puts data into the hands of the people who need it. Slow, rigid systems are no longer good enough for business users or the IT teams that support them. Competitive pressures and new sources of data are creating new requirements. Users are demanding the ability to answer their questions quickly and easily. And that’s a good thing.

Tableau Software was founded on the idea that data analysis and subsequent reports should not be isolated activities but should be integrated into a single visual analysis process, one that lets users quickly see patterns in their data and shift views on the fly to follow their train of thought. Tableau combines data exploration and data visualization in an easy-to-use application that anyone can learn quickly. Anyone comfortable with Excel can create rich, interactive analyses and powerful dashboards and then share them securely across the enterprise. IT teams can manage data and metadata centrally, control permissions and scale up to enterprise-wide deployments.

This overview is designed to answer questions common to IT managers and administrators and help them support Tableau deployments of any size. In this document we cover:

• Tableau Architecture
• Deployment Models
• Security
• Scalability
• System Administration
• Data Strategy
• Metadata Management
• Mobile Deployment

Architecture

Tableau has a highly scalable, n-tier client-server architecture that serves mobile clients, web clients and desktop-installed software. Tableau Desktop is the authoring and publishing tool that is used to create shared views on Tableau Server.

Figure 1: Tableau provides a scalable solution for creation and delivery of web, mobile and desktop analytics.

Tableau Server is an enterprise-class business analytics platform that can scale up to hundreds of thousands of users. It offers powerful mobile and browser-based analytics and works with a company’s existing data strategy and security protocols. Tableau Server:

• Scales up: Is multi-threaded
• Scales out: Is multi-process enabled
• Provides integrated clustering
• Supports High Availability
• Is secure
• Runs on both physical and Virtual Machines

The following diagram shows Tableau Server’s architecture:

Figure 2: Tableau Server architecture supports fast and flexible deployments.

Below, we explain each of the layers, starting with customer data.

Data Layer

One of the fundamental characteristics of Tableau is that it supports your choice of data architecture. Tableau does not require your data to be stored in any single system, proprietary or otherwise. Most organizations have a heterogeneous data environment: data warehouses live alongside databases and Cubes, and flat files like Excel are still very much in use. Tableau can work with all of these simultaneously. You do not need to bring all your data in-memory unless you choose to. If your existing data platforms are fast and scalable, Tableau allows you to directly leverage your investment by utilizing the power of the database to answer questions. If this is not the case, Tableau provides easy options to upgrade your data to be fast and responsive with our fast in-memory Data Engine.

Data Connectors

Tableau includes a number of optimized data connectors for data sources such as Microsoft Excel, SQL Server, Oracle, Teradata, Vertica, Cloudera Hadoop, and many more. There is also a generic ODBC connector for any systems without a native connector. Tableau provides two modes for interacting with data: Live connection or In-memory. Users can switch between a live and in-memory connection as they choose.

Live connection: Tableau’s data connectors leverage your existing data infrastructure by sending dynamic SQL or MDX statements directly to the source database rather than importing all the data. This means that if you’ve invested in a fast, analytics-optimized database like Vertica, you can gain the benefits of that investment by connecting live to your data. This leaves the detail data in the source system and sends only the aggregate results of queries to Tableau. Additionally, this means that Tableau can effectively utilize unlimited amounts of data; in fact, Tableau is the front-end analytics client to many of the largest databases in the world. Tableau has optimized each connector to take advantage of the unique characteristics of each data source.

In-memory: Tableau offers a fast, in-memory Data Engine that is optimized for analytics. You can connect to your data and then, with one click, extract your data to bring it in-memory in Tableau. Tableau’s Data Engine fully utilizes your entire system to achieve fast query response on hundreds of millions of rows of data on commodity hardware. Because the Data Engine can access disk storage as well as RAM and cache memory, it is not limited by the amount of memory on a system. There is no requirement that an entire data set be loaded into memory to achieve its performance goals.

Tableau Server Components

The work of Tableau Server is handled with the following four server processes:

Application Server: Application Server processes (wgserver.exe) handle browsing and permissions for the Tableau Server web and mobile interfaces. When a user opens a view in a client device, that user starts a session on Tableau Server. This means that an Application Server thread starts and checks the permissions for that user and that view.

VizQL Server: Once a view is opened, the client sends a request to the VizQL process (vizqlserver.exe). The VizQL process then sends queries directly to the data source, returning a result set that is rendered as images and presented to the user. Each VizQL Server has its own cache that can be shared across multiple users.

Data Server: The Tableau Data Server lets you centrally manage and store Tableau data sources. It also maintains metadata from Tableau Desktop, such as calculations, definitions, and groups. The published data source can be based on:

• A Tableau Data Engine extract
• A live connection to a relational database (cubes are not supported)

Read more about the Data Server in the section Data Strategy below.

Backgrounder: The backgrounder refreshes scheduled extracts and manages other background tasks.
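Two ideas from this section can be sketched together: a live connection that asks the database for aggregate results only, and a result cache that is shared across users. The following is a minimal illustration and not Tableau's implementation. It uses Python's built-in sqlite3 module as a stand-in for the source database, and the `sales` table, the `fetch_view_data` helper and the user names are all hypothetical.

```python
import sqlite3

# In-memory SQLite database standing in for the customer's source database.
# The table and its columns are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("East", 100.0), ("East", 50.0), ("West", 70.0)],
)

# Cache keyed by query text rather than by user, so users can share results.
query_cache = {}
stats = {"queries_sent": 0}


def fetch_view_data(user, sql):
    """Return rows for a view, querying the database only on a cache miss.

    `user` is deliberately not part of the cache key in this sketch; a real
    system would also have to account for per-user permissions and filters.
    """
    if sql not in query_cache:
        stats["queries_sent"] += 1
        query_cache[sql] = conn.execute(sql).fetchall()
    return query_cache[sql]


# The aggregation runs inside the database; only summary rows come back,
# while the detail rows stay in the source system.
AGGREGATE_SQL = (
    "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
)

first = fetch_view_data("alice", AGGREGATE_SQL)
second = fetch_view_data("bob", AGGREGATE_SQL)
print(first, stats["queries_sent"])
```

The second request is answered from the cache, so the database sees a single query even though two users opened the same view.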

Gateway / Load Balancer

The Gateway is the primary Tableau Server that routes requests to other components. Requests that come in from the client first hit the gateway server and are routed to the appropriate process. If multiple processes are configured for any component, the Gateway will act as a load balancer and distribute the requests to the processes. In a single-server configuration, all processes sit on the Gateway, or primary server. When running in a distributed environment, one physical machine is designated the primary server and the others are designated as worker servers which can run any number of other processes. Tableau Server always uses only one machine as the primary server.

Clients: Web Browsers and Mobile Apps

Tableau Server provides interactive dashboards to users via zero-footprint HTML and JavaScript (AJAX) in a web browser, or natively via a mobile app. No plug-ins or helper applications are required. Tableau Server supports:

• Web browsers: Internet Explorer, Firefox, Chrome and Safari
• Mobile Safari: Touch-optimized views are automatically served on mobile Safari
• iPad app: Native iPad application that provides touch-optimized views and content browsing
• Android app: Native Android application that provides touch-optimized views and content browsing
• Android browser: Touch-optimized views are automatically offered in the Android browser

Clients: Tableau Desktop

Tableau Desktop is the rapid-fire authoring environment used to create and publish views, reports and dashboards to Tableau Server. Using Tableau Desktop, a report author can connect to multiple data sources, explore relationships, create dashboards, modify metadata, and finally publish a completed workbook or data source to Tableau Server. Tableau Desktop can also open any workbooks published on Tableau Server or connect to any published data sources, whether published as an extract or a live connection.

Deployment Models

Tableau can be configured in a variety of ways depending on your data infrastructure, user load and usage profile, device strategy, and goals. Tableau Server can be clustered with any number of machines. Below are six examples of common configurations.

Simple Configuration

For many customers, a single server with the recommended minimum hardware configuration, 8 CPU cores and 32GB of main memory, will provide good performance. This type of configuration is useful for a proof of concept for a larger deployment, or for a departmental server. On a single-server, 8-core deployment of Tableau Server, Tableau recommends running two instances of each major process: Data Server, Application Server, VizQL Server and Backgrounder.

3-Server (24-Core) Cluster

Environments with heavier user load will require clustering additional servers. In a 3-Server configuration, the Gateway or primary machine will host the Backgrounder, Repository and Extract Host, and will send Application Server requests to the worker machines. An administrator can configure the number and type of processes running in the system to support heavy or light extract usage and other characteristics.

Figure 3: Configuration of a simple 3-server cluster.

5-Server (40-Core) Cluster

More worker machines can be added to a cluster to support heavier data usage or higher user load. In a larger cluster using data extracts, you might choose to isolate the repository and extract host on one machine, the backgrounders on another, and let the VizQL and Application servers reside on the other worker machines. Different configurations are available to support different workload profiles.

Figure 4: Configuration of a 5-server cluster optimized for many data extracts.

High Availability Cluster

Tableau’s High Availability feature helps IT organizations meet SLAs and minimize downtime. Tableau’s High Availability solution provides automatic failover capabilities for the repository and data engine components. A minimum of 3 nodes is required in a High Availability environment: a primary node serving as the Gateway and load balancer, and 2 additional nodes hosting the active processes. Gateway failover is a manual step. On failover, Tableau Server sends email alerts to specified administrators.

Virtual Machine or Cloud-Based Deployment

There are no special considerations when running Tableau Server on virtual machines or in a cloud deployment. Virtual machines can be used to virtually limit the number of cores available to Tableau Server or to provide disaster recovery via the virtual machine itself. When running Tableau in the cloud, bear in mind that Tableau Server requires static IP addresses.

Security

As organizations make more data accessible to more people, information security becomes a critical concern. Tableau Server provides comprehensive security solutions that balance the variety of sophisticated requirements with easy implementation and use.

Tableau’s enterprise-level security features manage authentication, permission, data and network security. Together, these capabilities provide a complete security solution that will serve the needs of a broad and diverse user base, whether internal to the organization or external on the Internet. In fact, Tableau Server has passed the stringent security requirements of customers in the financial services, government, and healthcare sectors.

Authentication – Access Security

The first level of security is to establish the user’s identity. This is done to prevent unauthorized access and to personalize each user’s experience. This process is typically referred to as ‘authentication’. It should not be confused with ‘authorization’, which is covered in the section titled ‘Roles and Permissions – Object Security’.

Tableau Server supports three types of authentication plus the option to allow anonymous (un-authenticated) access to the system:

1. Microsoft Active Directory,
2. Local authentication managed by Tableau Server,
3. Trusted authentication that creates a trusted relationship between Tableau Server and one or more web servers.

Tableau provides automatic login timeouts that can be configured by administrators.

Roles and Permissions – Object Security

In Tableau, a role is a set of permissions that is applied to content to manage how users and groups can interact with objects such as projects and published content. Published content such as data sources, workbooks, and views can be managed with permissions for the typical actions of view, create, modify, and delete. Projects control the default permissions for all workbooks and views published to the project. Administrators can create groups such as “Finance Users” to make permission management easier. Projects can be used on a single server where support for multiple external parties (multi-tenants) is needed.
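The relationship between groups, project defaults and the view/create/modify/delete actions described above can be sketched as a small data model. This is an illustrative assumption and not Tableau Server's actual permission model or API: the `Project` and `Workbook` classes, the `can` helper, the "Finance Authors" group and the rule that a user's permissions are the union of their groups' permissions are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set

# The four typical actions named in the text.
ACTIONS = {"view", "create", "modify", "delete"}


@dataclass
class Project:
    name: str
    # Default permissions for content in the project: group -> allowed actions.
    defaults: Dict[str, Set[str]] = field(default_factory=dict)


@dataclass
class Workbook:
    name: str
    project: Project
    # Explicit permissions; None means "inherit the project defaults".
    permissions: Optional[Dict[str, Set[str]]] = None


def allowed_actions(user_groups, workbook):
    """Union of the actions granted to any of the user's groups (assumed rule)."""
    rules = workbook.permissions
    if rules is None:
        rules = workbook.project.defaults
    allowed = set()
    for group in user_groups:
        allowed |= rules.get(group, set())
    return allowed & ACTIONS


def can(user_groups, action, workbook):
    return action in allowed_actions(user_groups, workbook)


finance = Project(
    "Finance",
    defaults={
        "Finance Users": {"view"},
        "Finance Authors": {"view", "create", "modify"},
    },
)
quarterly = Workbook("Quarterly Results", finance)

print(can(["Finance Users"], "view", quarterly))
print(can(["Finance Users"], "modify", quarterly))
```

Managing permissions per group rather than per user is what keeps the rule table short: adding a person to "Finance Users" grants the project defaults without touching any workbook.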

Roles provide a default permission structure to differentiate users. For example, a user may be assigned the role of Interactor for a particular view, but not for all content. And a user with a Viewer role can see a particular view but does not have the ability to change the view. There are over 20 parameterized customizations available to help manage object security. These role-based permissions do not control what data will appear inside of a view.

Data – Data Security

Data security is becoming increasingly important, especially for organizations needing to meet regulatory requirements or who deliver content externally. Tableau offers flexibility in helping organizations meet their data security needs in three different ways: implement the security solely in the database, implement security solely in Tableau, or create a hybrid method where user information in Tableau Server has corresponding data elements in the database.

When a user logs into the Tableau Server, they are not logging into the database. This means that Tableau Server users will need to have credentials to log in to the database in order for the database-level security to be applied for them. These login credentials can be passed using Windows Integrated Security (NT Authentication), embedding the credentials into the view when published, or prompting for specific user credentials.

Tableau also provides a User Filter capability that can enable row-level data security using the username, group, or other attributes of the current user. The filter appends a ‘where’ clause to all queries to restrict the data and can be used with all data sources.

Network – Transmission Security

For many internal deployments, network security is provided by preventing access to the network as a whole. However, even in these cases it is important to securely transmit credentials across the network. For external deployments, transmission security is often critical to protect sensitive data and credentials and to prevent malicious use of Tableau Server.

There are three main network interfaces to Tableau Server, and Tableau pays special attention to the storage and transmission of passwords at all layers and interfaces.

1. The client-to-Tableau Server interface defaults to standard HTTP requests and responses but can be configured for HTTPS (SSL) with customer-supplied security certificates.
2. The Tableau Server-to-database interface uses native drivers whenever possible and uses generic ODBC adapters when native drivers are not available.
3. Secure communication between Tableau Server components is only applicable in distributed deployments and is done using a stringent trust model to ensure each server receives valid requests from other servers in the cluster.

In addition to these network interface security capabilities, Tableau provides additional safeguards. There are a variety of encryption techniques to ensure security from browser to server tier to repository and back, even when SSL is not enabled. Tableau also has many built-in security mechanisms to help prevent spoofing, hijacking, and SQL injection attacks, and actively tests and responds to new threats with monthly updates.

Scalability

Tableau Server is highly scalable, serving the largest enterprises and up to tens of thousands of users. General Motors, Wells Fargo, eBay and Bank of America are some organizations that are using Tableau. Ray White, a large real estate company, uses Tableau to serve reports to 10,000 real estate agents.

Since 2009 Tableau Server has been running at a high scale in Tableau’s own data centers to power Tableau Public, a free service for online visualization of public data. Tableau Public supports over 20 million distinct users, and as of April 2012 was serving up over 800,000 views per week. In fact, Tableau Public hit the record of over 94,000 views in one hour in late 2010.
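To put the Tableau Public figures in perspective, the short calculation below compares the record hour with an average hour. It is only a rough sketch: both source figures are lower bounds ("over"), they come from different dates (late 2010 and April 2012), and the assumption that weekly views spread evenly over 168 hours is ours.

```python
# Figures quoted in the text, treated here as exact for illustration.
WEEKLY_VIEWS = 800_000     # "over 800,000 views per week" (April 2012)
PEAK_HOUR_VIEWS = 94_000   # "over 94,000 views in one hour" (late 2010)

HOURS_PER_WEEK = 7 * 24

# Average hourly load if the weekly views were spread evenly (an assumption).
average_hourly_views = WEEKLY_VIEWS / HOURS_PER_WEEK

# How many times larger the record hour is than that average hour.
peak_to_average = PEAK_HOUR_VIEWS / average_hourly_views

print(f"average views per hour: {average_hourly_views:.0f}")
print(f"peak-to-average ratio: {peak_to_average:.2f}")
```

The gap between an average hour and the record hour is the kind of burst a deployment has to be sized for, which is why peak load rather than average load usually drives capacity planning.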

Figure 5: Tableau Server has scaled up to extremely high loads as the infrastructure for Tableau Public.

Every environment is unique and there are many variables that impact performance. Factors that affect the scalability of a Tableau deployment include:

• Hardware considerations: Server type, disk speed, amount of memory, processor speed, and number of processors. More is better.
• Architecture: Number of servers, architecture design, network speed/traffic, data source type, and location.
• Usage: Workbook complexity, concurrent user activity, and data caching.
• Software configuration: Configuration settings of Tableau Server.
• Data: Data volumes, database type, and database configuration.

We periodically perform scalability tests on Tableau Server. Ask your Tableau Account Manager for the most recent scalability test results.

Environmental Optimizations

A single machine will be the fastest configuration for small user counts, as adding additional machines introduces latency between the machines. The exact load at which a distributed configuration becomes more important will differ based on the load type, network speed, hardware, database performance, and other variables.

For optimal performance, it is best to set the number of application and VizQL processes to be equal.

Best Practice Optimizations

In addition to an environment that is optimally designed, there are best practices that can be used to greatly improve performance and reduce the average response time.

Use extracts – Extracts store data locally so that users can access the data without making direct requests to the database. They can be easily filtered when users don’t need the detail, significantly improving response time. If extract databases become too large, they can be offloaded to a local extract engine on a separate machine.

Schedule updates during off-peak times – Often, data sources are being updated in real-time but users only need data daily or weekly. Scheduling extracts for off-peak hours can reduce peak-time load on both the database and Tableau Server.

Avoid ‘expensive’ operations during peak times – Logging in and publishing, especially of large files, are two very resource-consuming tasks. While it may be difficult to influence login behavior, it’s often easy to influence publishing behavior. Ask users to publish during off-peak hours, avoiding busy times like Monday mornings.

Cache views – As multiple users begin to access Tableau Server, the response time will initially increase due to the contention for shared resources. However, as each request comes into the system, the view can be cached and rendered much more quickly when the next user requests the same view.

System Administration

The process of system governance and the role of the Tableau Server administrator are much like that of any other application. However, in Tableau, administrators can be assigned to either System or Content Administrator roles. System Administrators have complete access to all software and functions within Tableau Server. They can then assign select users to the role of Content Administrator who will manage users,

Multi-dimensional
