PRESENTED BY
Alan Eldridge
Sales Consultant
Tableau Software Australia
What we will cover…
• Architecture
• Scalability & Availability
• Caching
©2011 Tableau Software Inc. All rights reserved.
Overview
[Diagram: Tableau Desktop (self-service visual data analysis) and Tableau Server (rapid-fire business intelligence) each connect live to data or use an extract. Tableau Server components shown: Repository, Presentation, Data Server, Cache, Security, Management/Automation. Information consumers use Tableau Reader; web and mobile users are Tableau Server interactors.]
Architecture
Terminology
• Process – an instance of a computer program that is being
executed. It has its own set of resources (e.g. memory) that
are not shared with other processes.
• Thread – a process may run multiple threads to perform
instructions in parallel. Threads within a process share
resources (e.g. memory).
• Server – a program running to serve the requests of other
programs. The term is also used to refer to a physical
computer dedicated to running one or more services.
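The difference between threads and processes can be illustrated with a minimal Python sketch. This is only an illustration and has nothing to do with Tableau Server's own code; the names `shared`, `bump`, and `child_count` are made up for the example.

```python
import subprocess
import sys
import threading

# Threads: all four workers update the same dictionary in this process.
shared = {"count": 0}
lock = threading.Lock()


def bump() -> None:
    with lock:  # shared memory needs coordination between threads
        shared["count"] += 1


threads = [threading.Thread(target=bump) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Process: a child interpreter has its own memory and cannot touch `shared`.
# It builds its own counter and can only report back through a channel
# such as standard output.
child_code = "shared = {'count': 0}; shared['count'] += 1; print(shared['count'])"
child = subprocess.run(
    [sys.executable, "-c", child_code],
    capture_output=True,
    text=True,
    check=True,
)
child_count = int(child.stdout)

print("threads saw:", shared["count"], "child process saw:", child_count)
```

The four threads all increment the one dictionary, while the child process only ever sees its own copy.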
Tableau Server
[Diagram: Tableau Server sits between data (relational, OLAP, files) and clients (browser/mobile, Tableau Desktop, command line tools).]
Gateway/Load Balancer
• Receives incoming client requests and directs them to the
appropriate service for action
• Acts as a load balancer, routing traffic round-robin to
service instances
• Returns HTML responses to the client
• Single-process; multi-threaded
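Round-robin routing of the kind the gateway performs can be sketched in a few lines of Python. This is an illustrative sketch, not Tableau's implementation; the class name `Gateway`, the pool names, and the instance labels are hypothetical.

```python
from itertools import cycle


class Gateway:
    """Directs each request to the next instance of the requested service."""

    def __init__(self, pools: dict[str, list[str]]):
        # One endless round-robin iterator per service pool.
        self._pools = {name: cycle(instances) for name, instances in pools.items()}

    def route(self, service: str) -> str:
        return next(self._pools[service])


gateway = Gateway({"vizql": ["vizql-0", "vizql-1"], "app": ["app-0"]})
routed = [gateway.route(s) for s in ["vizql", "vizql", "app", "vizql", "app"]]
print(routed)
```

Each service pool is cycled independently, so consecutive requests for the same service land on different instances.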
App Server
• Processes logins, content searches, user/group/permission
management, and other tasks not related to visualizing data
• Works in conjunction with data stored in the Repository
• Multi-server; multi-process; multi-threaded
Repository
• Stores Tableau Server metadata: users, group assignments,
permissions, projects, etc.
• Also stores flat files used as data sources
• Responds to queries from other services when they need
metadata
• Has a SQL interface so external applications can connect
(read-only)
Active Directory
• If used, verifies authentication in conjunction with the App
Server and Repository
VizQL Server
• Provides the same functionality as Tableau Desktop, processing
requests related to data visualisation
• Includes built-in caching (more on this later…)
• Multi-server; multi-process; multi-threaded
Data Source Drivers
• Native drivers (32-bit) need to be installed for each data
source
Data Extract Host
• Invoked when a visualisation including a data extract is
published
• Stores and processes data extracts
• Multi-threaded; 64-bit
Backgrounder
• Controls tasks that ensure Tableau Server is running smoothly
and efficiently
• When the Data Extract Host is used, also handles scheduled
data refreshes
• Multi-server; multi-process
Data Server
• Invoked when a data source is published via Tableau Desktop
• Serves as a proxy between requests for data and individual
data sources
• Enables centralized metadata management for data sources and
an additional layer of access control
• Allows centralized driver deployment
• Allows multiple workbooks to use the same data extract
• Multi-server; multi-process; multi-threaded; 32-bit
[Diagram: the complete Tableau Server architecture. Clients (browser/mobile, Tableau Desktop, command line tools); Gateway/Load Balancer; App Server, VizQL Server, Data Server, Backgrounder, Data Extract Host, Repository; Data Source Drivers; data (relational, OLAP, files); Active Directory.]
Server Monitoring
What you see running…
Scalability & Availability
Terminology
• Scalability – scalability is about supporting multiple
simultaneous actions, not about making a single action faster.
• Availability – the ability of a solution to be resistant to
component failures. Increasing the availability of a solution will
increase the cost.
Terminology
• Scale Up – adding more resources (CPU, RAM, etc) to a
single server.
• Scale Out – adding more resources (CPU, RAM, etc) by
adding more servers in a “cluster”.
• Multi-Process – adding more throughput by running multiple
instances of a process or service. These can be on a single
server or can be distributed across multiple servers.
• Multi-Threaded – within a process, being able to perform
multiple tasks simultaneously across multiple CPUs.
Terminology
• Single Point of Failure – within a solution, a component whose
failure will cause the solution as a whole to fail.
• Active/Active – when all instances of a multi-process service
process requests.
• Active/Passive – when only some instances of a multi-process
service process requests and the other instances are only
activated in the event of a component failure.
Server Scalability
Service             Multi-Process  Multi-Threaded  High Availability
VizQL Server        Yes            Yes             Active/Active
Data Server         Yes            Yes             Active/Active
Application Server  Yes            Yes             Active/Active
Backgrounder        Yes            No              Active/Active
Data Extract Host   No             Yes             Active/Passive
Repository          No             No              Active/Passive
Gateway             No             No              Manual Failover
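For planning purposes, the table above can be held as data and queried. The sketch below is only an illustration; the `Service` type and the variable names are invented here, and the values are copied from the table.

```python
from typing import NamedTuple


class Service(NamedTuple):
    name: str
    multi_process: bool
    multi_threaded: bool
    high_availability: str


# Values transcribed from the Server Scalability table.
SERVICES = [
    Service("VizQL Server", True, True, "Active/Active"),
    Service("Data Server", True, True, "Active/Active"),
    Service("Application Server", True, True, "Active/Active"),
    Service("Backgrounder", True, False, "Active/Active"),
    Service("Data Extract Host", False, True, "Active/Passive"),
    Service("Repository", False, False, "Active/Passive"),
    Service("Gateway", False, False, "Manual Failover"),
]

# Services that gain throughput when more instances are run.
scale_by_adding_instances = [s.name for s in SERVICES if s.multi_process]

# Services that rely on a standby or a manual step to survive a failure.
needs_standby = [s.name for s in SERVICES if s.high_availability != "Active/Active"]

print(scale_by_adding_instances)
print(needs_standby)
```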
Server Scalability
[Diagram: a single Primary Node running the Gateway, the Web Server, 2 instances each of the Application Server, VizQL Server, Data Server, and Backgrounder, the active Extract Host, and the active Repository.]
Starting with a single server – everything is installed on one machine…
Server Scalability
[Diagram: a single Primary Node running the Gateway, the Web Server, the active Extract Host, and the active Repository, with the instance counts of the Application Server, VizQL Server, Data Server, and Backgrounder increased (↑).]
Scale up – add more resources and run more service instances if required.
Server Scalability
[Diagram: a Primary Node and a Worker Node, each running the Web Server and 2 instances each of the Application Server, VizQL Server, Data Server, and Backgrounder. The Primary Node also runs the Gateway, the active Extract Host, and the active Repository.]
Scale out – add a worker node running some or all of the services.
Server Scalability
[Diagram: a Primary Node and a Worker Node, each running the Web Server and 2 instances each of the Application Server, VizQL Server, Data Server, and Backgrounder. The Primary Node also runs the Gateway, the active Extract Host, and the active Repository; the Worker Node holds the standby Extract Host and the standby Repository.]
Scale out – add a worker node running some or all of the services.
Server Scalability
[Diagram: a Primary Node (with the Gateway) and a Worker Node, each running the Web Server and 2 instances each of the Application Server, VizQL Server, Data Server, and Backgrounder. The Extract Host and Repository on the Primary Node have failed; the Worker Node holds the standby Extract Host and Repository.]
If the extract host or the repository fail…
Server Scalability
[Diagram: a Primary Node (with the Gateway) and a Worker Node, each running the Web Server and 2 instances each of the Application Server, VizQL Server, Data Server, and Backgrounder. The Extract Host and Repository on the Primary Node are failed; those on the Worker Node are now active.]
… the standby will take over as the active.
Server Scalability
[Diagram: a Primary Node (with the Gateway) and a Worker Node, each running the Web Server and 2 instances each of the Application Server, VizQL Server, Data Server, and Backgrounder. The Extract Host and Repository on the Primary Node are now standby; those on the Worker Node remain active.]
When the failure is repaired, it starts up in standby mode.
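The active/passive behaviour walked through on the last few slides (fail, standby takes over, repaired component returns as standby) can be sketched as a small state machine. This is a minimal sketch under stated assumptions: the class `ActivePassivePair` and the node labels are hypothetical, and nothing here reflects how Tableau Server actually detects failures.

```python
class ActivePassivePair:
    """Tracks which of two instances is active, standby, or failed."""

    def __init__(self, first: str, second: str):
        self.state = {first: "active", second: "standby"}

    def active(self) -> str:
        return next(node for node, s in self.state.items() if s == "active")

    def fail(self, node: str) -> None:
        was_active = self.state[node] == "active"
        self.state[node] = "failed"
        if was_active:
            # Promote the standby, if there is one, to active.
            for other, s in self.state.items():
                if s == "standby":
                    self.state[other] = "active"
                    break

    def repair(self, node: str) -> None:
        # A repaired instance rejoins as standby; it does not take back control.
        if self.state[node] == "failed":
            self.state[node] = "standby"


repository = ActivePassivePair("primary", "worker")
history = [dict(repository.state)]
repository.fail("primary")
history.append(dict(repository.state))
repository.repair("primary")
history.append(dict(repository.state))

for step in history:
    print(step)
```

After the repair the roles stay swapped: the worker's instance remains active and the primary's instance waits as standby.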
Server Scalability
[Diagram: a Primary Node (with the Gateway) and a Worker Node, each running the Web Server and 2 instances each of the Application Server, VizQL Server, and Data Server; the Primary Node holds the standby Extract Host and Repository and the Worker Node the active ones. A second Worker Node runs only the Web Server and additional Backgrounder instances (↑).]
Worker nodes don’t need all the services – e.g. handling lots of extract refreshes…
Server Scalability
[Diagram: the Primary Node runs only the Web Server and the Gateway. Two Worker Nodes each run the Web Server and 2 instances each of the Application Server, VizQL Server, Data Server, and Backgrounder; one holds the standby Extract Host and Repository, the other the active ones.]
Separate the gateway and now our architecture starts to have HA properties…
Server Scalability
[Diagram: the Primary Node runs the Web Server and the Gateway. One Worker Node is down; the surviving Worker Node runs the Web Server, 2 instances each of the Application Server, VizQL Server, Data Server, and Backgrounder, the active Extract Host, and the active Repository.]
We can survive a total server failure of any worker node.
Server Scalability
[Diagram: the Primary Node runs the Web Server and the active Gateway. Two Worker Nodes each run the Web Server and 2 instances each of the Application Server, VizQL Server, Data Server, and Backgrounder; one holds the standby Extract Host and Repository, the other the active ones. An additional server runs the Web Server and a Failover Gateway.]
For full HA, we require a failover gateway server.
Server Scalability
[Diagram: the Primary Node (active Gateway) is down. Two Worker Nodes each run the Web Server and 2 instances each of the Application Server, VizQL Server, Data Server, and Backgrounder; one holds the standby Extract Host and Repository, the other the active ones. The additional server runs the Web Server and the Failover Gateway.]
In the event of a gateway failure…
Server Scalability
[Diagram: the original Primary Node is down. The former failover server is now the Primary Node, running the Web Server and the active Gateway. Two Worker Nodes each run the Web Server and 2 instances each of the Application Server, VizQL Server, Data Server, and Backgrounder; one holds the standby Extract Host and Repository, the other the active ones.]
… we activate the failover gateway, but this is not automatic.
Caching
Terminology
• Performance – the speed with which a single process can be
completed, assuming no contention for resources.
• Caching – a transparent store of values that have been
calculated so that future requests for the same data can be
serviced more quickly.
Caching
[Diagram: Dashboard → Gateway (1. Image Tile Cache) → VizQL Server → Data Source.]
Image Tile Cache
• Dashboards are delivered to the client as a series of image
“tiles” – these are assembled to show the complete dashboard.
• We can use this cache if:
  • Same dashboard (duh!)
  • No per-user security
  • Same dashboard size
• Handled by the gateway service, one per VizQL worker node
Image Tile Cache
• There is one single, simple step you can take to maximise the
reuse of image tiles…
• Fixed size dashboards!
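Why fixed-size dashboards help can be seen from a rough sketch of how a tile cache key might be built. The real key format is not described in these slides; `TileRequest`, `tile_cache_key`, and `distinct_renders` are hypothetical names, and the key simply encodes the three reuse conditions (same dashboard, no per-user security, same size).

```python
from typing import NamedTuple, Optional


class TileRequest(NamedTuple):
    dashboard: str
    width: int
    height: int
    per_user_security: bool


def tile_cache_key(req: TileRequest) -> Optional[tuple]:
    # Per-user security means the tiles cannot be shared, so there is no key.
    if req.per_user_security:
        return None
    return (req.dashboard, req.width, req.height)


def distinct_renders(requests: list[TileRequest]) -> int:
    """Counts how many requests would have to be rendered rather than reused."""
    seen = set()
    renders = 0
    for req in requests:
        key = tile_cache_key(req)
        if key is None or key not in seen:
            renders += 1
        if key is not None:
            seen.add(key)
    return renders


browser_sizes = [(1024, 768), (1280, 800), (1920, 1080), (1280, 800)]
# Automatic sizing: the dashboard size follows each user's browser window.
automatic = [TileRequest("sales", w, h, False) for w, h in browser_sizes]
# Fixed sizing: every user requests the same dimensions.
fixed = [TileRequest("sales", 1000, 800, False) for _ in browser_sizes]

print(distinct_renders(automatic), distinct_renders(fixed))
```

With automatic sizing only the two users who happen to share a window size reuse tiles; with a fixed size every request after the first is a cache hit.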
If We Miss the Image Tile Cache…
[Diagram: Dashboard → Gateway (1. Tile Cache) → VizQL Server (2. Model Cache) → Data Source.]
Model Cache
• When re-rendering the dashboard we check to see if
computations have already been done:
  • calculated fields, table calculations, reference lines,
  trend lines, etc.
• We can use this cache if:
  • No change to data
  • No change to calcs
• Model cache is RAM-based, per VizQL Server instance
If We Miss the Model Cache…
[Diagram: Dashboard → Gateway (1. Tile Cache) → VizQL Server (2. Model Cache, 3. Query Result Cache) → Data Source.]
Query Result Cache
• The query result cache holds the results from queries we have
sent to data sources
• We can use this cache if:
  • Dimensions and measures are the same
  • Filters are the same – this means no per-user security
  • Cache has not expired and is not explicitly bypassed
• Query result cache is RAM-based, per VizQL Server instance
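The reuse conditions for the query result cache can be sketched as a cache keyed on dimensions, measures, and filters, with an expiry time and a bypass flag. This is an assumption-laden illustration, not the VizQL Server implementation: `QueryResultCache`, its `ttl_seconds` parameter, and the sample query are all hypothetical.

```python
import time


class QueryResultCache:
    """Results keyed by what was asked; entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}

    @staticmethod
    def key(dimensions, measures, filters):
        # Same dimensions, measures, and filters produce the same key.
        return (frozenset(dimensions), frozenset(measures), frozenset(filters.items()))

    def put(self, key, result):
        self._entries[key] = (self._clock(), result)

    def get(self, key, bypass=False):
        if bypass:
            return None  # caller explicitly asked to skip the cache
        entry = self._entries.get(key)
        if entry is None:
            return None
        stored_at, result = entry
        if self._clock() - stored_at >= self._ttl:
            del self._entries[key]  # expired
            return None
        return result


now = [0.0]  # fake clock so the example is deterministic
cache = QueryResultCache(ttl_seconds=60, clock=lambda: now[0])
shared_key = cache.key(["region"], ["sales"], {"year": "2011"})
cache.put(shared_key, [("East", 230.0), ("West", 50.0)])

same_query = cache.get(shared_key)
# A per-user security filter changes the filter set, so the key differs.
per_user_key = cache.key(["region"], ["sales"], {"year": "2011", "user": "alice"})
per_user = cache.get(per_user_key)
bypassed = cache.get(shared_key, bypass=True)
now[0] = 61.0
expired = cache.get(shared_key)

print(same_query, per_user, bypassed, expired)
```

Only the first lookup is a hit; the per-user, bypassed, and expired lookups would all go back to the data source.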
Managing Caching
Managing Caching
• Model cache
• vizqlserver.modelcachesize:30
• The number of models to cache, where there is one model per
viz instance in a workbook
• Query result cache
• vizqlserver.querycachesize:64
• The size in megabytes of query results to cache
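The two settings bound their caches in different units: a count of models versus megabytes of results. The sketch below shows one way such limits could behave. It assumes least-recently-used eviction, which these slides do not state, and `BoundedCache` is a hypothetical class, not a Tableau API.

```python
from collections import OrderedDict


class BoundedCache:
    """Evicts the least recently used entries once the limit is exceeded.

    ASSUMPTION: LRU eviction is used here for illustration only.
    """

    def __init__(self, limit, cost=lambda value: 1):
        self.limit = limit
        self.used = 0
        self._cost = cost
        self._entries = OrderedDict()

    def get(self, key):
        if key not in self._entries:
            return None
        self._entries.move_to_end(key)  # mark as recently used
        return self._entries[key]

    def put(self, key, value):
        if key in self._entries:
            self.used -= self._cost(self._entries.pop(key))
        self._entries[key] = value
        self.used += self._cost(value)
        while self.used > self.limit and self._entries:
            _, evicted = self._entries.popitem(last=False)
            self.used -= self._cost(evicted)

    def __len__(self):
        return len(self._entries)


# Model cache: limit is a number of models (cf. vizqlserver.modelcachesize:30).
model_cache = BoundedCache(limit=30)
for i in range(31):
    model_cache.put(f"viz-{i}", object())

# Query result cache: limit is megabytes (cf. vizqlserver.querycachesize:64).
# Each value is (size_in_mb, rows); the sizes are made up for the example.
query_cache = BoundedCache(limit=64, cost=lambda value: value[0])
query_cache.put("q1", (40, "rows for q1"))
query_cache.put("q2", (40, "rows for q2"))

print(len(model_cache), query_cache.get("q1"), query_cache.get("q2"))
```

The 31st model pushes out the oldest one, and the second 40 MB result does not fit alongside the first within 64 MB.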
Managing Caching
• Distributing components is a scalability strategy, not a
performance strategy
• Caching is per-process
• Distribution can hurt performance due to missing the cache
©2011 Tableau Software Inc. All rights reserved.
One Last Layer…
[Diagram: Dashboard → Gateway (1. Tile Cache) → VizQL Server (2. Model Cache, 3. Query Result Cache) → Data Extract → Data Source.]
Extracts
• Can also be used as a form of cache to improve user response
times
• Using aggregated extracts can improve performance even further
(at the sacrifice of granularity)
• Can be scheduled to refresh fully or incrementally
• Using the Data Server they can be shared across multiple
workbooks
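The trade-off in an aggregated extract can be shown with a small sketch: rolling daily rows up to month level leaves fewer rows to scan but loses the daily detail. The data and the function `aggregate_by_month` are invented for illustration and do not reflect the extract file format.

```python
from collections import defaultdict

# Hypothetical detail rows: (date, region, sales).
detail = [
    ("2011-01-03", "East", 120.0),
    ("2011-01-17", "East", 80.0),
    ("2011-01-09", "West", 50.0),
    ("2011-02-02", "East", 30.0),
]


def aggregate_by_month(rows):
    """Sums sales per (month, region); the individual days are discarded."""
    totals = defaultdict(float)
    for date, region, sales in rows:
        totals[(date[:7], region)] += sales
    return dict(totals)


extract = aggregate_by_month(detail)
print(len(detail), "detail rows ->", len(extract), "aggregated rows")
print(extract)
```

A dashboard that only needs monthly totals can be answered from the smaller result, but a question about a specific day can no longer be answered from it.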
Summary
• Tableau Server provides a flexible, scalable architecture that
(in general) can look after itself.
• Growing a Tableau Server installation to support more users
and data is simple and does not require deep technical skills.
• By understanding how Tableau Server’s caching mechanisms
work, we can design our dashboards for optimal performance.
• Unless you have a reason not to, make all dashboards fixed
size.
Q&A