Friday, November 29, 2013

Tableau Server 8 Certification Achieved

I'm very happy to have had the opportunity to take Tableau's certification exam for Tableau Server 8, and even happier to have passed it.

It's now my privilege to be able to use the Tableau Server 8 Certified logo, so here it is:

The exam is a good one. It's tough, comprehensive, and really exercises one's technical knowledge and ability to think about how Tableau Server works. It's also long, at six hours, which is enough to cause a fair bit of anticipatory stress, and it's worth taking the full time allotment.

I want to extend my thanks to everyone I had the pleasure of working with in achieving this: Rebecca Nelson, John Cicero, Sarah Pierre-Louis, and Courtney Jacobsen. I was made welcome and comfortable, and really enjoyed meeting all of you.

Saturday, November 16, 2013

Tableau Server Performance Synopsis

This post simply compiles the high(ish)-level topics in the Tableau Server performance online help into a single place. This makes it easier to get a coherent overview of the range of considerations involved than by paging back and forth through a bunch of web pages, or through the downloadable Server admin PDF.

The top level topics are links to their original source help pages, as are some of the subtopics.

General Performance Guidelines

Hardware and Software

Use a 64-bit operating system

Add more cores and memory

Configuration

Schedule refreshes for off-peak hours

Look at caching

Consider changing two session memory settings

VizQL session timeout limit

VizQL clear session

Assess your process configuration

When to Add Workers & Reconfigure

More than 100 concurrent users

Extracts

Heavy use of extracts

Frequent extract refreshes

Troubleshooting performance

Downtime potential

Improve Server Performance

What’s your goal?

Optimizing for Extracts

Optimizing for Users and Viewing

How Many Processes to Run

VizQL Server Process

Minimum number per deployment:

Maximum number per machine

Background Process

Data Engine and Repository Processes

Where to Configure Processes

Optimizing the Extracts and Workbooks

Assessing View Responsiveness

Examples

One-Machine Example: Extracts

Two-Machine Example: Extracts

Two-Machine Example: Viewing

Three-Machine Example: Extracts & Viewing

About Client-Side Rendering

The Tableau Server Processes

application server

VizQL Server

data server

repository

data engine

background

Create a Performance Recording

Use performance workbooks to analyze and troubleshoot performance issues pertaining to different events that are known to affect performance, including:

Query execution

Geocoding

Connections to data sources

Layout computations

Extract generation

Blending data

Server blending (Tableau Server only)

Create a Performance Recording in Tableau Server

Interpret a Performance Recording

Timeline

Events

Computing layouts.

Connecting to data source.

Executing query.

Generating extract.

Geocoding.

Blending data.

Server rendering.

You can speed up server rendering by running additional VizQL Server processes on additional machines.

Query.

Resolving Tableau Server Permissions

Do you find puzzling out Tableau Server permissions confusing and mysterious? You're not alone.

I put this post together to help me figure out the process of how Tableau Server determines a User's permissions for a particular Workbook, Dashboard, or Worksheet. To my mind, the Tableau documentation is a bit twisty and hard to trace. It also doesn't surface the critical part that it's not always the view's permissions that are used, but those of the view's Workbook.

It's a work in progress. I plan on improving it as I work through the factors, interactions, dependencies, etc.

Factors affecting Permissions

License Level
see reference: Tableau Server Admin guide online

Unlicensed
users cannot connect per the TS doc
? should it therefore be impossible to assign permissions to an unlicensed user?
Viewers
cannot be assigned permissions other than 'View', 'Add Comments', and 'View comments'
Interactors
can be assigned any permissions
Guests
'users without an account on the server see and interact with an embedded view. When enabled, the user can load a webpage containing an embedded visualization without logging in. This option is only available with a core-based server.'
from the TS Admin Guide | About Enable Guest & Enable Automatic Login: "Enable Guest is a setting on the Maintenance page that can be selected if you have a core-based server license... users click a link and they go directly to the view with no login... no authentication is performed. The Tableau Server Guest User account is used to access the server, but as long as Enable Guest is selected, anyone can use it. Administrators often limit the capabilities of the Guest User account. For example, they might edit the permissions of certain views so that Guest User is denied access."

User Rights
see reference: Tableau Server Admin guide online

There are two distinct but inter-related 'things' Tableau lumps together as User Rights.

Publish
if designated as a Publisher, the user can: "connect to Tableau Server from Tableau Desktop in order to publish and download workbooks and data sources."

There are two configuration options for Publish:

Allow
provides the User the ability noted above, although the online doc doesn't explicitly enumerate this.
NOTE: as of TSv8.1b7 it's possible to assign "Allow" for an unlicensed Site user.
AND: this unlicensed user with Publish rights CAN successfully publish to Tableau Server.
Deny
similarly, although not explicitly in the doc, presumably when Publish is denied the User cannot publish or download Workbooks and Data Sources.
? If Publish is set to 'Deny', can the User be assigned any of the download permissions on individual objects, and if so, what would be the result?

Admin
Prerequisites: in order for a user to be an admin, s/he must be an Interactor with Publish granted.

Site Admin
"Can manage groups, projects, workbooks, and data connections. By default, site administrators can also add users and assign user rights and license levels but a system administrator can disable that (see Editing Sites)"
Server Admin
"all the rights of a site administrator, plus they can license unlicensed users, control whether site administrators can add users, create additional system administrators, and they can administer the server itself. This includes handling maintenance, settings, schedules, and the search index"
None
the user is not an admin.

There is an interesting asymmetry in the mechanisms of assigning User Rights. In my testing with Tableau Server v8.1 beta 7, I added a new User and tried to make it an Interactor, but the Interactor license level wasn't granted because the number of licensed users had been reached. In that situation:

when the "Publish" User Right is checked and the user subsequently becomes licensed as an Interactor, the Publish right is preserved;

however, when "Site Administrator" is checked, subsequently licensing the user as an Interactor doesn't preserve "Site Administrator" the way "Publish" was preserved.

User Identity
see references in the Tableau Server Admin Guide (online):
Set Permissions for a Project
Set Permissions for Workbooks and Views
Set Permissions for a Data Source

Things get really complicated with the introduction of User Identity. There are three distinct facets to a User's identity vis-a-vis Permissions:

The Individual
The User, identified by their User id. Permissions are always resolved to the User; how they get resolved is the question.
Roles
are bundles of permissions that can be associated with Users and Groups for specific Tableau Server assets (which raises the question: what's a Tableau Server asset?)
Groups
Users can have membership in zero, one, or more Groups. Asset Permissions may be individually associated to Groups, or Roles may associate bundles of Permissions.

One of the big complicating factors in determining whether a given permission is granted or denied to a particular User for a particular Tableau Server asset is the difference between the structural relationships and the permission-resolution relationships among Users, Roles, and Groups.

Users may belong to one or more Groups at the Site level.

Users and Groups may be associated with zero or one Role for a particular asset.

When Tableau Server determines individual Permissions' status for a user for a particular asset it assesses, in order, the Permissions' status for:
— the User;
— any Role the User is associated with for that asset;
— the permission status for that asset for any Groups to which the User belongs.

How Permissions Are Set – The Tableau Server Admin Guide Flowchart
redrawn for consistent Yes/No sequence and highlighting of Roles and Groups influence.

[Flowchart: How Permissions Are Set]

When resolving the permissions in place for a Dashboard or Worksheet (view), the object used to evaluate the permissions is either the view or the Workbook the view is contained in.
If the Workbook was published to show its sheets as tabs, the Workbook's permissions are used.
If the Workbook was not published to show its sheets as tabs, the view's permissions are used.

Once the source of Permissions (Workbook or View) has been determined, this process resolves whether or not the User is granted the permission:
User Denied? (Yes: Denied) -> User in Allowed Role? -> User in Denied Group? (Yes: Denied) -> User in Allowed Group? -> otherwise Denied

Note: this chart does not represent the situation where a User has been explicitly granted the "Allow" permission.

Source: http://onlinehelp.tableausoftware.com/v8.0/server/en-us/help.htm#license_permissions_backgrnd.htm
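The two-stage process in the chart can be sketched in Ruby. This is only a sketch under stated assumptions: the Asset struct, its field names, and both method names are hypothetical (none of them are Tableau APIs), and it assumes that a "Yes" at either Allowed check grants the permission, which the chart text does not spell out.

```ruby
# Hypothetical model of one asset's settings for ONE permission.
# None of these names come from Tableau; they exist only for illustration.
Asset = Struct.new(:name, :denied_users, :allowed_role_users,
                   :denied_groups, :allowed_groups, keyword_init: true)

# Stage 1: pick the object whose permissions are evaluated.
def permission_source(workbook, view, sheets_as_tabs)
  sheets_as_tabs ? workbook : view
end

# Stage 2: walk the checks in chart order; the first match decides.
def granted?(asset, user, user_groups)
  return false if asset.denied_users.include?(user)
  return true  if asset.allowed_role_users.include?(user)          # assumed outcome
  return false unless (asset.denied_groups & user_groups).empty?
  return true  unless (asset.allowed_groups & user_groups).empty?  # assumed outcome
  false # nothing matched: denied
end

workbook = Asset.new(name: 'Sales workbook', denied_users: [],
                     allowed_role_users: ['ann'],
                     denied_groups: ['Contractors'], allowed_groups: ['Finance'])
view = Asset.new(name: 'Sales view', denied_users: ['ann'],
                 allowed_role_users: [], denied_groups: [], allowed_groups: [])

# Published with sheets as tabs, so the Workbook's settings are used.
source = permission_source(workbook, view, true)
puts "#{source.name}: #{granted?(source, 'ann', ['Finance'])}"
```

Modelling each check as an early return keeps the order of evaluation visible, which is the part that is easy to lose track of when reading the documentation.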

The permissions chart above is in SVG and was created using Inkscape.

Friday, November 15, 2013

Tableau Server Processes — Tableauing Tableau

Here's a Tableau Public workbook with a couple of dashboards presenting views of the Tableau Online help information about the Tableau Server processes.

I found that by listing the different performance characteristics and cross-referencing them to the processes I was able to get a new perspective on what they do, and in particular what their limitations and potential overload scenarios are. It's a big help to look down the list and see what might come up, and then see what process might be involved.

tabadmin set commands — Tableauing Tableau

The Tableau Public workbook below contains dashboards I put together to help me organize and interpret the various options available for the Tableau Server tabadmin set command.

Although the set command information is available in Tableau's online help here, it's in a static HTML table, and I find it a lot more useful to have it in data so I can use Tableau to organize, filter, and reshuffle it in various ways when building my mental map of how the options relate to one another.

You can use the dashboards from Tableau Public or download the workbook. Or if you're interested in how I got the online help content into data, I've put that information below.

Rendering the online help table as data.
Since the HTML table is no-frills you can simply copy and paste it into Tableau. This works perfectly well since there are no HTML tags inside the table's cells to confuse Tableau. But it wasn't quite what I was after.

There are links in the options' descriptions that the copy/paste into Tableau method doesn't capture. I thought it helpful to have these so I whipped up a little Ruby script to parse the HTML and extract everything into a CSV file. And since the Jet engine hasn't yet been replaced (but soon, soon) I split the descriptions into 250-character sections, and recombined them with a calculated field after extracting the data into a TDE.

Using the Ruby script has several dependencies:

  • it's a Ruby script so your machine must be able to run Ruby, and you must have permissions to do so (sometimes not so easy in a corporate environment);
  • it uses some non-standard Ruby gems (libraries) so these need to be installed on your system – this is usually very straightforward and a quick Google will show the way;
  • you really should be comfortable with this level of technical stuff – if you're not there's likely someone nearby who can help, or you can always contact me.


# TTC_TabadminHelpToDataTv8.rb - this Ruby script Copyright 2013, Christopher Gerrard

require 'nokogiri'
require 'open-uri'

$recNum = 0
$Tv8HelpRoot = 'http://onlinehelp.tableausoftware.com/v8.0/server/en-us/'
$CSVOptionsHeader = 'Category,Option,Default Value,Description1,Description2,Link1,Link1Label,Link2,Link2Label'
$CSVOptionsFormat = "\"%s\",\"%s\",\"%s\",\"%s\",\"%s\",\"%s\",\"%s\",\"%s\",\"%s\""

# Open the output CSV file and write its header row.
def init
  $fOpts = File.open("TTC_tabadminSetOptionsTv8.csv", 'w')
  $fOpts.puts $CSVOptionsHeader unless $fOpts.nil?
end

# Replace newlines with spaces and trim surrounding whitespace.
def cleanTxt txt
  return txt.gsub(/\n/,' ').strip
end

# Parse the saved help page and write one CSV row per set option.
def pullCmds helpFile
  doc = Nokogiri::XML(open(helpFile))
  cmdTable = doc.xpath('.//contents/body/div/table/tbody')
  cmdTable.each do |t|
    cmdRows = t.xpath('.//tr')
    cmdRows.each do |r|
      tds = r.xpath('.//td')
      next if tds.length < 3  # skip rows without the three expected cells
      option   = cleanTxt(tds[0].text)
      category = option.split('.')[0]
      default  = cleanTxt(tds[1].text)
      desc     = cleanTxt(tds[2].text)
      # Two pieces of at most 251 characters each, within the 255-character limit.
      desc1 = desc[0..250]
      desc2 = desc[251..501]
      links = tds[2].xpath('.//a')
      link1 = if links[0].nil? then '' else $Tv8HelpRoot + cleanTxt(links[0].xpath('./@href').text) end
      link2 = if links[1].nil? then '' else $Tv8HelpRoot + cleanTxt(links[1].xpath('./@href').text) end
      l1l   = if links[0].nil? then '' else links[0].text end
      l2l   = if links[1].nil? then '' else links[1].text end
      fields = [category, option, default, desc1, desc2, link1, l1l, link2, l2l]
      # Double any embedded quotes so each quoted CSV field stays intact.
      $fOpts.puts $CSVOptionsFormat % fields.map { |f| f.to_s.gsub(/"/,'""') }
    end
  end
end

init
pullCmds 'tabadmin set options cleaned.xml' #NOTE: the Tableau online help page has been saved locally as an XML file and cleaned up a bit
$fOpts.close unless $fOpts.nil?

What TTC_TabadminHelpToDataTv8.rb does.

  • It accesses each row in the table as a separate set command option.
  • The first column contains the option's name.
  • The second column contains the option's default value.
  • The third column contains the option's description, which may exceed 255 characters and contains zero, one, or two links, so the description is processed thus:
    • the description is split into two parts, each stored as its own field;
    • the links, if any, are captured as both a URL and label;
  • The fields are written into the CSV file.
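The description handling listed above can be sketched on its own. This is a minimal illustration rather than the script's actual code: split_fixed and csv_field are hypothetical helper names, and the 250-character width is taken from the description of the approach earlier in this post.

```ruby
# Split text into consecutive pieces of at most `width` characters.
def split_fixed(text, width = 250)
  text.scan(/.{1,#{width}}/m)
end

# Quote one value for CSV: collapse newlines, double embedded quotes, wrap in quotes.
def csv_field(text)
  '"' + text.to_s.gsub(/\s*\n\s*/, ' ').gsub('"', '""').strip + '"'
end

description = 'x' * 600
pieces = split_fixed(description)
puts pieces.map(&:length).inspect   # lengths of the pieces
puts pieces.join == description     # the pieces recombine to the original
puts csv_field("line one\nsays \"hi\"")
```

Because the pieces are consecutive and non-overlapping, concatenating them in order (as the calculated field in the workbook does) restores the full description.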

How to use TTC_TabadminHelpToDataTv8.rb

  • Prerequisites
    • Minimal technical skills.
    • Have Ruby installed and ready to run.
    • Have the Nokogiri Ruby gem installed—it's used in the XML parsing.
    • Have the open-uri library available; it ships with Ruby's standard library, so there's nothing extra to install.
    • Have TTC_TabadminHelpToDataTv8.rb in place—it doesn't matter where, or what name you use, as long as you know where it is.
      You can copy the code above and paste it into your favourite text editor.
  • Running TTC_TabadminHelpToDataTv8.rb
    • Open a command prompt.
      (you can run it otherwise, but this is simple and straightforward)
    • CD to the directory containing the XML file you captured the online help page into.
    • Run it: "[path to]\ruby    [path to]\TTC_TabadminHelpToDataTv8.rb"
  • Presto. You now have a CSV file containing the tabadmin set command options as data.

The usual caveats.

TTC_TabadminHelpToDataTv8.rb works fine for me. But I wrote it and prepared the XML file it parses.

I hope it works for you, but make no guarantees. If you do use it and make improvements I hope that you'll post them back here as comments so I can learn from them, and hopefully other people can benefit from them too.

Tuesday, November 5, 2013

Precision Inputs Required In Addition To Analog Controls

Here's a friction point that's pretty simple and straightforward on the surface, but whose reach and roots are surprisingly broad and deep.

General Principle
Whenever Tableau provides the ability to configure an element's property value it should provide a mechanism for the User to specify a precise value. The values the User can enter shall be subject to the domain requirements of the element being configured, i.e. of the appropriate type and limitations on value.

Specific Example – Viz Field Size
When configuring the Size for a field in the Marks card, the User should have the opportunity to enter a precise value in addition to being able to adjust the position of the Size control slider, which is currently the only adjustment mechanism.

In the specific case of the Marks Card's Size control, Tableau only provides the horizontal slider, which isn't good for precisely specifying the size value. Contrast this with the Color Transparency configuration, which provides a numeric (%) input field along with a synchronized slider, enabling a level of precision in specifying the Transparency value that is unavailable in the Size configuration.

This image shows the Color Transparency and Size configuration controls.

The inability of the User to precisely specify a numeric value for Size may not seem like a big deal—the slider lets you adjust the Size value simply and quickly. You may think to yourself: "This is not a big problem, why make a fuss?"

I'm glad you asked.

It's simple: small variations in Size, e.g. bar widths, have significant impacts on cognition. It's a primary reason we visualize data using geometric properties.

Suppose you have multiple renderings of the same base chart in a dashboard. It's important for them to be as visually consistent as possible so that variations in the data are easy to spot. With Tableau's current design it's almost impossible to ensure precisely the same useful Size configuration across the worksheets unless it's the minimum, middle, or maximum value (the min and max are the end points; the middle has a 'detent' feature).

This situation arose when I was creating a dashboard for a very particular client; in working with them even tiny variances caused hesitation and a "That's not quite right." reaction. I ended up hacking the TWB to make sure that the individual size values were the same, but that's not a realistic solution for most people, nor should it be.

An Improved Design

This Size configuration design adopts the Color Transparency slider/input control combination. This provides the flexibility and specificity we're looking for.

There's the additional benefit of function/presentation similarity: having the same mechanism for configuring similar things eliminates the cognitive impedance imposed when the User needs to switch mental gears to adjust to different ways of doing the same thing. Since Color Transparency and Size aren't simultaneously visible this is currently a subliminal discontinuity, which in some regards is more perplexing.

Input Constraints

Configuration inputs need constraints on the user-entered values to ensure that only legitimate values get applied. For example, it makes no sense to set the Size for a bar chart to "Guy Fawkes Day". Tableau Parameters implement a reasonable starting point for constraints. Examples of constraints include:

  • type, e.g.: numeric – integer, real, percentage, etc.; string; boolean; date & datetime
  • range, if applicable, including less than, greater than, from-to, not equal to, etc.;
  • set of allowable values, either enumerated or data-dependent, including set membership, not in set; etc.
  • others (this isn't intended to be an exhaustive survey of constraint elements)
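As a rough illustration of how the type, range, and allowable-set constraints listed above could be checked together, here is a minimal Ruby sketch. The Constraint struct and its member names are hypothetical; they are not part of Tableau and only show the idea.

```ruby
# Hypothetical constraint bundle: a required class, plus an optional range
# and an optional list of allowed values. Names are illustrative only.
Constraint = Struct.new(:kind, :range, :allowed, keyword_init: true) do
  def valid?(value)
    return false unless value.is_a?(kind)               # type constraint
    return false if range && !range.cover?(value)       # range constraint
    return false if allowed && !allowed.include?(value) # set constraint
    true
  end
end

size_percent = Constraint.new(kind: Integer, range: 1..100)
axis_mode    = Constraint.new(kind: String,
                              allowed: %w[Automatic Fixed Uniform Independent])

puts size_percent.valid?(75)
puts size_percent.valid?('Guy Fawkes Day')
puts axis_mode.valid?('Fixed')
```

Checking the type first means the range and set checks never see a value they can't compare, so a string like "Guy Fawkes Day" is rejected before any numeric comparison is attempted.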

Existing configuration controls like the Size slider have the constraints built in—the User can't move the slider past either end so the Size value, and therefore the bars' width, cannot exceed the minimum or maximum values.

Example: Size configuration

The question is: "What should the Size value indicate, and what should its range be?"

As shown below, the minimum Size value corresponds to very narrow but visible bars and the maximum value corresponds to bars whose width spans the full width of the row.

As shown above, the new Size input control is at maximum—100%. Whether sizing on a percentage scale is appropriate is a legitimate consideration, one that needs to take into account the full spectrum of potential Size configuration scenarios. But for now it seems reasonable that Size can be a percentage scale, with possibly a floor value of 1%—it's not clear whether or not the current

Current Size Range
This dashboard has three versions of the same chart, with the Size adjusted, in left-right order, to its leftmost (min), middle (mid), and rightmost (max) positions.

I've included the actual TWB Size values below the charts. As we see, the min value is slightly less than one percent, mid is precisely 1, and max is precisely 2.

About the min value: many of Tableau's internal values are odd in this high-precision fashion, which helps understand some oddities, but that's a topic for another post.

As we see here, the range of Size values is fairly limited, from slightly more than zero to two. It's not at all clear why Tableau chose this range, or whether it would be meaningful and useful to a User trying to configure Size—my initial feeling is that it wouldn't be.

A percentage range from 0-100%, or 1-100% seems like a good candidate Size range. It feels likely that a correspondence of zero to effectively an invisible point and/or one percent to one pixel (or whatever the size unit is) would provide a robust range, and the use of integers for the percentage values avoids the messiness associated with real numbers. It's also easy to accommodate values greater than 100%, although I'm not clear on whether this makes sense.
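To make the percentage idea concrete, here is a small Ruby sketch of a two-way conversion between an integer percentage and the internal Size value. It assumes, purely for illustration, that the relationship is linear, so that 100% corresponds to the observed maximum of 2 and 50% to the mid value of 1. How Tableau actually scales Size is not known to me, and the method and constant names are hypothetical.

```ruby
INTERNAL_MAX  = 2.0      # the max Size value observed in the TWB
PERCENT_RANGE = (1..100) # proposed user-facing range, with a 1% floor

# Convert a user-entered integer percentage to the internal Size value.
# Assumption: linear scale, so 50% -> 1.0 and 100% -> 2.0.
def percent_to_internal(percent)
  unless percent.is_a?(Integer) && PERCENT_RANGE.cover?(percent)
    raise ArgumentError, 'Size must be an integer from 1 to 100'
  end
  percent * INTERNAL_MAX / 100
end

# Convert an internal Size value back to the nearest allowed percentage.
def internal_to_percent(value)
  (value * 100 / INTERNAL_MAX).round.clamp(PERCENT_RANGE.min, PERCENT_RANGE.max)
end

puts percent_to_internal(50)
puts percent_to_internal(100)
puts internal_to_percent(1.0)
```

With a shared percentage like this, entering the same number on several worksheets would give the same internal value on each, which is the consistency the slider alone can't guarantee.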

Extending Configuration Functionality

Parameterization

All configurable elements should be responsive to parameterized values which can be data fields, calculated fields, or Parameters. These values must conform to the appropriate element-specific constraints.

In addition to its configuration benefits, adding this functionality cracks open the door to Tableau being able to analyze data it's currently unable to interpret and present. This data is variously called post-relational, non-tabular, NoSQL, etc. (I'm working on posts covering this.) It also brings in the longstanding discussions about the relationship between Parameters and data, particularly the community's desire for Parameter values to be dynamically populated by the current data.

An example of this extension's value is in Axis configuration, where there are two levels of configuration and multiple configuration values. In the first level are the configurations for Automatic, Fixed, Uniform, or Independent row and column axes. The second level configuration values are Start and End for Fixed range axes; providing for data-based Start and End values provides the opportunity to fine tune data-sensitive presentations that are impossible with the current fixed-value configuration.

Expanded Data Access

Configuration Files
It would be extremely helpful if Tableau could digest common forms of configuration data, e.g. XML, INI, JSON, and YAML. Consuming these would provide the ability to establish common sets of configuration values, which could among other benefits be the platform for uniform styling and branding.

Configuration files are notably different from the data files Tableau was designed to work with. One major difference is that the config files are (roughly) organized around a parameter per line/element structure while data files are collections of records, each with its own value for each of the enumerated fields.

Tableau already consumes at least XML and YAML files. TWBs are XML and Tableau Server uses YAML for configuration information, so it doesn't seem to be that big a technical reach to implement their use for configuration information.
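The structural difference between configuration files and record-oriented data can be illustrated with a short Ruby sketch that flattens a nested configuration into one row per setting. The sample JSON content, the setting names in it, and the flatten_config method are all hypothetical, made up for illustration.

```ruby
require 'json'

# Walk a nested config and emit one { setting, value } row per leaf,
# using a dotted path as the setting name.
def flatten_config(node, path = [], rows = [])
  case node
  when Hash
    node.each { |key, value| flatten_config(value, path + [key.to_s], rows) }
  when Array
    node.each_with_index { |value, i| flatten_config(value, path + [i.to_s], rows) }
  else
    rows << { 'setting' => path.join('.'), 'value' => node }
  end
  rows
end

# Hypothetical styling configuration, for illustration only.
config = JSON.parse('{"marks": {"size": 75, "color": {"transparency": 40}}, "brand": "Acme"}')

flatten_config(config).each do |row|
  puts "#{row['setting']} = #{row['value']}"
end
```

Once each setting is a row, the configuration has the same shape as any other table: a collection of records with the same fields.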

Data Analysis Consequences
Starting with Tableau's use of different data structures for configuration information, it's a relatively straight path to accessing these new data sources for traditional Tableau analysis.

But although it's a straight path, it's neither simple nor trivial. There are multiple challenges involved, with subtleties, complexities, and consequences that make providing the necessary general data modeling and mapping of complex data to Tableau's analytical semantics a real and interesting challenge.

Simultaneous Multi-Element Configuration

This will be a big step forward. Providing the ability to configure a property for multiple elements at the same time will be a tremendous boost to Tableau Users' productivity, and to a lesser degree output quality. For example, it will reduce and in many cases eliminate the drudgery and error-prone tedium of switching back and forth between multiple worksheets and dashboards to make sure that they're consistent.

But it's not a simple and easy change. Tableau's not set up for this operational mode, and there are deep and serious implications that reach to the core of the mechanisms necessary for presenting the configurable elements and in providing sensible means to configure them.

The good news is that the mechanisms, models, and functionalities required to implement this ability are those required to model, manage, and analyze complex non-tabular data, so there is a virtuous positive feedback loop between this feature and accessing complex data for configuration and traditional Tableau analysis.