Wednesday, February 11, 2015

Metadata Management - Consistent Commenting

The situation: a multitude of workbooks.

It's easy to make lots and lots of Tableau workbooks. In fact, they tend to sprout up all over the place. This is a good thing. Tableau is most valuable when it's being thoroughly exercised, when people are putting it through its paces, exploring data, trying this and that, experimenting with approaches and techniques.

On the other hand, this very wonderfulness has its consequences and costs. This post is about one of them, and a solution to the problem that arises ("problem" here in the technical sense, without any value judgement).

The problem: inconsistent data semantics.

There are multiple connections to the same data scattered through the workbooks, and there's a business need to make sure that the data's semantics are consistently presented to the workbooks' users via informative comments on the Tableau connections' fields.

The solution: make the data semantics, i.e. the comments, consistent.

This post presents a way to make sure that all of the references to the same field in a set of workbooks have the same comments.

The short version is that it uses a standard Tableau workbook containing the appropriate comments and a Ruby script to apply those comments to the fields in other workbooks. The details follow.

A connection without comments.

A connection with comments.

The mouse pointer isn't visible in the screen caps above, but it's hovering over Continent in both cases. The one on the right shows the comment "What, no Antartica?"; the one on the left has no comment.

Our goal is to have that comment, and all the other fields' comments from the commented connection, applied to the matching fields in the other workbook. Here's the procedure:

  1. Rename or copy the commented workbook to "TableauReference.twb".
  2. Put it in the directory where the uncommented workbook(s) are.
  3. Have Ruby installed (directions below).
  4. Have the Ruby script "ApplyReferenceComments.rb" below available—it doesn't really matter where, but it's more convenient to have it in the same directory.
  5. Run "ApplyReferenceComments.rb".
  6. Watch the magic happen.
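For orientation: in a workbook's XML, a field's comment lives in a desc element inside the field's column element. Here's a minimal sketch, using the same Nokogiri calls as the script below, that lists each connection field and its comment, if any ("MyWorkbook.twb" is a placeholder name):

require 'nokogiri'

# List every connection field in a workbook along with its comment, if present.
doc = Nokogiri::XML(File.read('MyWorkbook.twb'))
doc.xpath('//workbook/datasources/datasource/column').each do |col|
  desc = col.at_xpath('./desc')   # the field's comment node, when one exists
  puts "#{col.attr('name')}: #{desc.nil? ? '(no comment)' : desc.text.strip}"
end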
The Ruby Code

ApplyReferenceComments.rb is inline below and can be copied from there, or you can download it from here.


# ApplyReferenceComments.rb - this Ruby script Copyright 2014,5 Christopher Gerrard

require 'nokogiri'
require 'set'

$csvFileName = 'AppliedFieldComments.csv'
$csvFile     = File.open($csvFileName,'w')
$csvFile.puts 'Workbook,Data Source,Field Caption,Field Name,Field Title,Comment' unless $csvFile.nil?

$CSVHeader = <<CSVHEADER.gsub("\n",'')
Workbook,
Data Source,
Field Caption,
Field Name,
Field Title,
capComment.nil?,
nameComment.nil?,
refComment.nil?
CSVHEADER

$activityCSV = File.open('activity.csv','w')
$activityCSV.puts $CSVHeader unless $activityCSV.nil?

$errorFile = File.open('ReferenceComments.err','w')

$referenceTWB     = 'TableauReference.twb'
$refFieldComments = {}

$referenceFields = File.open('ReferenceFields.csv','w')
$referenceFields.puts('Data Source,Field #,Name Type,Name,Comment')

def docUsage(arg1="",arg2="")
  puts arg1
  puts " USAGE:"
  puts " "
  puts "   run:  ruby ApplyReferenceComments.rb -go {twbFileSpec}"
  puts "           twbFile.twb - explicit file name"
  puts "           'f*.twb'    - name with wild card(s)"
  puts " "
  puts "   help: ruby ApplyReferenceComments.rb -help "
  puts " "
  puts "   tech: ruby ApplyReferenceComments.rb -tech "
  puts arg2
end

def docHeader(arg1="",arg2="")
  puts arg1
  puts " #{$0}"
  puts " "
  puts " - Applying Field Comments to Tableau Workbooks, using '#{$referenceTWB}' "
  puts "   as the comments' reference source."
  puts arg2
end

def docNotes(arg1="",arg2="")
  puts arg1
  puts " help information:"
  puts " "
  puts " - Ruby must be in your PATH, or executed as {path}\\ruby.exe"
  puts " "
  puts " - This script must be in this dir, or executed as {path}\\#{$0}"
  puts " "
  puts " - Identifying TWBs for commenting."
  puts "   The TWBs to be processed may be declared on the command line after '-go'"
  puts " "
  puts "   Default       - all the TWBs will be processed. e.g. -go "
  puts "                   (those not already commented)"
  puts "   A single TWB  - can be identified by name, e.g. -go BigData.twb"
  puts "                   the name can be quoted or not."
  puts "   Multiple TWBs - can be identified via wild cards, e.g. -go 'Big*.twb'"
  puts "                   this must be quoted, if not only "
  puts "                   the first matching TWB counts."
  puts " "
  puts " - Commented TWBs"
  puts "   TWBs that have reference comments added to any of their fields will be "
  puts "   saved with '.commented' added to their names, e.g.: "
  puts " "
  puts "     BigData.twb      => BigData.commented.twb "
  puts "     YabbaDabbaDo.twb => YabbaDabbaDo.commented.twb "
  puts " "
  puts "   This ensures that no TWBs will be harmed during this process."
  puts arg2
end

def docTech(arg1="",arg2="")
  puts arg1
  puts " technical information:"
  puts " "
  puts " - Identifying Reference Comments "
  puts " "
  puts "   - '#{$referenceTWB}'"
  puts "     must be present to serve as the Fields' comments source."
  puts "     The reference fields' comments will be compiled from the fields found"
  puts "     in '#{$referenceTWB}' into a list of fields and their comments."
  puts "     Any problems locating or loading comments from '#{$referenceTWB}'"
  puts "     will raise an error message, and no TWB processing will occur."
  puts " "
  puts "   '#{$referenceTWB}' Data Connections"
  puts "   - In many cases there will be a single canonical table/view used as"
  puts "     the source of reference comments."
  puts "   - However, it's possible that '#{$referenceTWB}' may contain"
  puts "     more than one data connection, in which case the connections will be"
  puts "     processed in the order in which they occur in '#{$referenceTWB}',"
  puts "     and this order is unpredictable."
  puts "   - In the case of the same field occurring in more than one reference"
  puts "     Data Connection, the last one wins, i.e. the last field identified"
  puts "     during the compilation process provides the reference comment."
  puts " "
  puts " - Applying Reference Comments"
  puts "   - Fields in the TWBs that match one of the reference fields will be provided"
  puts "     with the reference field's comment."
  puts "   - Existing field comments will be discarded."
  puts " "
  puts "   - Field matching occurs using either of the field's names:"
  puts "     - Technical    - the field name in the database, or "
  puts "     - Presentation - the 'Tableau' field name, i.e. the field has been renamed. "
  puts " "
  puts "     Fields that reference the same database field may not have the same Tableau"
  puts "     name in different workbooks. This process considers that matching either the "
  puts "     Technical or Presentation name identifies a candidate field as matching and "
  puts "     it will then be commented with the reference comment."
  puts " "
  puts "     The candidate and reference fields' data types need not match, this is deliberate. "
  puts " "
  puts "   - Commented fields will be recorded in #{$csvFileName}, which can "
  puts "     be examined (with Tableau, naturally) to see which fields have been commented."
  puts arg2
end

def docCaution(arg1="",arg2="")
  puts arg1
  puts " USE WITH CAUTION"
  puts " ---------------------------------------------------------------------------------"
  puts " This process will add comments to all Db fields in the local TWBs that match the "
  puts " reference fields."
  puts " "
  puts " This is a destructive process - any existing comments will be discarded and cannot"
  puts " be recovered."
  puts " "
  puts " Run this process on copies of your workbooks, and check before and after running"
  puts " this process to make sure that everything is as it should be."
  puts arg2
end

# IMPORTANT
# Last comment wins - if more than one reference field exists with the same name
# the last one encountered will be used as the reference field
def loadReferenceComments
  refTwbExists = File.file?($referenceTWB)
  return false unless refTwbExists
  doc = Nokogiri::XML(File.open($referenceTWB))
  dataSources = doc.xpath("//workbook/datasources/datasource").to_a
  dataSources.each do |ds|
    dsName = ds.xpath('./@name').text
    dsCap  = ds.xpath('./@caption').text
    dsName = if dsCap == '' then dsName else dsCap end
    fieldNodes = ds.xpath('./column').to_a
    fieldNodes.each do |col|
      colCap  = col.attr('caption')
      colName = col.attr('name')
      comment = col.at_xpath('./desc')
      recordReference(dsName,colCap,colName,comment) unless comment.nil?
    end
  end
  puts " - Reference fields loaded.\n\n"
  return refTwbExists
end

$refFieldNum = 0

def recordReference(dsName,colCap,colName,comment)
  return if comment.nil?
  # --
  $refFieldComments[colName.upcase] = comment unless colName.nil? or colName == ''
  $refFieldComments[colCap]         = comment unless colCap.nil?  or colCap  == ''
  # --
  text = comment.text.gsub(/[ ]*\n[ ]*/, ' ').gsub(',', ' ').gsub(/\t/, ' ')
  $refFieldNum += 1
  $referenceFields.puts("#{dsName},#{$refFieldNum},Caption,#{colCap},#{text}")
  $referenceFields.puts("#{dsName},#{$refFieldNum},Name,#{colName},#{text}")
  $referenceFields.puts("#{dsName},#{$refFieldNum},Name - UC,#{colName.upcase},#{text}")
end

def addComment(colNode, commentNode)
  # Tableau only recognizes the first desc node as a comment, so this
  # method removes any/all existing desc nodes before adding the new one.
  descNodes = colNode.xpath('desc')
  descNodes.each { |dNode| dNode.remove }
  colNode << commentNode.dup   # dup keeps the reference node intact when the
                               # same comment applies to more than one field
  return true
end

def writeTWB(doc, twb)
  commentedTWB = twb.sub(/.twb$/,'.commented.twb')
  newTwb = File.open("#{commentedTWB}",'w')
  newTwb.puts(doc)
  newTwb.close
  puts "     #{commentedTWB}"
end

# These are the standard Tableau column attributes for the Teradata types
# identified by inspection in the TWB metadata-record elements.
# This script is written to handle Teradata sources, and may need adjustment
# for other data sources.
$mdRecProps = {
  'date'    => {'datatype' => 'real'   , 'role' => 'measure'  , 'type' => 'quantitative' },
  'integer' => {'datatype' => 'integer', 'role' => 'measure'  , 'type' => 'quantitative' },
  'real'    => {'datatype' => 'real'   , 'role' => 'measure'  , 'type' => 'quantitative' },
  'string'  => {'datatype' => 'string' , 'role' => 'dimension', 'type' => 'nominal'      }
}

def getMetadataRecordFields(twb, doc, dsName, dataSource, twbFields)
  mdFields = dataSource.xpath('./connection/metadata-records/metadata-record[@class="column"]')
  mdFields.each do |f|
    localName = f.at_xpath('local-name').text
    seen = twbFields.include?(localName)
    if !seen then
      refComment = $refFieldComments[localName.upcase]   # get a reference comment for the column name, if any
      if !refComment.nil? then
        type  = f.at_xpath('local-type').text
        attrs = $mdRecProps[type]
        if !attrs.nil? then
          colNode = Nokogiri::XML::Node.new('column', doc)
          colNode.set_attribute('name',localName)
          attrs.each do |attr, value|
            colNode.set_attribute(attr, value)
          end
          colNode << refComment.dup
          dataSource << colNode
        else
          $errorFile.print "ERROR getMetadataRecordFields - no attributes found for metadata-record:"
          $errorFile.puts  "'#{localName}' in dataSource:'#{dsName}' in TWB:'#{twb}'"
        end
      end
    end
  end
end
# ====================== /resolving metadata, non-column fields ======================

def processTWB twb
  twbFields = Set.new   # used to keep track of fields found, by their technical/DB names
  return if twb =~ /.*[.]commented.twb$/   # don't process already-processed TWBs
  puts "\n - #{twb}"
  doc = Nokogiri::XML(File.open(twb))
  dataSources = doc.xpath("//workbook/datasources/datasource").to_a
  isCommented = false
  dataSources.each do |ds|
    dsName = ds.xpath('./@name').text
    dsCap  = ds.xpath('./@caption').text
    dsName = if dsCap == '' then dsName else dsCap end
    fieldNodes = ds.xpath('./column').to_a
    fieldNodes.each do |col|
      colCap   = col.attr('caption')
      colName  = col.attr('name')
      colTitle = if colCap == '' or colCap.nil? then colName else colCap end
      twbFields.add(colName)   # register seen Columns for reference during metadata-record handling
      capComment  = $refFieldComments[colCap]
      nameComment = $refFieldComments[colName.upcase]
      refComment  = !capComment.nil? ? capComment : nameComment
      $activityCSV.print "#{twb},#{dsName},#{colCap},#{colName},#{colTitle}"
      $activityCSV.puts  ",#{capComment.nil?},#{nameComment.nil?},#{refComment.nil?}"
      unless refComment.nil? then
        isCommented = addComment(col,refComment)
        if isCommented then
          twbFields.add colName
          title   = colTitle.sub(/^\[/,'').sub(/\]$/,'')
          comment = refComment.text.gsub(/[ ]*\n[ ]*/, ' ').gsub(',', ' ').gsub(/\t/, ' ')
          $csvFile.puts "\"#{twb}\",\"#{dsName}\",#{colCap},#{colName},\"#{title}\",\"#{comment}\""
        end
      end
    end
    getMetadataRecordFields(twb, doc, dsName, ds, twbFields)
  end
  writeTWB(doc, twb) if isCommented
end

def process(arg1='*.twb')
  docHeader "\n\n\n"
  if loadReferenceComments then
    puts "\n\n Processing TWBs named '#{arg1}'...\n"
    Dir.glob(arg1) { |twb| processTWB twb unless twb == $referenceTWB }
    $csvFile.close unless $csvFile.nil?
    puts "\n\n Done.\n\n\n"
    puts "\n That's all, Folks.\n\n"
  else
    puts " ERROR"
    puts " ---------------------------------------------------------------\n"
    puts " Either:"
    puts " - The Comments Reference TWB: #{$referenceTWB} is not present."
    puts "   or "
    puts " - A problem occurred loading the reference comments from it."
  end
end

def usage
  docHeader "\n\n\n"
  docUsage  "\n\n\n", "\n\n\n"
end

def help
  docHeader  "\n\n"
  docUsage   "\n"
  docNotes   "\n\n"
  docCaution "\n\n", "\n\n\n"
end

def tech
  docHeader  "\n\n"
  docUsage   "\n"
  docNotes   "\n\n"
  docTech    "\n"
  docCaution "\n", "\n\n\n"
end

system "cls"   # clear the screen (Windows; use 'clear' elsewhere)

# what to do?
case ARGV[0]
  when /[-]+[g][o]?$/i  then ARGV[1].nil? ? process : process(ARGV[1])
  when /[-]+(h|help)$/i then help
  when /[-]+(t|tech)$/i then tech
  else usage
end

# $csvFile.close unless $csvFile.nil?

Running ApplyReferenceComments.rb

Notice that running ApplyReferenceComments.rb requires providing "-go" on the command line. Without "-go" the script prints its usage information, including how to see the help and technical information. These are shown below.
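The basic invocations, taken straight from the script's built-in help (the workbook names are placeholders):

ruby ApplyReferenceComments.rb -go                process all the TWBs in the directory
ruby ApplyReferenceComments.rb -go BigData.twb    process a single workbook
ruby ApplyReferenceComments.rb -go 'Big*.twb'     process workbooks matching a wild card
ruby ApplyReferenceComments.rb -help              show the help information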

Commenting in action.

Note that whenever a Workbook is found, ApplyReferenceComments.rb lists it, and if any comments are added the result is saved as a copy with ".commented" appended to its name. This ensures that the original TWB is left unchanged.

Recording comments applied.

A record of the comments that are applied is kept as data in the CSV file "AppliedFieldComments.csv". Here's an image of a Tableau workbook showing the results of the scenario above.

There's only one Workbook and one Data Source, so they're shown in the sheet's title, letting the sheet be narrow enough to display the whole worksheet.
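For reference, the CSV carries the header the script writes, with one row per applied comment. The row below is illustrative only: the workbook and data source names are made up, and the script replaces commas in comment text with spaces:

Workbook,Data Source,Field Caption,Field Name,Field Title,Comment
"World.twb","WorldData",Continent,[Continent],"Continent","What  no Antartica?"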

The usual caveats apply.

This script has worked well for my situation and needs.
It may or may not work as well for you in yours.
There are no guarantees that it will do anything useful.

Tableau 9

Rumor has it that Tableau has told one Tableau tools vendor that the Workbook XML will not change in ways that break existing tools and utilities. This would be good, but there's been no public commitment to this that I know of, and Tableau 9 could break everything.

Evolution.

These utilities evolve fairly rapidly as needs and conditions change. I encourage you to take this code and use it as the basis of your own work, and hope that if you do you'll leave a comment and a link to your new stuff. The more tools the community creates, the better for all of us.

Comments welcome.

I'm very interested in hearing about any experiences with this, good and bad. Please leave a comment if you give it a try.

Saturday, February 7, 2015

Tableau Server - Factors Affecting What Users Can Do

Tableau Server Permissions continue to confuse people.

I encountered this discussion in the Tableau Community recently: Reporting on Permissions for Server Projects in which the participants are trying to figure out how to create custom admin views to see who can view which Projects. Like many of these discussions it ranges out from its initial focus to contemplate broader Permissions questions.

Rather than trying to cover everything to do with Permissions in this post (mostly because it's so complex and convoluted that a succinct description is almost impossible), I'm going to use this opportunity to post this link to a diagram mapping the Tableau Server landscape, identifying the elements that affect who can see what, and what they can do with what they can see.

The information is accurate and complete, as far as I've been able to determine, with the notable exception that it doesn't include Web Edit capabilities. If and when anyone finds any deficiencies (inaccuracies, missing elements, something that's not clear, etc.), please leave a comment and I'll correct the map.

Things to be aware of.

The following address some of the points in the discussion noted above, along with others worth noting. I'll keep adding to them as time goes by, depending upon time, best intentions, inclination, other interesting things coming up, etc. I'll also incorporate useful points provided in the post's comments, along with attribution, so if you add anything and want credit please identify yourself (anonymous doesn't get nearly the credit he/she deserves).

Tableau's approach to Permissions is different from other approaches. And not in a good way.

Although Tableau's approach to providing ways to control access to and interaction with objects and contents uses terms that are familiar, it's very much its own animal.
The surface similarities lend an aura of familiarity that leads to expectations of how it's going to work, and consequently to approaches to working with and managing objects, content, and Users' interactions that aren't in line with Tableau's reality. This causes very substantial cognitive conflicts that can be difficult to recognize, much less overcome.

It's not you.

Learning about Tableau's Permissions and how they work is hard.
Internalizing this knowledge is harder still.

Don't be worried or discouraged if working with Permissions isn't smooth and easy out of the gate. For most of us it's a struggle to absorb things to the point of being reasonably comfortable with predicting how things will work and puzzling out why they're not working the way we think they should.
Don't give up. You still have us.

Permissions are attached to Tableau Server objects, not users or groups.

Permissions are attached to Projects, Workbooks, Views, and Data Sources.
Permissions connect Users and Groups to these content objects.
There's a tendency to think of Permissions as being assigned to Users or Groups; this can lead to difficulty in interpreting how Permissions work and figuring out why things aren't working the way one thinks they should.

There's more to it than Permissions. Lots more.

Permissions aren't the only thing controlling what users can see and do.
License levels, admin status, publishing rights (granted or not), content ownership, site membership, group membership, individual identity, and whether a workbook is published with tabs showing: all of these get involved.

Not Allowed is Denied

Any Permission that is not explicitly granted to a user somewhere, somehow is effectively denied.
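To make "Permissions are attached to objects" and "Not Allowed is Denied" concrete, here's a tiny conceptual sketch in Ruby (this blog's working language). It is emphatically not how Tableau Server is implemented; it just models the rules: Permissions live on the content object, they connect grantees to capabilities, and the absence of an explicit grant means denial.

# A conceptual model only - NOT Tableau's implementation.
Workbook = Struct.new(:name, :grants)   # grants: { grantee => [capabilities] }

def can?(workbook, grantee, capability)
  caps = workbook.grants[grantee]
  !caps.nil? && caps.include?(capability)   # no explicit grant anywhere => denied
end

wb = Workbook.new('BigData.twb', 'Analysts' => [:view, :filter])
puts can?(wb, 'Analysts', :view)       # true  - explicitly granted
puts can?(wb, 'Analysts', :download)   # false - not granted, so denied
puts can?(wb, 'Interns',  :view)       # false - no grants at all, so denied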

"Inherited" means "Not Configured Here"

Inherited is a horribly bad word as it's used in the documentation and Tableau Server UI, because it implies that there's an active process assigning something. While there is, in a very narrow technical sense, something in Tableau Server that could be called inheritance, it isn't what people who don't already know better think of as inheritance, and so it misleads almost everyone who needs to know what's going on (and this sentence is still far less confusing than inheritance is).

Permissions get assigned to objects when they're created or published.

There's always a default set of Permissions available to be assigned to content when it's created (Projects) or published (Workbooks/Worksheets and Data Sources).
It's possible to override the default Permissions when publishing Workbooks and Data Sources.

Permissions can be changed on anything that carries them.

Anyone who has access to the object and is permitted to do so can change the object's Permissions. This is an extremely powerful ability and makes effectively administering Tableau Server possible.
However, it takes experience and skill to manage Permissions well—it's really easy to make a real hash of things, so be careful and test afterwards.

Changing Permissions on something DOES NOT change the Permissions on the things it contains.

Changing Project Permissions does not change the Permissions on the Workbooks and Data Sources in it.
Changing Permissions on a Workbook does not change the Permissions set on the published views in it (if any have been set, and this is a topic all on its own).

Admins can do anything.

No real surprise here: admins can do anything with anything, so it's best to make sure your admins understand the consequences of what they're doing. Most don't.

Owners can do anything with their content.

Whoever publishes something is its owner by default, and can do whatever they want with it, subject to the Permissions of other objects that may be involved: for example, an Owner can't move a Workbook into a Project for which they don't have the proper Permissions.
  (exercise for the reader: what Permission lets someone publish or move a Workbook into a Project?)
Ownership is a frequently overlooked property. The owner of a thing can change its Permissions, which can lead to confusion.
Ownership can be transferred, and this can really throw spanners into the works.

Puzzling out why someone can or cannot see or do something can get extremely complicated and vexing.

The number of things that can be involved is large, and their interactions can be really hard to untangle. There's a process I go through, but it's not really a recipe I can write down; it's more like exploring a jungle where lots of potential dangers lurk in the understory, waiting to trip you up.
OK, it's not that mysterious - I essentially work through the map identifying what's involved until I get to the solution point.

Roles are not very useful, and can be profitably ignored.

Roles are nothing more than preconfigured sets of Permissions that can be attached en masse.
They're useful in the very narrow circumstances where there's a Role that matches the exact set of Permissions that suit your needs. In my experience with quite a few clients this is almost never the case.
Any set of Permissions is a Role, which would be OK if there were some way to create and manage custom sets of Permissions and give them meaningful names, but there isn't. Any set of Permissions that isn't one of the standard Roles is named "Custom", which makes it impossible to identify what the Permissions are without digging down and looking at them manually, imposing a needless burden on whoever is trying to figure things out.

The ability to set Permissions is a Permission.

Anyone who can set Permissions on an object can grant other people or groups the ability to also set Permissions on that object. In general this is a bad idea and should be discouraged.

Tableau 9 could change everything.

From what I've seen so far, Tableau Server 9's Permissions monitoring and management abilities, and their clarity and ease of use, have improved by leaps and bounds. It's very different from previous approaches.
I haven't really had the chance to dig into it yet, but will and hope to be delighted.

Wednesday, February 4, 2015

On Analysis of Post-Relational Data. #1 - Master-Detail Hierarchy

Consider an ordinary company. It has a standard (simple) organizational structure, where employees work in Departments managed by Managers. Straightforward enough. Here's a simple diagram showing the information structure:

[Diagram: each Department has a Name, a Manager, and a Budget, and some Employees; each Employee has a First Name, a Last Name, a Hire Date, and a Salary.]

Here's a normal tabular representation of the company's Department and Employee data. This is how data has been commonly represented for the past quarter century.

Department
  Name      Manager   Budget
  Digging   Slate     147
  HQ        Granite   49

Employee
  Department   First Name   Last Name    Hire Date   Salary
  Digging      Barney       Rubble       02/02/02    16
  Digging      Fred         Flintstone   01/01/01    17
  HQ           Rock         Quarry       11/11/01    19

Tableau is designed to access and analyze tabular data. It has no trouble connecting to these individual tables, and its intrinsic default aggregation scheme will return values that mean what people think they mean, e.g.
- showing SUM(Budget) is correct whether it's for one or both Departments or Managers;
- summing Salary will similarly work for the different Employee dimensional contexts.

This is good, but people frequently want to analyze the data for the whole organization, to see things like the sums of Salaries for Departments or Managers. In order to do this the tables need to be related, in this case based upon the Department Name.

Here's the tabular form of the data joined by Department Name. There are a number of ways to accomplish this, but they all resolve to the same single table that Tableau sees, or creates via data blending.

Department - Employee (joined)
  Name      Manager   Budget   First Name   Last Name    Hire Date   Salary
  Digging   Slate     147      Barney       Rubble       02/02/02    16
  Digging   Slate     147      Fred         Flintstone   01/01/01    17
  HQ        Granite   49       Rock         Quarry       11/11/01    19

While this is technically correct, the decomposed relational database view of the world creates real and substantial analytical difficulties. They manifest themselves as technical considerations the human analyst needs to be aware of and accommodate while analyzing the data, or else risk achieving results that are arithmetically and technically correct in a narrow sense, but invisibly wrong in the human sense. The obvious case using Tableau: if the user puts Manager on the Rows shelf and Budget and Salary on the Columns shelf, Tableau will generate this:

[Chart: "Sum Budget, Salary per Department" - Digging: Budget 294, Salary 33; HQ: Budget 49, Salary 19.]

Clearly, this is a nonsensical result. The value '294' doesn't make sense for the Digging Department. 294 is shown as Digging's Budget sum because Tableau's default aggregation is SUM, and 294 -is- the sum of the Budget values in the two Digging records in the record set Tableau's looking at. In Tableau's world, the analyst is responsible for knowing the technical details of the data's organization, including the differing table granularities, how they're joined or blended, and the analytical context, in order to choose the right aggregation to achieve sensible results. Here's how Tableau can be coaxed into showing the correct analytic:

[Chart: the same view using MIN(Budget), showing Digging's Budget as 147.]

Note: MIN(Budget). Changing the Budget aggregation to MIN(), or to any of the other aggregations that result in the identity value, works. But it's a poor solution. Tableau should be smart enough to recognize the hierarchical relationship between Departments and Employees and implement the aggregation correctly according to the analytical context.

It might seem like this is a bit of a pipe dream, that it's asking a lot, perhaps too much, of Tableau; that it's outside the world of "real" data that Tableau was designed to work with. Except that's not quite right. Hierarchical data was the normal way of storing data before everything was 'relationalized' and decomposed into normalized tables that require recomposition before data with real-world semantics is available for analysis. And here's the best part: the problem of how to interpret an analytical request like "sum the Budget and Salary per Department" correctly in different analytical contexts was solved over thirty years ago, and formed the basis for entire families of data analytical tools.

Modeling the Company's data to match the human view.

This is an abstraction of the technique of using a structural description (a master file descriptor, or MFD - and a prize to the first commenter to identify the reference) to identify the data contents and relationships:

MFD, schema, etc.
  Department (Name, Manager, Budget)
    * Employee (First Name, Last Name, Hire Date, Salary)   ('*' marks a repeating segment)

Data
  Digging   Slate     147
            Fred      Flintstone   01/01/1001   17
            Barney    Rubble       02/02/02     16
  HQ        Granite   49
            Rock      Quarry       11/11/01     19

This is how it used to be done. The structural description was in a human- and machine-readable file, and the data could be, if desired, kept in an optimized dedicated file type designed for rapid access to its content. Really handy for retrieval during analysis. In this scheme, this is a legitimate analytical request:

  from DepartmentData sum Budget, Salary

and this would be the result:

  Department   Budget   Salary
  Digging      147      33
  HQ           49       19

I've proposed this idea, that Tableau needs to recognize and accommodate structural data relationships, and sometimes hear something like: "well, that's all very nice, but modern data is stored in tables, and that's just the way things are." I don't think so. Here are a couple of examples of the Bedrock Mining Company data stored in modern forms, in XML and JSON:

<company>
  <Department>
    <Name>HQ</Name>
    <Manager>Granite</Manager>
    <Budget>123</Budget>
    <Employee>
      <FirstName>Rock</FirstName>
      <LastName>Quarry</LastName>
      <HireDate>11/11/01</HireDate>
      <Salary>19</Salary>
    </Employee>
  </Department>
  <Department>
    <Name>Digging</Name>
    <Manager>Slate</Manager>
    <Budget>147</Budget>
    <Employee>
      <FirstName>Fred</FirstName>
      <LastName>Flintstone</LastName>
      <HireDate>01/01/1001</HireDate>
      <Salary>17</Salary>
    </Employee>
    <Employee>
      <FirstName>Barney</FirstName>
      <LastName>Rubble</LastName>
      <HireDate>02/02/02</HireDate>
      <Salary>16</Salary>
    </Employee>
  </Department>
</company>

{ "company": {
    "Department": [
      { "Name": "HQ", "Manager": "Granite", "Budget": "123",
        "Employee": { "FirstName": "Rock", "LastName": "Quarry",
                      "HireDate": "11/11/01", "Salary": "19" } },
      { "Name": "Digging", "Manager": "Slate", "Budget": "147",
        "Employee": [
          { "FirstName": "Fred", "LastName": "Flintstone",
            "HireDate": "01/01/1001", "Salary": "17" },
          { "FirstName": "Barney", "LastName": "Rubble",
            "HireDate": "02/02/02", "Salary": "16" } ] } ] } }

XML and JSON are excellent examples of approaches that are moving away from relational/tabular data modeling and storage in order to accommodate modern needs. The noSQL movement is increasing its presence in mainstream business data environments. Tableau technology partners aren't limited to the traditional relational database vendors; one of the major enterprise noSQL vendors advertises that it stores data internally as XML, even though the only visibility to it through Tableau is via slices that resolve into single tables. It's worth noting that JSON is at the heart of the aggressive evolution occurring in data visualization, particularly the web-based javascript approaches that are capable of creating exquisite data visualizations, albeit with a modicum of programming effort. The increasing sophistication of the technological abstractions is leading to the development of end-user tools that will handle complex data easily, simply, and with very little friction.

Beyond master-detail, the future beckons.

The example above is the first step beyond flat tables. A two-level master-detail structure will enable an entire new world of analytical opportunities and abilities. But it's just a baby step. There are other horizons to reach, a whole new universe to explore. Beyond simple hierarchies are multi-level, multi-path hierarchies. Imagine if each Bedrock Mining Company department had, in addition to employees, vehicles and buildings it was responsible for. It's easy to model this in XML, JSON, or any number of other data formats, but it's not easy to analyze the data, even though it might be interesting to see whether there was any differential brake wear, comparing those departments whose employees were hired on or before the median hire date of all employees. When Tableau makes this a simple UI operation it'll really be on to something.

But wait, there's more. Suppose you had an XML file containing data relating to a wide variety of different things, with many hierarchical levels and many paths, and (this is where it really gets interesting) non-parallel semantic relationships between elements, with cross-linking across paths, where references in one location identify elements in a different structural path at different levels. If only Tableau had the ability to allow access to something like this, we could really work some magic. I've been waiting for this for over five years; there's no telling how much longer there is to go.

Admittedly, these aren't trivial matters. There are difficult problems to solve, technically and human/usability-wise. Someone's going to solve them and create the next great tool that makes this straightforward. And that will be good.

[Image: the "from DepartmentData sum Budget, Salary" result rendered as a table and graphically. The demonstration only shows the structural properties; the decorative elements are deliberately left out.]
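To make the "sum Budget, Salary per Department" semantics concrete, here's a minimal Ruby sketch of aggregating the master-detail structure directly (mine, not anything Tableau does): each Department's Budget is counted once, while its Employees' Salaries are summed.

# Aggregating the Bedrock Mining Company's master-detail data directly,
# preserving the hierarchy so Budget isn't multiplied by a join.
company = [
  { name: 'Digging', manager: 'Slate',   budget: 147,
    employees: [{ first: 'Fred',   last: 'Flintstone', salary: 17 },
                { first: 'Barney', last: 'Rubble',     salary: 16 }] },
  { name: 'HQ',      manager: 'Granite', budget: 49,
    employees: [{ first: 'Rock',   last: 'Quarry',     salary: 19 }] }
]

company.each do |dept|
  salaries = dept[:employees].inject(0) { |sum, e| sum + e[:salary] }
  puts format('%-8s Budget: %3d  Salary: %3d', dept[:name], dept[:budget], salaries)
end
# Digging  Budget: 147  Salary:  33
# HQ       Budget:  49  Salary:  19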