Tuesday, March 31, 2015

Tableau and the Golden Tool Rule

The Golden Tool Rule

A golden tool is one that makes doing something useful simple, straightforward, and easy.

Golden tools are delights to use. Wielding one establishes a connection with the material one's working with, whether it's a piece of wood being shaped for a custom cabinet, vegetables being diced for the stew pot, or data that's being tossed and turned, flipped, filtered and pulled together into representations that make sense.

Tableau is, at its core, a golden tool. It makes the basic data analytical operations simple, straightforward, and easy. Connecting to data is as simple as dragging and dropping a data-containing file onto or into it. Want to see what's in this dimension? Double-click on it. Interested in the sum of Profit? Double-click on it and Tableau shows it to you in the most effective visual form for the current context. Clicking, double-clicking, and dragging and dropping the data fields (and other objects) causes wonderful things to happen — Tableau understands these actions, this language of visual data analysis, and knows how to present the things you've asked it to in ways that make sense.

Original gold.

Years ago I spent almost a decade working for the company that invented the business data analytical software product category. With FOCUS, our company's product, it was possible to express the basic data analytical operations in a clear, concise, human-oriented language that anyone could pick up and get started with. FOCUS was a golden tool, in its core and time, and its ability to help people forge a close connection with their data made many, many of them much happier, and vastly more productive, than they'd been before.

In their bones...

Tableau and FOCUS are strongly analogous. Each was a quantum step forward from the other tools of its time, making the basic data analytical operations simple, straightforward, and easy. Tableau did this by providing a visual syntax and grammar oriented around organizing data elements as first-order user interface objects that represent the data and the analytical operations. FOCUS accomplished this by providing a simple language that used English words to implement the same structure and operations.

To illustrate the heart of Tableau and FOCUS, we'll assume that Snow White's Dwarf friends have been keeping track of the number of gems they've mined, and that each of the seven works in a Department.

We want to know a simple, basic thing: how many gems in total were mined by the Dwarves working in each Department?

Finding the sum of Gems mined per Department with Tableau, and with FOCUS.

Creating the analytic above is a simple two-action process:

  1. Move "Dept" to "Rows"
    by dragging it as shown, double-clicking it in the data window, or dragging it to the left-side zone of an empty sheet.
  2. Add "Gems" to the viz
    by double-clicking it in the data window, dragging it to the center zone of an empty sheet, or dragging it to the Text button on the Marks card.
Order of Operations Matters.

One of the things that confuses people new to Tableau is that doing things in a different order gets different results.

For example, the illustration above shows Dept being put on the Rows shelf first, followed by Gems being added to the viz. If the order is changed, with Gems first and Dept second, the visualization will be different. It's left to you, dear reader, to give it a go for yourself and see what happens – this little, seemingly innocuous exercise reveals one of the subtle, deep mysteries of Tableau. Understanding it unlocks many Tableau doors.

This bit of FOCUS:

 
      TABLE FILE Dwarves
        SUM Gems
        BY Dept
      END
 

provided at the interactive prompt, or run from a file, creates this report:


      Dept    Gems
      ======= ====
      Medical   34
      Mgmt      35
      Ops      518

Almost seems too simple, doesn't it?

Eight simple, straightforward words to generate the analysis. Even better, the two analytical statements need not be in any particular order:

 
      TABLE FILE Dwarves
        BY Dept
        SUM Gems
      END
 

Swapping the order of the BY and SUM statements makes no difference.

FOCUS was in this sense non-procedural, making it even easier to get results fast because people didn't need to know what order to put the statements in.
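
For readers who've never met FOCUS, here's roughly the same request sketched in Python with pandas; this is a hypothetical modern equivalent, not output from FOCUS or Tableau. The individual dwarf rows are invented for illustration, and only the departmental totals in the report above come from the example.

      # Roughly the FOCUS request above, expressed with pandas.
      # The per-dwarf gem counts and department assignments are invented;
      # only the Dept totals match the report shown earlier.
      import pandas as pd

      dwarves = pd.DataFrame({
          "Dwarf": ["Doc", "Grumpy", "Happy", "Sleepy", "Bashful", "Sneezy", "Dopey"],
          "Dept":  ["Medical", "Mgmt", "Ops", "Ops", "Ops", "Ops", "Ops"],
          "Gems":  [34, 35, 120, 130, 125, 80, 63],
      })

      # SUM Gems BY Dept: name the measure to aggregate and the dimension to group by.
      report = dwarves.groupby("Dept", as_index=False)["Gems"].sum()
      print(report)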



The first gold rush.

It's very difficult for those who weren't there to understand how golden FOCUS was in its day.

Introduced in 1975, FOCUS was astonishing. Instead of waiting for their COBOL developers to program and deliver reports, people could now use a simple, English-like language to get reports from their business data immediately. As a FOCUS consultant I was most often able to meet with my clients in the morning and have reports ready for review that afternoon.

FOCUS was the most successful 4GL (fourth-generation language), the premier product for conducting business data analysis, used by organizations across the globe to help get information out of data and into the minds of the people who needed it.

The ability to access and analyze information with minimal intervention by or support from an organization's data processing group (IT's predecessor) changed the world. Business people could make decisions based on their data rather than just relying on intuition. FOCUS was used across industries and in the public sector. Life was good. As a FOCUS consultant, and a product manager, with Information Builders, Inc. (IBI), FOCUS' vendor, I was able to help make a material difference in our clients' use of their data. In the mid-1980s IBI was one of the world's largest independent software companies, with revenues in the hundreds of millions of dollars, many, many loyal customers, and legions of devoted FOCUS users.

And then things changed.

FOCUS was conceived in the mainframe world. It thrived in that world, where CRTs and line printers were the human-computer interface, where hierarchical databases were common. Its beauty and grace were of that world. But the world of business computing changed, evolved into a world where FOCUS' mainframe roots were out of step with the emerging models of how people interacted with their computers.

Different models of human-computer interaction emerged, replacing the character-based, block-mode mainframe terminal interaction, where applications drove the conversation. Minicomputers introduced per-character interaction, allowing finer granularity: every keystroke could be examined as the user typed it. Micro and personal computers took this further, inverting the human-computer relationship and allowing for different application models. Then GUIs showed up, providing entirely new possibility horizons for creating software tools that support the person who's trying to accomplish their work.

The world was full of promise from the mid-80s into the 90s. There was a vibrant environment of innovation within which clever people were trying to figure out how best to take advantage of the new ways of computing to build the next generation of golden tools. GUI PC applications were becoming well established. Business applications were evolving at a rapid pace, notably word processors and spreadsheets.

At IBI we were working across all the platforms (PCs, Unix, VAX, even mainframes) on technologies and designs to create the next-generation tools that would surface the simplicity, elegance, and expressiveness of FOCUS' data analysis language using the modern human-computer interfaces. During this period I worked first in the Micro Product division, then in the Unix division, and with others across the divisions to create great new tools. In the Unix division we created an object-oriented programming language and platform and used it to build a new GUI-based, network-aware FOCUS that surfaced the basic data analytical operations as top-level UI elements. Other divisions in IBI were working on similar projects, each group creating new and wonderful stuff. At the same time other companies were working on and releasing post-mainframe reporting software products.

In the early 90s the decision was made to shut down the different divisions' projects and adopt the Microsoft Windows-based approach that eventually became WebFOCUS. It was sad. An entire generation of people left the company.

Meanwhile, things were happening, forces were marshaling, that led to the near-extinction of simple, straightforward data analysis as a viable ambition.

The business data analysis dark age descended.

For many years things were bad. The environments had changed; the reasons for it are many, and beyond the scope of this post. We, who prided ourselves on our ability to access and understand data, and to help our clients and customers do the same, had to watch helplessly as the giants ran amok, vying with one another to create ever-larger and more monstrous mountains, continents even, of data with little regard for the need to actually understand it at all scales. Consolidation before analysis, in pursuit of the mythical single version of the truth, became the unquestioned paradigm, the fulcrum about which all business data analysis pivoted. Billions of dollars were spent and wasted with little to show for it. Life wasn't good, unless you were profiting from Big BI platform sales and implementation consulting dollars.

And then, an opportunity.

In 2006 I was working for a major city government building web applications (hey, one needs to eat) when I was asked to review the ongoing citywide data warehouse project. It had been going on for a long time, had eaten through tons o' money, and had exactly two users, both of whom were part of the group creating it.

Seemed simple enough: all I needed to do was understand the data as it entered and moved through the system. There were many data feeds being slurped up into staging databases; there were ODSs, ETL process inputs and outputs, an EDW, and a Big BI platform, with some ancillary bits. And nobody knew, or could provide, any real transparency into what was going on. It was an almost perfect situation. All I needed was a way to access and understand the data. All the data, wherever it lived.

But how? I needed a good data analysis tool, one that could do the job with a minimum of fuss and bother, that would let me do my work and stay out of the way. I wanted, needed, a tool with FOCUS' analytical simplicity and elegance, but in a modern form, ideally an interactive UI with direct-action support for the basic data analytical operations.

So I started surveying the landscape, looking for something that would fill the bill. I tried out all the tools I could find. The most promising of the bunch were, in alphabetical order: Advizor, QlikView, Spotfire, and Tableau. They all had their strengths; each of them was an excellent tool within its design space. But were any of them created for the purpose I needed: making it as simple as possible to access and analyze the data I needed to understand? Anything extraneous to this, any extra 'benefits', was of no interest to me, and in fact violated the GTPD (Golden Tool Primary Directive): anything that doesn't provide direct benefit to the specific, immediate need is not only of no value, it's a drag and a detriment to getting the real job done. (It's amazing how many BI technology companies have failed, and continue to fail, to recognize this simple truth, but that's a topic for other times.)

Eureka! A nugget!

Only one of the tools was designed specifically to make data analysis a simple, straightforward non-technical activity that was approachable, welcoming, and truly easy to use. Tableau was far and away the best tool for the job I needed to do. And in this space it's still the best tool that's come to market.

I love Tableau for the bright light it brought to a dim, drab world. Right out of the box I could see and understand data. What's in this field? How many records are there? How much of this is in here? What's the relationship between this and that? How many permits of each kind have been issued? (It was a city's operational data, remember.) It was a great, great thing then, and for these purposes it remains a great product.

The second gold rush.

For the first few years Tableau was my personal tool, one I used in my work, for my own purposes. For a time I had a steady stream of work rescuing traditional Big BI projects that had gone off the rails, using Tableau to help bring clarity to the entire endeavor. Instead of relying on technical people using SQL query tools to try to make sense out of tables of data, Tableau let us see the data as the business information it really was, improving the quality and velocity of the work.

It took a few years for it to catch on; people are naturally conservative, particularly those with a vested interest who feel threatened. But as Tableau became used by more and more people it helped them individually, and it demonstrated that there really is a market, a demand, for highly effective tools that let people understand the data that matters to them with a minimum of fuss.

Life was good again.

Tableau, and the people who created, supported, championed, and used it to good effect, richly deserve the credit for the good done. Now the door is open, and the horizons have expanded so far they're almost out of sight.

But...

Tableau is a golden nugget, a shiny, impressive nugget. Which, to stretch the metaphor, was invaluable when there wasn't any other gold to be had.

But it's only a nugget.

I've mentioned Tableau's core. This is the area where Tableau got it right: providing fundamentally direct and easy-to-use mechanisms implementing the basic data analytical operations. In this space there's not much room between how good Tableau is and how good it's possible to be. So, what are these basic operations? Simply put, they are the things one does to organize, sort, filter, and aggregate data so that it can be observed and assessed in order to understand it. They are, briefly (with a short code sketch following the list):

  • Choosing which fields to see – e.g. Profit, Region, and Department
  • Organizing the fields – e.g. Profit for the individual Departments by Region
  • Filtering, choosing which data to see – e.g. only the West and South Regions; only records with Profit > 12
  • Deciding which aggregation to use – Tableau assumes SUM() as the default
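
To make the list concrete, here's a minimal sketch of those four operations in Python with pandas; the field names follow the examples in the list above, and the data itself is invented for illustration.

      # A minimal sketch of the four basic operations, on invented data.
      import pandas as pd

      orders = pd.DataFrame({
          "Region":     ["West", "South", "East", "West", "South"],
          "Department": ["Furniture", "Office", "Furniture", "Office", "Office"],
          "Profit":     [25.0, 14.5, 9.0, 40.0, 11.0],
      })

      result = (
          orders[["Region", "Department", "Profit"]]              # choose which fields to see
          .query("Region in ['West', 'South'] and Profit > 12")   # filter which data to see
          .groupby(["Region", "Department"], as_index=False)      # organize by the dimensions
          .agg(Profit=("Profit", "sum"))                          # aggregate: SUM, Tableau's default
      )
      print(result)

The only decisions being made here are the same four the list names: which fields to see, how to organize them, which rows to keep, and which aggregation to apply.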

In this basic data analytical space, which formed the great majority of the product when it was introduced, and when I started using it, Tableau is golden; it makes doing these things about as simple and easy as they can be, and on top of that it provides high-quality visualizations of the aggregated values in context, in both type and rendering. Gold doesn't tarnish, and Tableau's luster here hasn't faded.

But this space isn't the whole of it. There's a lot more to the totality of data analysis than the initial data analytical space, and beyond that initial space there are many places and ways in which Tableau isn't as good as it could be. This blog covers some of the areas where Tableau falls short; there are many, many more that I encounter every day. Some of them are just annoying, like the horrible formatting system. Some are application architecture aspects, like the layout and organization of the workspace, where the data, dashboard, and formatting panes all share the same space, making for a lot of time- and energy-wasting opening and closing. Others are structural, like the leveraging of reference lines to implement pseudo-bullet graphs, which are crude and cartoonish compared to what they could be. The list is very long, and Tableau doesn't seem to be spending any energy fixing what should be better.

Viewed broadly, Tableau is a golden nugget embedded in a matrix of cruft and bolted-on, awkward, barnacled machinery that gets much more in one's way than out of it. Worse yet, it's largely undocumented; but for the immensely impressive work of people in the Tableau community, who've spent vast amounts of time chipping away at it, we'd be largely lost in an impenetrable forest.

He's not handsome, but he sure can hunt.

You may at this point be thinking: why on earth is this guy still using Tableau, if he's so unhappy with it?

I'm glad you asked. It's because, as much as I wish Tableau was better in all the ways I know it could and should be, it's still the best tool ever invented for basic data analysis. Bar none.

But for how long? Tableau's opened up the door and shown the world that data isn't just something that lives in corporate closets, mines, or dungeons. People are ready for and receptive to the idea that they should be able to access and analyze their information easily and simply. The horizons are expanding and the world is primed.

Prospecting.

Now that there's a bona fide demand for simple, easy, straightforward data analysis, the question is:

Where will the next golden tool come from?

Just as Tableau appeared and ushered in a new age, there will be a tool that embraces the principles of easy, simple, straightforward use leading directly to extremely high quality analytical outcomes. One that employs these principles in the basic data analytical space, but expands the operational sphere out and beyond Tableau's ease of use horizons. This new tool will be the next quantum leap forward in data analysis. I'm looking forward to it.

The blueprints for the next golden tool, identifying what it needs to be and do, and how, are already out there, if one knows where and how to look. The only real question is: who's going to build it?

 

4 comments:

  1. Chris,

    What a great blog post! You should write a book. Your history of, involvement with, and thoughts about the path of BI tools over the last 25 years are fascinating.

    I used FOCUS in the early 90's. As a data-querying noob, I loved its simplicity. It opened the door for me to start exploring data. It was, as you say, a Golden Tool. The bank I was working for at the time, which was in Horsham (not too far from Information Builders, in fact), began moving away from mainframes and onto UNIX systems with Oracle databases in the mid 90's. At that time, the group I worked with moved to a SAS platform and used SQL to query Oracle data.

    Through the Dark Ages of BI I continued to leverage SAS for analysis, and SAS and Excel for reporting. I was fortunate to never have gotten involved with any of the 'giants run amok' that you refer to, although they did come knocking.

    I came to Tableau about a year after you did, in 2007, and boy did it pull me in. Having worked with Tableau for the last 8 years, I see it for the awesome tool it is, and for the awesome tool it isn't. I've used Tableau for thousands of hours, watched and read hundreds of hours of training material in the forums, and had experts like Joe Mako, and folks from Tableau's professional services, teach me how to use Tableau.

    One of the things that frustrates me most about Tableau is its inability to give me the big picture of all of my data. What do I mean by that? Just yesterday I was doing an analysis in Tableau on some data that I'm not all that familiar with, and I thought, "I need to dig into this data and quickly see what it looks like. How is it shaped? Where are the outliers?" I got out of Tableau and fired up JMP, from SAS.

    I’ve used JMP on and off for over 10 years. It’s awesome for data discovery. In particular, its humbly named ‘distribution’ platform is amazing. With just a few clicks, JMP shows you your data. It shows you, in histograms, box plots, and tables the shape of your data. To me, seeing all of your data is something almost no one in data analysis/reporting does because so few tools make it easy. JMP makes it easy to see all of your data, but it doesn’t make it easy to share what you see with others. Other tools, however, are quickly moving in that direction, and beyond.

    I recently watched a video of Jeff Heer showing off what he calls a ‘visual profiler’ in a tool he’s helping create named Trifacta. I’ve snipped out a clip where he shows off the visual profiler at https://youtu.be/vc1bq0qIKoA?t=284

    Heer makes a point that the way forward in information visualization is going to move us away from Designers and to Decision Makers and away from Specification towards Exploration. I do hope the people at Tableau are listening to people like Jeff Heer.

    Thank you for the thought-provoking post Chris. Keep up the good work!

    Replies
    1. Thanks, John. There aren't all that many people out there who know about FOCUS and the history of data analysis, fewer still who can relate Tableau to it.

      I'm simultaneously amused and frustrated by the limitations imposed by a lack of awareness about the history of our profession, and the consequences of it.

      One of them is the narrow view of what data analysis could be. Tableau is based upon a particular set of concepts that constrain it within a limited horizon of possibility. I've written a fair bit about Tableau in this regard, concentrating on the need to expand its native concept of data to include structural forms beyond flat record sets (post-access methods of joining, blending, and using Table and LOD calculations aren't native), and to reconsider what the analytical/presentation space is.

      You're correct in identifying Tableau's limitations in whole-data surveying, your "big picture of all of my data". One of the things that attracted me to Tableau when I first started using it was the ability it provided me to access and get the big picture from unfamiliar data by building views that helped me see it, including: lists of dimension members when they're not too numerous, and distinct member counts when they are; distributions of measure values, including max, min, avg, sum, etc.; and a variety of others.

      Building these views is a relatively mechanical process, one that I go through as a matter of course when I come across some new data, which happens pretty much every time I start working with a new client. Frankly, it's silly that Tableau hasn't built this into the product; it would be pretty simple for them to implement a function that says, in effect: point me at some data and I'll build a workbook with a survey of it, with the various parts responding to user-provided thresholds, e.g. the # of distinct members of a Dimension that determines whether I generate a list of them or not.
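
      To make that concrete, here's a minimal sketch in Python with pandas of the kind of survey I mean; it's an illustration of the idea only, not the workbook-generating tool itself, and the 20-member threshold is an arbitrary placeholder.

        # Minimal data-survey sketch: distributions for measures,
        # member lists or distinct counts for dimensions.
        import pandas as pd

        def survey(df: pd.DataFrame, max_members: int = 20) -> None:
            for col in df.columns:
                series = df[col]
                if pd.api.types.is_numeric_dtype(series):
                    # Measures: show the shape of the values.
                    print(f"{col}: min={series.min()}, max={series.max()}, "
                          f"avg={series.mean():.2f}, sum={series.sum()}")
                else:
                    # Dimensions: list members when few, otherwise just count them.
                    members = series.dropna().unique()
                    if len(members) <= max_members:
                        print(f"{col}: {sorted(members)}")
                    else:
                        print(f"{col}: {len(members)} distinct members")

        # e.g. survey(pd.read_csv("new_client_data.csv"))  # hypothetical file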

      I've long thought that creating this tool would be an interesting exercise, but there have been technical challenges, particularly dynamically generating workbooks on the fly. Easy for Tableau, not so easy for the rest of us.


      But, taking a bit of a self-serving tangent... I've recently found a way to crack this nut, or at least the part where I can now programmatically inject dashboards into existing Tableau workbooks. I'll be publishing some tools very soon that take advantage of this, and I'm targeting basic data surveying as an app shortly after that.

  2. Do you have a comment on Actuate? Even if you don't like it, I am curious to see where it fits within your narrative.

    Replies
    1. I've not used Actuate, so can't comment. Perhaps someone out there would like to add something.
