Here’s a trick taken from Pawel Terlecki’s presentation at the Tableau Conference. He mentioned this nugget in passing, and I thought it would be interesting to make it a bit more concrete.
Below, we have a (very) quick and dirty dashboard based on a 50M+ row extract-based Super Store sample dataset. It plots five measures over time and then rolls the same measures up in the labels on the right. This sucker took 5.188 seconds to render out of the box…Not too shabby!
However, we can make this puppy run over 30% faster (33.25%, actually). And we can do so with a very simple change.
Here’s a performance recorder report for the original, “slow” dashboard. Note we have one longer query (that’s the line chart) and five shorter ones, one for each of the labels.
Note the order of execution. The labels are rendered first, followed by the line chart.
That’s kind of a waste, isn’t it? If we can manage to have the line chart execute first, then we have all the measure values we need in cache to roll up the label values without hitting the original data source again.
Of course, we can manage this. I wouldn’t be blogging about it otherwise.
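To make the idea concrete, here is a minimal toy model in Python. It is not Tableau's actual cache; `ToyDashboard`, `ROWS`, and `run` are made-up names for illustration, and the sketch assumes the label values are plain sums that can be rolled up from the line chart's monthly totals. It only counts how many times each view order has to go back to the data source.

```python
from collections import defaultdict

# Hypothetical stand-in for the extract: (month, measure, value) rows.
ROWS = [
    ("2014-01", "Sales", 100.0), ("2014-02", "Sales", 150.0),
    ("2014-01", "Profit", 20.0), ("2014-02", "Profit", 35.0),
]


class ToyDashboard:
    """Toy model only: counts data source scans for a given view order."""

    def __init__(self):
        self.source_hits = 0
        self.monthly_cache = None  # populated once the line chart has run

    def line_chart(self):
        """Monthly totals per measure; always scans the data source."""
        self.source_hits += 1
        totals = defaultdict(float)
        for month, measure, value in ROWS:
            totals[(month, measure)] += value
        self.monthly_cache = dict(totals)
        return self.monthly_cache

    def label(self, measure):
        """Grand total for one measure; rolled up from cache when available."""
        if self.monthly_cache is not None:
            return sum(v for (_, m), v in self.monthly_cache.items() if m == measure)
        self.source_hits += 1  # cache is cold, so scan the source
        return sum(v for _, m, v in ROWS if m == measure)


def run(order):
    """Execute views in the given order and return the number of source scans."""
    dash = ToyDashboard()
    for view in order:
        if view == "line":
            dash.line_chart()
        else:
            dash.label(view)
    return dash.source_hits


labels_first = run(["Sales", "Profit", "line"])
chart_first = run(["line", "Sales", "Profit"])
print(labels_first, chart_first)
```

With the labels first, every view scans the source; with the chart first, only the chart does. The roll-up shortcut in this sketch only works because sums are additive, so a measure like a distinct count could not be answered from the monthly totals this way.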
Here is the performance recorder report for the exact same views added to the dashboard in a different order.
Here’s the fun bit – check out all that yummy caching!
See how we’re hitting cache for all of our labels?
This one took 3.463 seconds to run and the order of execution is clearly different. We hit the “big” view first, then take advantage of the cache for the smaller “label” views.
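For anyone who wants to check the math on the improvement, here is a quick sketch using the two timings reported in this post. The variable names are mine, and the numbers come from single performance recorder runs, so treat the result as indicative rather than a benchmark.

```python
slow_seconds = 5.188  # original dashboard (labels execute first)
fast_seconds = 3.463  # reordered dashboard (line chart executes first)

saved_seconds = slow_seconds - fast_seconds
reduction_pct = saved_seconds / slow_seconds * 100  # share of render time removed
speedup = slow_seconds / fast_seconds               # ratio of old time to new time

print(f"saved {saved_seconds:.3f}s, {reduction_pct:.2f}% less render time, {speedup:.2f}x")
```

Note that the percentage is a reduction in render time relative to the original run, which is what the figure quoted earlier refers to.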
Accomplishing this is trivial. Just remember that the order in which views are added to the dashboard sets the order in which they execute at runtime. In previous versions of Tableau (pre-8.2), the name of the view drove the behavior: view “A” executed before “B”, “B” executed before “C”, and so on through “Z”.
However, in 8.2 something (don’t know what, sorry) changed, and now “order of addition” drives the behavior. Perhaps execution order will change back to alphabetical some day in the future? Who knows. Something to keep in mind, though.
Congrats – you’ve just learned a great design pattern to test out for yourself – execute a view which contains dimensions and/or measures used by other vizzes FIRST. Those other vizzes will often be able to take advantage of the work done by what came first.
Is this still relevant in the current versions of Tableau, or is this now optimized automatically under the hood?
Since we now do parallel query execution, this specific scenario wouldn’t really benefit from the cache. The “manual” re-ordering that we do in this post is something that we also (sort of) do automagically in later versions.