NOTE: If you’ve just landed on this article and haven’t read this entry, you really should stop and give it a look. It contains important information I won’t be re-visiting here.
Overview
No surprise: you’re about to see a very close correlation between Tableau Desktop’s performance and how Server behaves.
The data below comes directly from the admin View Performance History report – specifically Compute View.
I installed Server to 6 of the 10 AWS instances I tested previously and published the same workbook as before: This is the “big” workbook with the heavy-duty data source. I ran each Tableau Server with only one VizQL process so that it was easy to see caching in play.
Keep in mind that most of the work we’re doing here assumes the need for good IO due to the large extract that needs to be read into RAM as quickly as possible. Do you really need to run on an instance with SSDs or lots of provisioned IOPS? Maybe not, if you don’t need the fast read for extract loading.
I tested two reports:
- “Big extract” Dashboard (mentioned above)
- The sample 2013 Sales Growth dashboard
I consider the second to be the type of report most folks will be viewing on a daily basis.
The general pattern I followed while testing each instance was:
- Log in and run each report with NO Tableau caches populated: VizQL hasn’t seen these reports yet, nor has the extract been loaded into RAM at the OS level.
- Log out, close browser.
- Log in, run each report again. Fast response expected because we’ll likely go to the tile cache or VizQL Server cache.
- Log out, and use Task Manager to kill the single VizQL Server process running: I’m purposefully blowing up anything in the VizQL Server cache.
- Log in, run each report again. Performance worse than the previous execution, but still better than the first.
- Restart the machine to clear the extract from the OS’s RAM and repeat the steps above until I start feeling silly.
The report we really care about is “big”, so I also ran it through the same steps on my externally facing Tableau Server (4 core i5 CPU, 32 GB RAM, SSD) and on Server running under Parallels on my Mac (resources granted to VM: 4 core i7 CPU, 4 GB RAM, backed by SSD). I brought both down to 1 VizQL Server process for the test.
External Server:
- Initial Load: 34.39 sec
- Cached Load: .55 sec
- Post VizQL Server reset: 27.4 sec
Parallels VM on Mac:
- Initial Load: 17.14 sec (faster than I expected!)
- Cached Load: .11 sec
- Post VizQL Server reset: 17.48 sec
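To put those six numbers side by side, here is a minimal Python sketch. It only does arithmetic on the timings listed above; the dictionary keys and the `summarize` helper are names I made up for illustration and have nothing to do with Tableau itself.

```python
# Timings in seconds, copied from the two lists above.
# The key names ("initial", "cached", "post_reset") are illustrative only.
timings = {
    "External Server": {"initial": 34.39, "cached": 0.55, "post_reset": 27.4},
    "Parallels VM on Mac": {"initial": 17.14, "cached": 0.11, "post_reset": 17.48},
}


def summarize(t):
    """Return (cached speedup factor, post-reset time as a fraction of initial)."""
    return t["initial"] / t["cached"], t["post_reset"] / t["initial"]


for name, t in timings.items():
    speedup, reset_fraction = summarize(t)
    print(
        f"{name}: cached load is ~{speedup:.0f}x faster than the initial load; "
        f"post-reset load takes {reset_fraction:.0%} of the initial load time"
    )
```

One thing the ratios make obvious: on the external server the post-reset load is noticeably quicker than the initial load, while on the Parallels VM it is essentially the same as the initial load.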
And now, the results:
IMPORTANT NOTE: I’m using 10,000 IOPS as a stand-in “tag” for Instance Storage on an SSD. I did not create an EBS volume that delivered 10,000 provisioned IOPS. Instance Storage is explained in the previous blog entry on this topic. Where you see 0 IOPS, I’m using a standard volume, also discussed earlier.
Let’s start with the 2013 Sales Growth dashboard. As you can see, Tableau makes quick work of this baby across the board on all hardware:
Since these are going to render relatively quickly anyway, it is difficult to tell how much the cache is or isn’t coming into play. I’d assume the < 1 second executions are coming out of tile cache, the 1-2 second renders from the VizQL Server cache, and the 3+ second executions are being done “from scratch” – but it’s hard to say for sure.
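If you want to apply that rule of thumb to your own Compute View numbers, it could be sketched like this in Python. To be clear, the thresholds are just my guesses from the paragraph above, not anything Tableau reports, and `likely_cache_tier` is a hypothetical helper name. Anything between 2 and 3 seconds falls outside my guess, so the sketch labels it "unclear" rather than pretending to know.

```python
def likely_cache_tier(seconds: float) -> str:
    """Guess where a render came from, based only on its duration.

    Thresholds are the author's rough assumptions, not documented Tableau behavior.
    """
    if seconds < 1.0:
        return "tile cache"
    if seconds <= 2.0:
        return "VizQL Server cache"
    if seconds >= 3.0:
        return "from scratch"
    # The 2-3 second band is not covered by the rule of thumb.
    return "unclear"


# Sample durations are made up for illustration.
for duration in (0.27, 1.5, 2.5, 3.4):
    print(f"{duration:>5.2f} sec -> {likely_cache_tier(duration)}")
```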
Note that our “big” report can also come back very quickly if cached correctly. Yowza!
It is very easy to see caching at work when rendering the “big” dashboard…here, I’ll make it even easier!:
We’re able to deliver sub-second performance on this big report in the green zone – that’s gotta be the tile cache at work.
In the yellow zone, we’re generally spending ~50-60% or less time rendering a report than in the red zone on the same type of machine. In the red zone, we have to wait until the OS loads the extract into RAM; in the yellow zone we don’t, plus we’re probably getting some love from VizQL Server caching.
Here’s the same information expressed in a slightly different way:
…the Upper Whisker always displays the longest, “nothing is cached” load, while the Lower Whisker shows a fully cached execution.
And finally, it might be fun to overlay both the Desktop data set and the Server data set to see any differences:
Yet another note: I “hand-timed” rendering of reports in Desktop – just me, my Nokia, and a stopwatch app. I noticed that there was often a good half-a-second+ difference between what my stopwatch was ticking off and what Tableau’s “executing query” dialog read. So, when you see only a second-ish difference between a Desktop and Server reading on the same hardware, you probably should just consider it a wash.
If you gift Desktop an extra second due to my arthritic stopwatch thumb, I think you’ll find the rendering times are remarkably similar between Server and her thick-client cousin.
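The “consider it a wash” rule can be written down as a tiny Python comparison. This is only a sketch: the one-second tolerance is my reading of “a second-ish”, `compare` is a hypothetical helper, and the sample timings in the usage lines are invented, not measurements from the charts.

```python
# Assumption: hand-timed Desktop readings can be off by roughly a second.
STOPWATCH_TOLERANCE_SEC = 1.0


def compare(desktop_sec: float, server_sec: float,
            tolerance: float = STOPWATCH_TOLERANCE_SEC) -> str:
    """Compare a hand-timed Desktop render against a Server Compute View time."""
    diff = desktop_sec - server_sec
    if abs(diff) <= tolerance:
        return "wash"  # difference is within stopwatch error
    return "Server faster" if diff > 0 else "Desktop faster"


# Invented sample values, for illustration only.
print(compare(18.0, 17.4))  # within tolerance
print(compare(20.0, 17.0))  # Desktop slower by more than the tolerance
```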
I was frankly surprised to see that Server fared better than Desktop on the m3.large instance (2 cores @ 2.5 GHz, 900 IOPS). Can’t explain it, but I don’t see anyone really wanting to run Server on a 2 core box anyway so I’m not going to worry about it.
One other common-sense thing that is probably worth mentioning – the cradle-to-grave experience a user goes through encapsulates “rendering” but also some other server-ish processing that can add some overhead. Don’t always assume a 0.27 second execution displays in the browser in less than a second for the user.
Lessons Learned:
- What’s good for Desktop on EC2 is good for Server
- CPU is most important once data has been retrieved from the data source
- If your data source is an extract, make sure to have decent disk performance or the CPU can’t get to work