EDIT #1: Err, or not. Data Server doesn’t cache after all.
Based on what I was told, I ran a little experiment and didn’t see what I expected (Data Server wasn’t caching), so I went digging further. I pinged one of our developers, and he confirmed that we do not do this yet.
Keeping the original post “up” so that anyone I’ve managed to confuse can come back here for follow-up and see the update.
Sorry, folks. As RR said, “Trust, but verify” (…“before you go blogging and look silly” – that part didn’t make it into the press conference)
EDIT #2: Ha! Data Server does cache! But only a tiny little bit, and in ways that will be pretty difficult for you to take advantage of. Just forget you saw this post. Really.
Why? 5 letters for you: C-A-C-H-E.
As of Tableau Server 8.0, the Data Server processes also have the same capability to cache data as VizQL Server processes do. This is actually quite cool, if you think about it:
I ask a question:
…And the answer is cached by a VizQL Server Process (with fairly well-known exceptions based on the use of user filters).
If someone lands on the same VizQL Server Process with a like question… Bingo! Cache hit!
If the question goes to another VizQL Server Process, then (sigh), a cache miss. We go to the database.
BUT…What if the report is based off of a Data Server data source? Well then, fair reader – The VizQL Server Process that just had the cache miss can STILL get the data it needs from the Data Server’s cache w/o a real round-trip to the database. Everybody wins! Life is good.
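To make the scenario in this paragraph concrete, here is a toy sketch in Python. It is not Tableau code and says nothing about how Data Server really behaves (see the edits at the top of this post). The names make_lookup, shared_cache, and query_database are all made up for illustration: each "VizQL process" gets its own local cache, and a shared dictionary stands in for the Data Server cache.

```python
def make_lookup(shared_cache, query_database):
    """Build a lookup function with its own local cache (one per pretend process)."""
    local_cache = {}

    def lookup(question):
        # 1. Same process has seen this question before: local cache hit.
        if question in local_cache:
            return local_cache[question], "local hit"
        # 2. Another process already asked: the shared tier has the answer.
        if question in shared_cache:
            answer, source = shared_cache[question], "shared hit"
        else:
            # 3. Nobody has asked yet: a real round-trip to the database.
            answer, source = query_database(question), "database"
            shared_cache[question] = answer
        local_cache[question] = answer
        return answer, source

    return lookup


db_calls = []  # records every pretend database round-trip


def query_database(question):
    db_calls.append(question)
    return f"answer to {question}"


shared = {}  # hypothetical stand-in for the shared (Data Server) tier
vizql_a = make_lookup(shared, query_database)
vizql_b = make_lookup(shared, query_database)

print(vizql_a("sales by region"))  # first ask goes to the database
print(vizql_b("sales by region"))  # other process misses locally, hits shared tier
print(vizql_a("sales by region"))  # same process again: local hit
```

In this toy model the database is touched once per distinct question, no matter which process the question lands on.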
Don’t. Do. This.
And what do the fine folks who speak on Tableau Server performance say about running too many VizQL Server processes on your machine just because you can? They say “No”. Why? Because you artificially lower your cache hit ratio by spreading out “answers” across more processes rather than cramming them all into a couple of places.
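If you want to see the cache hit ratio argument in numbers, here is a small simulation sketch in Python. It rests on assumptions that are mine, not Tableau's: requests are routed to a random process, each process keeps its own cache with no eviction, and the workload sizes (num_requests, num_questions) are arbitrary. The function name hit_ratio is hypothetical too.

```python
import random


def hit_ratio(num_processes, num_requests=1000, num_questions=100, seed=0):
    """Fraction of requests answered from a per-process cache.

    Assumes random routing and caches that never evict (a simplification).
    """
    rng = random.Random(seed)
    caches = [set() for _ in range(num_processes)]
    hits = 0
    for _ in range(num_requests):
        question = rng.randrange(num_questions)        # which question is asked
        cache = caches[rng.randrange(num_processes)]   # which process it lands on
        if question in cache:
            hits += 1
        else:
            cache.add(question)  # a miss: go to the database, then cache the answer
    return hits / num_requests


for n in (1, 2, 4, 8):
    print(f"{n} process(es): hit ratio {hit_ratio(n):.2f}")
```

With the same workload, every extra process has to "learn" each answer for itself, so the hit ratio in this model drops as the process count goes up.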
Do the same rules apply for Data Server? You betcha.
If you NEED more than one or two Data Server processes, by all means, run them. But generally, you don’t. From help:
http://onlinehelp.tableausoftware.com/current/server/en-us/help.htm#perf_extracts_view.htm
“The user loads for the application server and data server processes can typically be handled by 1 process each but they are set to 2 to provide redundancy.”
It’s actually ironic that the same help topic shows 4 data servers running on a 2-node highly available implementation. Overkill.
So, campers – repeat after me: “Data Server processes can cache, so I will treat them with the respect they deserve!”