History Table newbie? Start here.
In previous posts you’ve learned why you might want use the history tables, which are the “most important”, and how to use them.
Now, we cover everything else.
In this entry the main takeaway I want to leave you with has little to do with the tables we’re about to discuss. I want you to be able to JOIN these suckers together correctly.
Alan Smithee pointed something out to me that I hadn’t considered, resulting in this advice:
You’re probably going to be OK doing simple INNER JOINs when dealing with just a few history tables. However, when you start JOINing a couple more (as we’re about to do), INNER JOINs won’t cut it. Instead, make sure you’re using LEFT JOINs so that you don’t inadvertently filter out rows in the historical_events table that don’t match anything in the other tables we’re about to touch on. That would be bad.
And now, more tables.
There is a medium-sized pool of tables that do exactly what you think they do. They hold no surprises.
Shows all comments and edited comments added to visualizations on the server
Lists all groups created on the server. Currently only tracks group creation events, not edits to the group (example: changing the name of a group will not cause this table to change)
Shows licensing roles which have been utilized on server: Unlicensed, Guest, Viewer, Interactor. A row will not appear until a user is licensed using the role in question.
Records info on projects which have been created, edited, and deleted.
Records information about sites which have been created, edited, and deleted.
Keep in mind that all of these tables focus on recording and storing history. So, if someone adds a comment like “The stock ticker is $DABA”, we add a row to hist_comments. When the user realizes they misspelled the Tableau stock ticker and changes the comment to “The stock ticker is $DATA”, we add another row. We don’t just overwrite the first.
History, baby, history:
Another example: I create a site called “Hey There”, and then rename it to “Bye There”:
The next group of tables are also pretty vanilla, except for a revision field:
As users overwrite an existing copy of a data source, view, or workbook, the revision field gets incremented. Here I am repeatedly overwriting a single workbook with a different version as I test the functionality of Parallel Dashboard rendering (a cool v8 feature):
The hist_datasources table contains data source-specific information such as the source’s name and size (if an extract is in play)
View (worksheet) information
Two more tables to go – these are a little bit more interesting.
hist_users contains a few fields which need some additional discussion.
Since a user can be a member of one or more multi-tenant sites, you’ll see two User ID fields in each row. system_user_id is the unique identifier for the user inside server itself. user_id is the id for the same user in a particular site. If the user is a member of two sites, there will be two rows. Each of the two rows will have the same system_user_id, but different user_id values.
There are also some some key-value pairs defined below that are sort of fun. These will tell you if the user is a System or Content (site) administrator and/or if the user has Publisher permissions.
Ah, hist_schedules. It is chock full of information, some of which isn’t immediately human-readable.
The scheduled_action field will show either 0 or 1. 0 = Extract/Refresh, 1 = Subscription.
The day_of_week_mask field contains a masked number which tells us what days the schedule in question will fire. Here’s the secret formula:
Su = 1
M = 2
T = 4
W = 8
Th = 16
Fr = 32
Sa = 64
So, a A schedule which fires every Monday and Sunday will have a day_of_week_mask of 3 (2 + 1). A schedule which fires every weekday will have a day_of_week_mask of 62 (2 + 4 + 8 + 16 + 32)
The day_of_month_mask does the same thing, but using exponents. The value stored here will be 2 raised to the power of (day of month – 1)
A schedule which fires on the 10th of the month will have a mask of 512: 2^9
A schedule which fires on the 23rd of the month will have a mask of 4,194,304: 2^22
minute_interval is much more simple. It contains the minute value a schedule will fire. If an hourly schedule is set to execute every 15 minutes, you’ll see 15. If another schedule is set to fire every two hours, you’ll see 120.
Finally, end_at_minute tells us the minute of the day when an hourly schedule will STOP firing if the schedule has been configured to only run during a certain part of the day. For example, an hourly schedule that runs only between 12 AM and 6 AM will show 0 for start_at_minute and 360 for end_at_minute.
That’s it. We’re done. If you’ve read this far you’re a masochist. Really. You need help.
But, you’re a masochist about to get a reward. Here is a zip file with all artifacts we’ve discussed thus far, including some notes that I took while exploring.
Use it in good health!