Wouldn’t it be nice if you could have a single place to quickly monitor your Redshift cluster’s performance? Even better, wouldn’t it be nice if that place was Tableau so you can drill and play at will?
An acquaintance from AWS turned me on to a project that captures Redshift performance metrics and stores them in CloudWatch via a Lambda function. Cool idea, but we’d rather have a mechanism that makes this stuff available in a database that Tableau can connect to directly.
So, we’re building it.
We hope to release the project to the community around Tableau Conference time. We’re pretty much leveraging conventional wisdom around Redshift performance (in the form of perf-related queries others have already written). We’re just putting it all in one place without the copying and pasting on your part.
A very basic prototype is done, and I need some folks to set it up, report the voluminous errors I expect to see during the first few weeks, and give feedback about what else would be nice to see.
Hell, build me a cool dashboard or two and you can be a co-author.
How it works:
- You stand up a small RDS PostgreSQL instance and run a script to create a database and table in said instance of pgsql
- You create a Lambda function with a zip file I provide. The Lambda function connects to Redshift, grabs performance metrics, and drops them into RDS.
- You add a trigger to the Lambda function to execute every hour
- You use Tableau Desktop to point the sample workbook I provide at RDS and go to town
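To give you a feel for what the Lambda in step 2 is doing each hour, here’s a minimal sketch. The connection plumbing (a Postgres client talking to Redshift and RDS) is elided, and the table and column names here are illustrative, not the real schema from the setup script — this just shows the shape of the “grab metrics, drop them into RDS” step.

```javascript
// Hypothetical sketch: turn metric rows fetched from Redshift into one
// parameterized INSERT for the RDS metrics table. In the real function,
// a Postgres client (e.g. the `pg` package) runs the perf queries and
// executes this statement; table/column names below are made up.
function buildMetricsInsert(rows) {
  const cols = ["captured_at", "metric", "value"];
  const values = [];
  const placeholders = rows.map((row, i) => {
    const base = i * cols.length;
    values.push(row.capturedAt, row.metric, row.value);
    return `($${base + 1}, $${base + 2}, $${base + 3})`;
  });
  const sql =
    `INSERT INTO redshift_metrics (${cols.join(", ")}) VALUES ` +
    placeholders.join(", ");
  return { sql, values };
}

// Example: two metrics captured in one hourly run.
const batch = buildMetricsInsert([
  { capturedAt: "2017-09-01T12:00:00Z", metric: "avg_query_ms", value: 842 },
  { capturedAt: "2017-09-01T12:00:00Z", metric: "max_query_ms", value: 60311 },
]);
// batch.sql ends with: VALUES ($1, $2, $3), ($4, $5, $6)
```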
You should be:
- Kind of technical. You should be able to handle the activities above by using Help with little-to-no support from yours truly. I’ll whip up a simple setup doc, but it’ll assume you are self-sufficient.
- Able & willing to collect / provide error logs out of CloudWatch. I’ll document how if you don’t know how already.
- Someone who likes to provide feedback and bring ideas to the table.
In a perfect world, you’ll also be willing to give me direct access to your RDS instance so I can easily see what “real customer” output looks like. My biggest challenge up to this point is getting enough real-world data points to make something that is cool and useful. Know a little bit of JavaScript? You’ll be my bestie – the tool is written with NodeJS and you can hack at it along with me.
Here are a few quick and dirty dashboards to give you a feel for the type of things we’re up to…
Below, we’re monitoring average and maximum query runtime on Redshift. We’re also keeping an eye on whether queries are spilling to disk, if your queries are queuing, and whether vacuum operations are occurring during the day.
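If you’re curious what powers a viz like that, it’s roughly this kind of Redshift system-table query — shown here as a Node.js constant the way the tool carries its queries. This is an illustrative sketch, not the exact query the tool ships; it leans on `stl_query`, which records start and end times per query.

```javascript
// Illustrative only: average and max query runtime per hour for the
// last day, from Redshift's stl_query system table. (Disk spill comes
// from a similar query against svl_query_summary's is_diskbased flag.)
const runtimeByHourSql = `
  SELECT date_trunc('hour', starttime)                  AS hour,
         AVG(datediff(milliseconds, starttime, endtime)) AS avg_runtime_ms,
         MAX(datediff(milliseconds, starttime, endtime)) AS max_runtime_ms
  FROM   stl_query
  WHERE  starttime > dateadd(day, -1, getdate())
  GROUP BY 1
  ORDER BY 1;
`;
```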
This one is tracking the most “expensive” queries over the last 7 days. It’s also watching for Redshift Performance Alerts and surfacing them to you, along with the ID of the offending query and what to do about it.
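The alerts come from Redshift itself: the optimizer logs warnings to `stl_alert_event_log`, along with a suggested fix and the offending query’s ID, which is exactly what that viz surfaces. Again, a sketch rather than the tool’s exact query:

```javascript
// Illustrative only: recent performance alerts from Redshift's
// stl_alert_event_log -- the "event" column is the warning, "solution"
// is Redshift's suggested fix, and "query" is the offending query id.
const recentAlertsSql = `
  SELECT l.query,
         TRIM(l.event)    AS alert,
         TRIM(l.solution) AS suggested_fix,
         l.event_time
  FROM   stl_alert_event_log l
  WHERE  l.event_time > dateadd(day, -7, getdate())
  ORDER BY l.event_time DESC;
`;
```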
We also have vizzes tracking skew, tables without statistics, metrics around HOW things are being filtered (so you know where to add your sort keys), etc. Anyway, you’ll see.
If you’re interested, please use the form behind the Contact menu at the top of the page to tell me who you are and what you’d like to do. Thanks!