Since moving to Tableau, I haven’t had the chance to play with Denali CTP 3, which makes me a bit sad.
I decided to remedy that by testing the Tableau extract performance enhancements in 6.1 against Denali CTP3. I tested eight scenarios three times each:
- Importing a simple table using 6.0
- Importing the same simple table (with a resident COLUMN STORE index) using 6.0
- Importing a simple table using 6.1
- Importing the same simple table (with a resident COLUMN STORE index) using 6.1
- Importing a complex table using 6.0
- Importing the same complex table (with a resident COLUMN STORE index) using 6.0
- Importing a complex table using 6.1
- Importing the same complex table (with a resident COLUMN STORE index) using 6.1
The “simple” scenario consisted of importing a single 15-column fact table containing approximately 5.8M rows. As necessary, I would enable a COLUMN STORE index on all columns of the table, then issue a query and check the SHOW PLAN output to make sure the in-memory index was indeed being utilized.
The “complex” scenario utilized the same fact table, but joined it to ten other dimension tables. In this scenario, I extracted the same number of rows, but I end up with 115 columns. Here’s the SQL kicked out by Tableau:
SELECT [FactDailyResponse].[Actual_Cost] AS [Actual_Cost],
[FactDailyResponse].[Actual_Units_Count] AS [Actual_Units_Count],
[DimPlacement].[Ad_Serving_Method] AS [Ad_Serving_Method],
[DimPlacement].[Ad_Serving_Method_Code] AS [Ad_Serving_Method_Code],
[DimPlacement].[Ad_Type_Name] AS [Ad_Type_Name],
[DimClient].[AgencyTypes] AS [AgencyTypes],
[FactDailyResponse].[Avg_Position] AS [Avg_Position],
[DimDate].[BI_WEEKLY_1_DT] AS [BI_WEEKLY_1_DT],
[DimDate].[BI_WEEKLY_2_DT] AS [BI_WEEKLY_2_DT],
[DimPlacement].[Buy_Model_Name] AS [Buy_Model_Name],
[DimCampaign].[CAMPAIGN_LAST_PL_END_DT] AS [CAMPAIGN_LAST_PL_END_DT],
[DimCampaign].[CAMPAIGN_LEAST_PL_START_DT] AS [CAMPAIGN_LEAST_PL_START_DT],
[FactDailyResponse].[Calculated Cost] AS [Calculated Cost],
[DimCampaign].[Campaign_Activity_Status] AS [Campaign_Activity_Status],
[DimCampaign].[Campaign_Country] AS [Campaign_Country],
[FactDailyResponse].[Campaign_Key] AS [Campaign_Key],
[DimCampaign].[Campaign_Location] AS [Campaign_Location],
[DimCampaign].[Campaign_Name] AS [Campaign_Name],
[DimCampaign].[Campaign_Notes] AS [Campaign_Notes],
[DimCampaign].[Campaign_Region] AS [Campaign_Region],
[DimAd].[Click_Through_URL] AS [Click_Through_URL],
[FactDailyResponse].[Clicks] AS [Clicks],
[DimClient].[Client_Country] AS [Client_Country],
[DimClient].[Client_Location] AS [Client_Location],
[DimCampaign].[Client_Name] AS [Client_Name],
[DimClient].[Client_Network_Name] AS [Client_Network_Name],
[DimClient].[Client_Region] AS [Client_Region],
[DimClient].[Client_Url] AS [Client_Url],
[DimCreative].[Creative_Brand] AS [Creative_Brand],
[DimCreative].[Creative_Descriptor] AS [Creative_Descriptor],
[DimCreative].[Creative_Dimensions] AS [Creative_Dimensions],
[DimCreative].[Creative_File_Name] AS [Creative_File_Name],
[DimCreative].[Creative_File_Size] AS [Creative_File_Size],
[DimCreative].[Creative_Group_1] AS [Creative_Group_1],
[DimCreative].[Creative_Group_2] AS [Creative_Group_2],
[DimCreative].[Creative_Message] AS [Creative_Message],
[DimCreative].[Creative_Name] AS [Creative_Name],
[DimCreative].[Creative_Offer] AS [Creative_Offer],
[DimCreative].[Creative_Product] AS [Creative_Product],
[DimClientToPlacement].[Creative_Type] AS [Creative_Type],
[DimAd].[DFA_Ad_Name] AS [DFA_Ad_Name],
[DimAd].[DFA_Buy_Name] AS [DFA_Buy_Name],
[DimAd].[DFA_Creative_Name] AS [DFA_Creative_Name],
[DimAd].[DFA_Keyword] AS [DFA_Keyword],
[DimAd].[DFA_Section_Name] AS [DFA_Section_Name],
[DimAd].[DFA_Site_Name] AS [DFA_Site_Name],
[DimDate].[Date_As_Int] AS [Date_As_Int],
[DimDate].[Day_Of_Month_No] AS [Day_Of_Month_No],
[DimDate].[Day_Of_Week_Short] AS [Day_Of_Week_Short],
[DimDate].[Day_Of_week_Name] AS [Day_Of_week_Name],
[DimClient].[Client_Name] AS [DimClient_Client_Name],
[FactDailyResponse].[Direct Itinerary Tix] AS [Direct Itinerary Tix],
[FactDailyResponse].[Direct PNR] AS [Direct PNR],
[FactDailyResponse].[Direct Sales] AS [Direct Sales],
[DimEvent].[Event_Type_Name] AS [Event_Type_Name],
(CASE
WHEN 0 = ISDATE(CAST([DimDate].[Full_SQL_Date] AS VARCHAR)) THEN NULL
ELSE DATEADD(day, DATEDIFF(day, 0, CAST(CAST([DimDate].[Full_SQL_Date] AS VARCHAR) as datetime)), 0) END) AS [Full_SQL_Date],
[FactDailyResponse].[Impressions] AS [Impressions],
[FactDailyResponse].[Indirect Itinerary Tix] AS [Indirect Itinerary Tix],
[FactDailyResponse].[Indirect PNR] AS [Indirect PNR],
[FactDailyResponse].[Indirect Sales] AS [Indirect Sales],
[DimPlacement].[Is_Archived] AS [Is_Archived],
[FactDailyResponse].[Is_In_Acquisition] AS [Is_In_Acquisition],
[DimPlacement].[Is_Paid] AS [Is_Paid],
[DimEvent].[Last_Load_Dt] AS [Last_Load_Dt],
[DimDate].[Month_Dt] AS [Month_Dt],
[DimDate].[Month_Name] AS [Month_Name],
[DimDate].[Month_Of_Year_No] AS [Month_Of_Year_No],
[DimDate].[Month_Short] AS [Month_Short],
[DimDate].[Month_of_Quarter_No] AS [Month_of_Quarter_No],
1 AS [Number of Records],
[DimAd].[PARSED_COUNTRY] AS [PARSED_COUNTRY],
[DimAd].[PARSED_KEYWORD] AS [PARSED_KEYWORD],
[DimAd].[PARSED_PRODUCT] AS [PARSED_PRODUCT],
[DimAd].[PARSED_STATE] AS [PARSED_STATE],
[DimAd].[PROMO] AS [PROMO],
[DimPlacement].[Package_Name] AS [Package_Name],
[DimPlacement].[Placement_Dimensions] AS [Placement_Dimensions],
[DimPlacement].[Placement_End_Dt] AS [Placement_End_Dt],
[DimPlacement].[Placement_Name] AS [Placement_Name],
[DimPlacement].[Placement_Start_Dt] AS [Placement_Start_Dt],
[DimPlacement].[Platform_Name] AS [Platform_Name],
[DimPlacement].[Pricing_Type_Name] AS [Pricing_Type_Name],
[DimDate].[QuarterName] AS [QuarterName],
[DimDate].[Quarter] AS [Quarter],
[DimResponseType].[Response_Group] AS [Response_Group],
[DimResponseType].[Response_Type] AS [Response_Type],
[DimEvent].[Spotlight_Category] AS [Spotlight_Category],
[DimEvent].[Spotlight_Name] AS [Spotlight_Name],
[DimEvent].[Spotlight_Note] AS [Spotlight_Note],
[DimEvent].[Spotlight_Type] AS [Spotlight_Type],
[DimEvent].[Spotlight_role] AS [Spotlight_role],
[DimPlacement].[Target_Market] AS [Target_Market],
[DimCreative].[Technology_Type_Name] AS [Technology_Type_Name],
[DimPlacement].[Unit_Type] AS [Unit_Type],
[DimVendor].[Vendor_Name] AS [Vendor_Name],
[DimVendor].[Vendor_URL] AS [Vendor_URL],
[DimDate].[WEEK_KEY] AS [WEEK_KEY],
[DimDate].[Week_Start_Mon_Dt] AS [Week_Start_Mon_Dt],
[DimDate].[Week_Start_Sun_Dt] AS [Week_Start_Sun_Dt],
[DimDate].[Year-Quarter Key] AS [Year-Quarter Key],
[DimDate].[Year-Quarter] AS [Year-Quarter],
[DimDate].[Year-Week Key] AS [Year-Week Key],
[DimDate].[Year-Week] AS [Year-Week],
[DimDate].[Year] AS [Year],
[DimClientToPlacement].[Buy_Model_Name] AS [dimClientToPlacement_Buy_Model_Name],
[DimClientToPlacement].[Campaign_Name] AS [dimClientToPlacement_Campaign_Name],
[DimClientToPlacement].[Client_Name] AS [dimClientToPlacement_Client_Name],
[DimClientToPlacement].[Client_Network_Name] AS [dimClientToPlacement_Client_Network_Name],
[DimClientToPlacement].[DFA_Site_Name] AS [dimClientToPlacement_DFA_Site_Name],
[DimClientToPlacement].[Placement_Name] AS [dimClientToPlacement_Placement_Name],
[DimClientToPlacement].[Target_Market] AS [dimClientToPlacement_Target_Market],
[DimEvent].[tag_method] AS [tag_method],
[DimCreative].[Client_Name] AS [view_DimCreative_Client_Name],
[DimCreative].[Creative_Type] AS [view_DimCreative_Creative_Type],
[DimPlacement].[Client_Name] AS [view_DimPlacement_Client_Name]
FROM [dbo].[view_FactDailyResponse] [FactDailyResponse]
INNER JOIN [dbo].[DimAd] [DimAd] ON ([FactDailyResponse].[Ad_Key] = [DimAd].[Ad_Key])
INNER JOIN [dbo].[DimEvent] [DimEvent] ON ([FactDailyResponse].[Event_Key] = [DimEvent].[Event_Key])
INNER JOIN [dbo].[view_DimCampaign] [DimCampaign] ON ([FactDailyResponse].[Campaign_Key] = [DimCampaign].[Campaign_Key])
INNER JOIN [dbo].[DimClient] [DimClient] ON ([FactDailyResponse].[Client_Key] = [DimClient].[Client_Key])
INNER JOIN [dbo].[view_DimPlacement] [DimPlacement] ON ([FactDailyResponse].[Placement_Key] = [DimPlacement].[Placement_Key])
INNER JOIN [dbo].[dimClientToPlacement] [DimClientToPlacement] ON ([FactDailyResponse].[Creative_Key] = [DimClientToPlacement].[Creative_Key])
INNER JOIN [dbo].[view_DimCreative] [DimCreative] ON ([FactDailyResponse].[Creative_Key] = [DimCreative].[Creative_Key])
INNER JOIN [dbo].[DimResponseType] [DimResponseType] ON ([FactDailyResponse].[Response_Type_Key] = [DimResponseType].[Response_Type_Key])
INNER JOIN [dbo].[DimVendor] [DimVendor] ON ([FactDailyResponse].[Vendor_Key] = [DimVendor].[Vendor_Key])
INNER JOIN [dbo].[view_DimDate] [DimDate] ON ([FactDailyResponse].[Date_Key] = [DimDate].[Date_Key])
I did this work on my personal laptop – a 6 GB i7 machine. Tableau was installed on the local machine, while SQL Server ran on a VirtualBox VM with 3 cores and 2 GB of RAM. During tests, sqlserver.exe never consumed more than about 750 MB of RAM, so it didn’t appear memory was ever a constraint.
It’s clear that 6.1 is good stuff in both complex and simple scenarios. Denali’s COLUMN STORE index does seem to impact loading positively, too – as long as the query is relatively complex vs a situation where we’re doing a single table scan.
Keep in mind that these deductions come from a grand total of 3 tests per case – so time allowing, I’ll do some more. Also, the usual disclaimers apply – this is not official guidance from Tableau or Microsoft and your mileage may vary signifcantly.