forums
Use these forums to ask questions and discuss Tableau.
Join problem
Hi all
I'm new to the world of SQL, and trying to sort out this problem. I have data from two different tables I'd like to present on the same worksheet, but when I join them, I'm getting the wrong result. Let me explain ...
I have a project tracking database and I'm trying to pull out budget vs actual by project. (I've simplified the table descriptions, but you'll get the idea.)
There is a table called Budgets with fields:
Project ID
Task
Person
Budget Hours
There is a table called Actuals with fields:
Project ID
Task
Person
Task Date
Actual Hours
If I have a worksheet that queries either table individually, I get the correct results. The problem is when I join them.
When I query for the Budget, I'm getting results that are way too high. When I examine the underlying data, it is showing I'm getting a budget data point returned for every record for which I have a task entry actual.
I've tried joins with ProjectID, ProjectID + TaskID and ProjectID + TaskID + Person and none of them work.
Any ideas?
Thanks
Comments
You are experiencing a common problem with using joins for reporting purposes. Since the data tables are not normalized you are getting a 'cartesian product' i.e. every possible combination of the two tables where the two joined fields are in common. Joins are primarily intended for cases where the joined field in at least one of the tables has a unique value in every record.
There is no quick fix for this. One approach is to aggregate one of your tables at a level where there is only one record for each field you want to join to the other table. I don't believe you can do this using the 'join builder' in Tableau; you will have to create an aggregated view on your database first or write custom SQL to create a subquery.
Another solution if your tables have a lot of common dimensions is to write a Union query that 'stacks' the data sets together as a concatenated set of records, padding out the measures that don't match the other table with Nulls. Then use an aggregation function to combine/compare them in your Tableau sheet. The Export > Data feature in Tableau actually can help with this if you can create an 'Export' view of each data set with harmonized field names, write each to a discrete data table in an Access DB, then write a Union query to stitch them together.
I am not trained on SQL but have learned all of these principles using MS Access. I encountered the exact same problem you are experiencing when I began to try to join tables for data analysis and reporting.