Assuming Referential Integrity for Joins
In some cases, you can improve query performance for joins by selecting the option to Assume Referential Integrity from the Data menu. When you use this option, Tableau will include the joined table in the query only if it is specifically referenced by fields in the view.
Using this setting is appropriate when you know that your data has referential integrity (see definition below) but your database is not enforcing or cannot enforce referential integrity. If you have the option of configuring referential integrity in your database that is a better option than using this setting because it can improve performance both in the database and in Tableau. The Assume Referential Integrity option in Tableau can only affect performance on Tableau's end. If your data does not have referential integrity and you turn on this setting, query results may not be reliable.
To understand what referential integrity is, imagine connecting to sales data that has two tables: Sales and Product Catalog. These two tables are shown below:
Sales
|
Product Catalog
|
Because all products that are sold must have a listing in the Product Catalog, every row in the Sales table has a matching row in the Product Catalog table. When these two tables are joined on Product ID, you end up with a table that looks like this:
Product ID | Product Name | Product ID | Sale Amount | Transaction Date |
1 | 10 Inch Tablet | 1 | 100 | 10/1/2012 |
1 | 10 Inch Tablet | 1 | 2000 | 10/2/2012 |
2 | Smart Phone | 2 | 50 | 9/30/2012 |
3 | Desk Lamp | 3 | 10 | 8/21/2012 |
Now let's say you build a view to look at Sale Amount by Region. By default, dragging the Sale Amount field to the view may create a query like this:
SELECT SUM([Sales Amount]) FROM [Sales] S INNER JOIN [Product Catalog] P ON S.ProductID = P.ProductID
By selecting Assume Referential Integrity, you tell Tableau that the joined tables have referential integrity. In other words, you are confirming that the Sales table will always have a matching row in the Product Catalog table. Because that is true, Tableau doesn't need any information from the Product Catalog table in order to return these results. When you drag the Sales Amount field into the view, Tableau can simplify the query to:
SELECT SUM([Sales Amount]) FROM [Sales]
This simplified query can often return quicker results because it removes the join operation. This option impacts only inner joins and does not affect data sources with a single table.