Improve Performance for Cross-Database Joins

Important: This feature temporarily moves data outside of Tableau. Be sure that the database you’re connected to is from a trusted source.

You can improve performance when joining data from a single file and a single database by allowing Tableau to perform the join using the database instead of Hyper. When this option is enabled, Tableau chooses the fastest option (Hyper or the connected database). If Tableau uses the connected database, the data from the file connection is moved into temporary tables in the database and the join is performed there.

Feature conditions

This option is only available if the following conditions are met:

  • The data source consists of a single file-based connection and a single SQL-based connection.
  • The file is a Microsoft Excel, PDF, or Text (.csv, .txt, .tsv or .tab) file type.
  • The connected database is one of the following:
    • Microsoft SQL Server
    • Oracle
    • PostgreSQL
    • Vertica
    • Teradata
  • The join type is an inner join.
  • In web authoring: The Allow users to use web authoring option is enabled.
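The conditions above can be read as a simple checklist. The following is a minimal sketch of that checklist, not Tableau's actual logic: the `JoinSetup` class, the `database_join_available` function, and the lowercase type labels are hypothetical names introduced for illustration, and the web authoring setting is left out.

```python
from dataclasses import dataclass

# File and database types taken from the list above (labels are illustrative).
SUPPORTED_FILE_TYPES = {"excel", "pdf", "text"}
SUPPORTED_DATABASES = {
    "microsoft sql server", "oracle", "postgresql", "vertica", "teradata",
}

@dataclass
class JoinSetup:
    file_types: list[str]   # one entry per file-based connection
    databases: list[str]    # one entry per SQL-based connection
    join_type: str          # for example "inner" or "left"

def database_join_available(setup: JoinSetup) -> bool:
    """Return True if the setup meets the listed (non-override) conditions."""
    return (
        len(setup.file_types) == 1
        and len(setup.databases) == 1
        and setup.file_types[0].lower() in SUPPORTED_FILE_TYPES
        and setup.databases[0].lower() in SUPPORTED_DATABASES
        and setup.join_type.lower() == "inner"
    )

print(database_join_available(JoinSetup(["excel"], ["PostgreSQL"], "inner")))
print(database_join_available(JoinSetup(["excel"], ["PostgreSQL"], "left")))
```

The first call meets every condition; the second fails only because the join is not an inner join.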

Overriding feature conditions

As an administrator, you can override the file size, join type, and single file connection limitations and force Tableau to use the live database to perform the join. This enables you to experiment and determine optimal performance configurations. The single database connection requirement still applies.

To enable this option, enter the following from the command line:

  • Tableau Desktop: Enter the command tableau.exe -DForceAlternativeFederationEngine=true
  • Tableau Server: Enter the tsm configuration command tsm configuration set -k native_api.force_alternative_federation_engine -v true

    For more information about setting configuration values in Tableau Server, see tsm configuration set Options in the Tableau Server help.

Enable the performance option for cross-database joins

  1. Connect to the first data source.
    • In Tableau Desktop: On the Start page, under Connect, connect to a supported file type or supported database type. This step creates the first connection in the Tableau data source.
    • In web authoring: From the Home or Explore page, click Create > Workbook to start a new workbook and then connect to your data. This step creates the first connection in the Tableau data source.
  2. Select the file or database that you want to connect to, then double-click or drag a table to the canvas.

  3. In the left pane, under Connections, click the Add button (shown as an icon in web authoring) to add your second connection to the Tableau data source.

    The Cross-database join option is displayed.

    Note: If you don't see this option, check that you're using only supported data sources and that you have only two data sources (one file and one database type). Otherwise, the Site Administrator may have set the Cross-Database Joins configuration option to Tableau only.

  4. To change how Tableau performs the join, next to the Cross-database join option, click Edit.
  5. In the Cross-Database Join dialog, select one of the following options, then click OK:
    • Use Tableau or existing databases. This option allows Tableau to choose the fastest option to perform the join: either Hyper or the database you're connected to.
    • Use Tableau only. This option is the default and always uses Hyper to perform the join.

    The Cross-database join option changes from the default option, Using Tableau (using Hyper), to the new option Using your database, depending on what you choose.

    Important: If you select Use Tableau or existing databases, Tableau chooses the fastest option when performing the join. This behavior is pre-determined by a set of criteria including join types. For instance, Tableau always chooses Hyper for non-inner joins.

    If Tableau uses Hyper to perform the join, this process happens in the background and no indicator is shown to identify where the join was performed.

  6. Add one or more join clauses by selecting a field from one data source, a join operator, and a field from the added table. Inspect the join clause to make sure it reflects how you want to connect the tables.

About working with multi-connection data sources

Working with multi-connection data sources is just like working with any other data source, with a few caveats, discussed in this section.

Union data from within a connection

To union data, you must use text tables or Excel tables from the same connection. That is, you can't union tables from different databases. In Tableau Desktop, you can union tables across different Excel workbooks and files in different directories. For more information, see Union tables using wildcard search (Tableau Desktop).

If you need to union data from different databases, use Tableau Prep.

Collation

Collation refers to the rules of a database that determine how string values should be compared and sorted. Usually, the collation is handled by the database. However, when you work with cross-database joins, you might join columns that have different collations.

For example, suppose your cross-database join uses a join key composed of a case-sensitive column from SQL Server and a case-insensitive column from Oracle. In cases like this, Tableau maps certain collations to others to reduce the risk of interpreting values incorrectly.

The following rules are used in cross-database joins:

  • If a column uses collation standards of the International Components for Unicode (ICU), Tableau uses the collation of the other column.
  • If all columns use collation standards of the ICU, Tableau uses the collation of the column of the left table.
  • If no columns use collation standards of the ICU, Tableau uses a binary collation. A binary collation means the locale of the database and data type of the columns determine how string values should be compared and sorted.
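The three rules above amount to a small decision procedure. The sketch below illustrates it for a two-column join key. The function name `pick_join_collation` and the collation labels are hypothetical, and this is only a reading of the rules as stated, not Tableau's implementation.

```python
def pick_join_collation(left: str, right: str,
                        left_is_icu: bool, right_is_icu: bool) -> str:
    """Pick a collation for a join key made of a left and a right column."""
    if left_is_icu and right_is_icu:
        # All columns use ICU: use the collation of the left table's column.
        return left
    if left_is_icu:
        # Only the left column uses ICU: use the other column's collation.
        return right
    if right_is_icu:
        # Only the right column uses ICU: use the other column's collation.
        return left
    # No column uses ICU: fall back to a binary collation.
    return "binary"

# Hypothetical collation labels, for illustration only.
print(pick_join_collation("icu_en", "db_case_insensitive", True, False))
print(pick_join_collation("db_a", "db_b", False, False))
```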

Maintain case sensitivity for Excel data

If you need to maintain case sensitivity for your Excel data when performing joins, enable the Maintain Character Case (Excel) option from the Data menu.

When this option is selected, Tableau maintains the casing and uniquely identifies values with different casing instead of combining them, resulting in a different number of rows.

For example, consider one worksheet with "House" and another with "house" and "HOUSE". By default, Tableau ignores the casing and considers all three variations of "house" as the same. With the Maintain Character Case (Excel) option enabled, when you join your tables, Tableau preserves the character casing differences. "House", "house", and "HOUSE" are treated as different values.
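The effect on row counts can be illustrated with a small sketch that joins the values from the example above, once ignoring case and once preserving it. The `inner_join_keys` helper and its `maintain_case` flag are hypothetical names that only mimic the behavior described here.

```python
def inner_join_keys(left, right, maintain_case=False):
    """Inner-join two lists of strings, optionally treating case as significant."""
    normalize = (lambda s: s) if maintain_case else str.casefold
    return [(l, r) for l in left for r in right if normalize(l) == normalize(r)]

sheet_a = ["House"]
sheet_b = ["house", "HOUSE"]

default_rows = inner_join_keys(sheet_a, sheet_b)                      # case ignored
case_rows = inner_join_keys(sheet_a, sheet_b, maintain_case=True)     # case preserved

print(len(default_rows), len(case_rows))  # prints "2 0"
```

With case ignored, "House" matches both values in the second sheet. With case preserved, it matches neither, so the join returns a different number of rows.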

Note: This option is available for all Tableau supported languages and isn't dependent on the locale of your operating system. This option is only available for Microsoft Excel data sources.

Calculations and multi-connection data sources

Only a subset of calculations can be used in a multi-connection data source.

  • In Tableau Desktop: You can use a specific calculation if it's both:
    • Supported by all the connections in the multi-connection data source
    • Supported by Tableau extracts.
  • In web authoring (Tableau Cloud and Tableau Server): You can use a specific calculation if it's supported by all the connections in the multi-connection data source.
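One way to picture these rules is as a set intersection. The sketch below assumes each connection and the extract engine can be described by the set of calculations it supports. The function name, connection names, and calculation names are placeholders, not real support lists.

```python
def usable_calculations(connection_support, extract_support, desktop=True):
    """Return the calculations usable in a multi-connection data source.

    connection_support maps each connection name to the set of calculations
    it supports (at least one connection is assumed).
    """
    usable = set.intersection(*connection_support.values())
    if desktop:
        # Tableau Desktop also requires support by Tableau extracts.
        usable &= extract_support
    return usable

# Placeholder support lists, for illustration only.
connections = {
    "database": {"CALC_A", "CALC_B", "CALC_C"},
    "file": {"CALC_A", "CALC_B"},
}
extracts = {"CALC_A", "CALC_C"}

print(sorted(usable_calculations(connections, extracts, desktop=True)))
print(sorted(usable_calculations(connections, extracts, desktop=False)))
```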

Stored procedures

Stored procedures aren't available for multi-connection data sources.

Pivot data from within a connection

To pivot data, you must use text columns or Excel columns from the same connection. That is, you can't include columns from different databases in a pivot.

Make extract files the first connection (Tableau Desktop only)

When connecting to extract files in a multi-connection data source, make sure that the connection to the extract (.hyper) file is the first connection. This preserves any customizations that might be a part of the extract, including changes to default properties, calculated fields, groups, aliases, and so on.

Note: If you must connect to multiple extract files in your multi-connection data source, only the customizations in the extract in the first connection are preserved.

Extracts of multi-connection data sources that contain connections to file-based data (Tableau Desktop only)

If you're publishing an extract of a multi-connection data source with file-based data such as Excel, selecting the Include external files option copies the file-based data as part of the data source. In this case, a copy of your file-based data can be downloaded and its contents accessed by other users. If there's sensitive information in the file-based data that you’ve intentionally excluded from your extract, don't select Include external files when you publish the data source.

For more information about publishing data sources, see Publish a Data Source.

About queries and cross-database joins

For each connection, Tableau sends independent queries to the databases in the join. The results are stored in a temporary table, in the format of an extract file.

Important: Cross-database joins may move data between databases. Be sure the databases you're joining are trusted sources.

For example, suppose you create connections to two tables, dbo.listings and reviews$. These tables are stored in two different databases, SQL Server and Excel. Tableau queries the database in each connection independently. The database performs the query and applies customizations such as filters and calculations, and Tableau stores the results for each connection in a temporary table. In this example, FQ_Temp_1 is the temporary table for the connection to SQL Server and FQ_Temp_2 is the temporary table for the connection to Excel.

SQL Server table

Excel table

When you perform a cross-database join, the temporary tables are joined by Tableau Desktop. These temporary tables are necessary for Tableau to perform cross-database joins.

After the tables have been joined, a Top N filter is applied to limit the number of values shown in the data grid to the first 1,000 rows. This filter is applied to help maintain responsiveness of the data grid and the overall performance of the Data Source page.
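The flow described in this section (independent queries, temporary tables, a local join, then a row limit for the preview) can be imitated with a small sketch. It uses in-memory SQLite databases as stand-ins for SQL Server, Excel, and Tableau's own engine. The column names and sample rows are invented for illustration; only the table names come from the example above.

```python
import sqlite3

# Two independent in-memory databases stand in for SQL Server and Excel.
sql_server = sqlite3.connect(":memory:")
sql_server.execute("CREATE TABLE listings (id INTEGER, name TEXT)")
sql_server.executemany("INSERT INTO listings VALUES (?, ?)",
                       [(1, "Loft"), (2, "Cabin"), (3, "Studio")])

excel = sqlite3.connect(":memory:")
excel.execute("CREATE TABLE reviews (listing_id INTEGER, score INTEGER)")
excel.executemany("INSERT INTO reviews VALUES (?, ?)",
                  [(1, 5), (1, 4), (3, 2)])

# Step 1: each connection is queried independently.
listings_rows = sql_server.execute("SELECT id, name FROM listings").fetchall()
reviews_rows = excel.execute("SELECT listing_id, score FROM reviews").fetchall()

# Step 2: the results are stored in temporary tables in a local engine.
local = sqlite3.connect(":memory:")
local.execute("CREATE TEMP TABLE FQ_Temp_1 (id INTEGER, name TEXT)")
local.execute("CREATE TEMP TABLE FQ_Temp_2 (listing_id INTEGER, score INTEGER)")
local.executemany("INSERT INTO FQ_Temp_1 VALUES (?, ?)", listings_rows)
local.executemany("INSERT INTO FQ_Temp_2 VALUES (?, ?)", reviews_rows)

# Step 3: the temporary tables are joined, and the preview is capped at 1,000 rows.
preview = local.execute(
    "SELECT t1.id, t1.name, t2.score "
    "FROM FQ_Temp_1 AS t1 "
    "JOIN FQ_Temp_2 AS t2 ON t1.id = t2.listing_id "
    "ORDER BY t1.id, t2.score "
    "LIMIT 1000"
).fetchall()

print(preview)  # [(1, 'Loft', 4), (1, 'Loft', 5), (3, 'Studio', 2)]
```

Neither source database sees the other's data in this sketch; the join happens only after both result sets have been copied into the local temporary tables.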

Joined tables
