Create Extracts on the Web

You can extract your data sources on the web (without using Tableau Desktop) to improve data source performance and support additional analytical functions. When you extract a data source, Tableau copies the data from your remote data store to Tableau Server or Tableau Online. To learn more about the benefits of extracting your data, see Extract Your Data. On the web, you can create extracts either in Web Authoring or in Content Server.

Create extracts in Web Authoring

You can create extracts directly in web authoring with default extract settings.

Extract an Embedded Data Source in Web Authoring

Data source page showing the extract connection type

To create an extract in web authoring:

Tip: Finalise your data model before you create the extract. Extract creation can take a long time, and any change to the data model, such as adding new logical tables, invalidates the extract.

  1. Click the Data Source tab in the bottom-left corner of the web authoring pane. New workbooks open on the Data Source tab.
  2. In the top-right corner, change the connection type from Live to Extract.
  3. Click Create Extract. You will see the Creating Extract dialog box.

Extract creation might take a long time, and you can close your authoring session while the extract is being created. To make sure the extract is not lost, click Notify Me When Complete in the dialog box and specify a location where the extracted workbook will be saved. If the extract succeeds, the workbook is saved to the specified location and you are notified that you can continue your web authoring session. If the extract fails, you are notified that the extract could not be created, and you can restore your unsaved changes by reopening the original workbook in web authoring.

Define your Extract Settings

Extract Data page showing the selection for logical or physical tables

Optionally, configure one or more of the following options to tell Tableau how to store, define filters for, and limit the amount of data in your extract:

  • Decide how the extract data should be stored

    You can choose to have Tableau store the data in your extract using one of two structures (schemas): logical tables (denormalised schema) or physical tables (normalised schema). For more information about logical and physical tables, see The Tableau Data Model.

    The option you choose depends on what you need.

    • Logical Tables

      Stores data using one extract table for each logical table in the data source. Physical tables that define a logical table are merged and stored with that logical table. For example, if a data source was made of a single logical table, the data would be stored in a single table. If a data source was made of three logical tables (each containing multiple physical tables), the extract data would be stored in three tables – one for each logical table.

      Select Logical Tables when you want to limit the amount of data in your extract with additional extract properties such as extract filters, aggregation, Top N, or other features that require denormalised data. Also use this option when your data uses pass-through functions (RAWSQL). This is the default structure Tableau uses to store extract data. If you use this option when your extract contains joins, the joins are applied when the extract is created.

    • Physical Tables

      Stores data using one extract table for each physical table in the data source.

      Select Physical Tables if your extract consists of tables combined with one or more equality joins and meets the conditions for using the Physical Tables option listed below. If you use this option, joins are performed at query time.

      This option can potentially improve performance and help reduce the size of the extract file. For more information about how Tableau recommends you use the Physical Tables option, see Tips for using the Physical Tables option in the Tableau Desktop help. In some cases, you can also use this option as a workaround for row-level security. For more information about row-level security using Tableau, see Restrict Access at the Data Row Level in the Tableau Desktop help.

      Conditions for using the Physical Tables option

      To store your extract using the Physical Tables option, the data in your extract must meet all of the conditions listed below.

      • All joins between physical tables are equality (=) joins
      • Data types of the columns used for relationships or joins are identical
      • No pass-through functions (RAWSQL) used
      • No incremental refresh configured
      • No extract filters configured
      • No Top N or sampling configured

      When the extract is stored as physical tables, you cannot append data to it. For logical tables, you can't append data to extracts that have more than one logical table.

    Note: Both the Logical Tables and Physical Tables options only affect how the data in your extract is stored. The options do not affect how tables in your extract are displayed on the Data Source page.
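
The conditions for using the Physical Tables option can be read as a simple checklist. The following is a minimal sketch of that checklist, not Tableau's implementation: the `Join` and `ExtractConfig` classes and their field names are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Join:
    # Hypothetical description of one join between physical tables.
    operator: str      # for example "=", "<", ">="
    left_type: str     # data type of the left join column
    right_type: str    # data type of the right join column


@dataclass
class ExtractConfig:
    # Hypothetical summary of the extract settings discussed above.
    joins: list = field(default_factory=list)
    uses_rawsql: bool = False
    incremental_refresh: bool = False
    extract_filters: int = 0
    top_n_or_sampling: bool = False


def physical_tables_violations(cfg):
    """Return the conditions that rule out the Physical Tables option."""
    problems = []
    if any(j.operator != "=" for j in cfg.joins):
        problems.append("non-equality join")
    if any(j.left_type != j.right_type for j in cfg.joins):
        problems.append("join column data types differ")
    if cfg.uses_rawsql:
        problems.append("pass-through function (RAWSQL) used")
    if cfg.incremental_refresh:
        problems.append("incremental refresh configured")
    if cfg.extract_filters:
        problems.append("extract filters configured")
    if cfg.top_n_or_sampling:
        problems.append("Top N or sampling configured")
    return problems


ok = ExtractConfig(joins=[Join("=", "integer", "integer")])
bad = ExtractConfig(joins=[Join("<", "integer", "string")], extract_filters=1)

print(physical_tables_violations(ok))   # empty list: all conditions are met
print(physical_tables_violations(bad))  # lists each condition that fails
```

An empty result means every condition is met. Any entry in the list means the extract should be stored as logical tables instead.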

  • Determine how much data to extract 

    Click Add to define one or more filters to limit how much data gets extracted based on fields and their values.

  • Aggregate the data in the extract 

    Select Aggregate data for visible dimensions to aggregate the measures using their default aggregation. Aggregating the data consolidates rows, which can reduce the size of the extract file and improve performance.

    When you choose to aggregate the data, you can also select Roll up dates to roll dates up to a specified level, such as Year or Month. The examples below show how the data will be extracted for each aggregation option you can choose.

    • Original data: Each record is shown as a separate row. There are seven rows in your data.
    • Aggregate data for visible dimensions (no roll up): Records with the same date and region have been aggregated into a single row. There are five rows in the extract.
    • Aggregate data for visible dimensions (roll up dates to Month): Dates have been rolled up to the Month level and records with the same region have been aggregated into a single row. There are three rows in the extract.

  • Choose the rows to extract

    Select the number of rows you want to extract.

    You can extract All rows or the Top N rows. Tableau first applies any filters and aggregation, and then extracts the specified number of rows from the filtered and aggregated results. The row options available depend on the type of data source you are extracting from.

    Notes:

    • Not all data sources support sampling. Therefore, you might not see the Sampling option in the Extract Data dialog box.

    • Any fields that you hide on the Data Source page or on the sheet tab before you create the extract are excluded from the extract.
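
The order in which the settings above are applied (filters first, then aggregation, then the row limit) can be sketched in a few lines. This is an illustration only, not how Tableau builds extracts. The sample rows are made up, SUM is assumed as the default aggregation, Top N is simplified to "the first N rows", and `build_extract` is a hypothetical helper.

```python
from collections import defaultdict
from datetime import date

# Hypothetical sample data: (order date, region, sales). Seven rows.
ROWS = [
    (date(2024, 1, 5), "East", 10),
    (date(2024, 1, 5), "East", 20),
    (date(2024, 1, 12), "East", 5),
    (date(2024, 1, 5), "West", 7),
    (date(2024, 2, 3), "West", 8),
    (date(2024, 2, 3), "West", 2),
    (date(2024, 2, 20), "West", 4),
]


def build_extract(rows, keep=None, aggregate=False, roll_up_to_month=False, top_n=None):
    # 1. Extract filters: keep only rows that match the predicate.
    if keep is not None:
        rows = [r for r in rows if keep(r)]
    # 2. Aggregation: one row per (date, region), summing the measure.
    if aggregate:
        totals = defaultdict(int)
        for day, region, sales in rows:
            key_date = day.replace(day=1) if roll_up_to_month else day
            totals[(key_date, region)] += sales
        rows = [(d, region, total) for (d, region), total in totals.items()]
    # 3. Row limit: applied to the filtered and aggregated result.
    if top_n is not None:
        rows = rows[:top_n]
    return rows


print(len(build_extract(ROWS)))                                         # 7
print(len(build_extract(ROWS, aggregate=True)))                         # 5
print(len(build_extract(ROWS, aggregate=True, roll_up_to_month=True)))  # 3
```

With this sample data the row counts match the three cases described above: seven original rows, five rows after aggregation, and three rows after rolling dates up to Month.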

Limitations

  • You can't create extracts for embedded data sources that reference published data sources. As a workaround, create the extract directly on the published data source. For more information, see Extract a Published Data Source on Content Server.
  • You can't create extracts for file-based data sources. File-based data sources already have special performance features, and adding extraction will have no performance benefit.
  • This feature does not apply to bridge-based data sources in Tableau Online.

Create extracts in Content Server

Extract a Published Data Source on Content Server

Data sources page showing the extract option in the menu

To extract a published data source:

  1. Sign in as an administrator or as the owner of the data source.
  2. On the Content tab, select Explore > Data sources.
  3. Select a data source by clicking on the Data Source name.
  4. At the top of the screen, under the Data Source name, select the drop-down menu that says Live.
  5. Change the connection type from Live to Extract. If the extract encryption at rest feature is enabled on the site, select either Encrypted or Unencrypted.
  6. If you see an error message about embedded credentials, embed your credentials in the data source. To do this, click Edit Connection. Select “Embedded password in connection” and then click Save.
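
If you prefer to script this step, the Tableau REST API offers a method for creating an extract for a published data source. The sketch below only builds the request URL and sends nothing. It assumes the `createExtract` endpoint and its `encrypt` query parameter as described in the REST API reference, so verify both against the reference for your server version. The server address, API version, and IDs shown are placeholders.

```python
from urllib.parse import quote, urlencode


def create_extract_url(server, api_version, site_id, datasource_id, encrypt=False):
    """Build the URL for a 'create extract' request on a published data source.

    Assumes the REST API path
    /api/<version>/sites/<site-id>/datasources/<datasource-id>/createExtract
    """
    path = "/api/{}/sites/{}/datasources/{}/createExtract".format(
        quote(api_version, safe=""),
        quote(site_id, safe=""),
        quote(datasource_id, safe=""),
    )
    # The encrypt flag corresponds to the Encrypted / Unencrypted choice in step 5.
    query = urlencode({"encrypt": "true" if encrypt else "false"})
    return "{}{}?{}".format(server.rstrip("/"), path, query)


# Placeholder values for illustration only.
url = create_extract_url(
    "https://tableau.example.com", "3.5", "site-luid", "datasource-luid", encrypt=True
)
print(url)
```

The request itself would be a POST that carries a valid sign-in token, and the same requirements listed under Limitations (for example, embedded credentials) still apply.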

Extract an Embedded Data Source on Content Server

The extract option in the actions menu on the data sources page

To extract one or more data sources that are embedded in a published workbook:

  1. Sign in as an administrator or as the owner of the data source.
  2. Navigate to the published workbook.
  3. Navigate to the Data Sources tab.
  4. Select one or more of the data sources.
  5. Click the Action button.
  6. Click Extract. If the extract encryption at rest feature is enabled on the site, select either Encrypted or Unencrypted.

Limitations

  • Your connection credentials must be embedded in the data source.
  • In the web, you can't specify extract settings like incremental refresh and extract filters.
  • You can't create extracts for embedded data sources that reference published data sources. As a workaround, create the extract directly on the published data source.
  • You can't create extracts for file-based data sources. File-based data sources already have special performance features, and adding extraction will have no performance benefit.
  • This feature does not apply to bridge-based data sources in Tableau Online.

Keep Extracted Data Fresh

After data is extracted, you can optionally set up an extract refresh schedule to keep the data fresh. For more information, see Refresh Data on a Schedule.

Monitor and Manage Extracts

Server administrators can monitor extract creation on the Background Tasks for Extracts admin view. For more information, see Background Tasks for Extracts.

Server administrators can manage extracts on the Jobs page. For more information, see Managing Background Jobs in Tableau Server.

Extract creation jobs, like extract refresh jobs, have a maximum query time limit, after which they time out. This prevents jobs from running forever and using an unbounded amount of server resources. Server administrators can configure this limit with the TSM command line interface configuration setting backgrounder.querylimit. For more information, see tsm configuration set Options.
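
As a rough illustration, the TSM commands for changing this setting can be assembled as follows. The sketch only prints the commands and does not run them. The value of 3600 is an arbitrary example, the unit is assumed to be seconds, and `querylimit_commands` is a hypothetical helper. Check tsm configuration set Options for the default value and the accepted range.

```python
import shlex


def querylimit_commands(seconds):
    """Return the TSM commands that set backgrounder.querylimit and apply the change."""
    if not isinstance(seconds, int) or seconds <= 0:
        raise ValueError("seconds must be a positive integer")
    return [
        ["tsm", "configuration", "set", "-k", "backgrounder.querylimit", "-v", str(seconds)],
        # Configuration changes take effect only after pending changes are applied.
        ["tsm", "pending-changes", "apply"],
    ]


for argv in querylimit_commands(3600):
    print(shlex.join(argv))
```

Applying pending changes can restart Tableau Server, so plan the change for a maintenance window.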

Server administrators can manage web authoring. For more information, see Set a Site’s Web Authoring Access and Functions.
