Refresh Extracts
When the original data changes, you can refresh the extract using Desktop by selecting a data source on the Data menu, and then selecting . Extracts can be set up in two ways: fully refreshed or incrementally refreshed.
With a full refresh, all the data in the extract is replaced with the latest information from the original data source. An incremental refresh has two variations: only add the new rows since the last refresh, or define a time period from the refresh date to refresh. The optional second approach addresses the need to update the extract if there is a possibility that already-extracted data may have changed in the data source. If you want more details on how to extract your data, see Extract Your Data.
Note: You can also automate extracting refreshes using the Tableau Data Extract Command-Line Utility.
Before you refresh extracts
If you want to refresh an extract, make note of the file format of the extract before you perform an extract refresh. If you perform a refresh on an .tde
extract using version 2022.4 or newer, the extract is upgraded to a .hyper
extract automatically. While there are many benefits of upgrading to a .hyper
extract, you’ll be unable to open the extract with previous versions of Tableau Desktop. For more information, see Extract Upgrade to .hyper Format.
Note: .tde files are no longer supported in any form after version 2024.2 of Tableau, and they must be upgraded to .hyper format to be used.
Configure a full extract refresh
By default, extracts are configured to fully refresh. This means that every time you refresh the extract, all of the rows are replaced with the data in the original data source. While this kind of refresh ensures that you have an exact copy of what is in the original data, depending on the size of the extract, a full refresh can sometimes take a long time and be expensive on the database. If an extract isn’t configured for an incremental refresh, selecting to refresh the extract will perform a full refresh of the extract.
Publish to Tableau Server
If you plan to publish the extract as a data source to Tableau Server, you can specify a schedule for the extract refresh during publishing.
Similarly, if you’re publishing the extract in a workbook to Tableau Server, you can also specify a schedule for the extract refresh during publishing. For more information, see Schedule Extract Refreshes as You Publish a Workbook.
Publish to Tableau Cloud
If you plan to publish the extract as a data source to Tableau Cloud, your options for refreshing your data depend on the characteristics of your data sources. For more information on refreshing data from specific data sources, see Extract refresh options by data source.
Configure an incremental extract refresh
Most data sources support an incremental refresh. Rather than refreshing the entire extract, you can configure a refresh to add only the rows that are new since the previous time you extracted the data.
For example, you may have a data source that is updated daily with new sales transactions. Rather than rebuild the entire extract each day, you can add the new transactions that occurred that day. Then once a week you might want to do a full refresh just to be sure you have the most up-to-date data.
You can also optionally configure the incremental settings to define a subrange of time to remove all associated data from the existing extract and refresh it completely, picking up any changes or additions since the last refresh in that time range.
Non-Unique column
Starting from version 2024.1, Tableau introduces a feature that enables users to perform incremental refreshes on extracts using a non-unique key column. This update introduces an additional step in the process.
Subrange refresh for incremental extracts
In Tableau versions 2024.2 and newer, you can specify an additional time period to re-extract previously extracted data and capture any changes that may have occurred. This is done with the Minimum date range to refresh fields. This is useful if (for example) the data source allows for inserts and retroactive modifications within a defined time period. Once configured, the incremental extract will pull any new data, plus previously extracted data from a set time range.
After selecting a number (like 2) and time period (such as Month), the helper text below will update to reflect your choices ("The last 2 months of data from the refresh date will always be refreshed"). If no time range is entered, it will only pull content newer than what is in the existing extract.
Understanding subrange refreshes
Configuring subrange refreshes to work best for you is based on how often you plan to refresh your extracts, how often your data source is updated, and the amount of data normally updated during each time period. It’s important to understand how Tableau interprets your extract configuration choices to ensure the extract contains the data you expect.
- When a number of time units are entered, Tableau will interpret it as pulling all data from the existing time unit plus an additional X units worth of data. For example: If you set it to 3 months, it will pull all data from the current month, plus the previous 3 months worth of data. If the current date is April 1st, it would pull all data from April (in this case 1 day) plus all data from January, February, and March, or approximately 91 days of data. Running the same refresh on April 28th would pull data from 28 days of April, plus the preceding 3 months, or approximately 118 days of data.
- For the most granular control and consistency, best practice is to use the smallest applicable unit of time (such as Day) that makes sense for your data. To always refresh the previous 90 days of data plus the refresh day, choose 90 as the number, and Day as the time period.
- Tableau recommends you base your extract off a date/time field that reflects the date/time when a record was entered. If the date field being used to determine extracts includes future dates (such as "Planned order delivery date"), results using the subrange refresh feature may not be what you expect. For example: an incremental refresh is set up to extract the previous two months of data from the refresh date. It’s April 1st, but there are records in the database where this date field contains values up to 6 months into the future, September 1st. When the refresh occurs in this example, only data from April (the current refresh month), and the previous 2 months of data will be extracted. If changes occurred in the data source for rows associated with these future dates beyond April, they will not get updated.
- If there is more new data in the data source to be refreshed than the configured time period, the extra data pulled begins at the point where new data is available. For example, if you configure the subrange for only the previous 2 days, but the last refresh occurred 10 days ago, then it will extract all 10 days of new data plus an additional 2 days of previously extracted data.
Note: If the data structure of the source data changes (for example, a new column is added), you’ll need to do a full extract refresh before you can start doing incremental refreshes again.
Follow these steps to set up an extract to be refreshed incrementally or edit an existing full extract to use incremental refresh.
-
Select a data source on the Data menu, and then select Extract Data.
-
In the Extract Data dialog box, select All rows as the number of Rows to extract. Incremental refresh can only be defined when you’re extracting all rows in the database.
Note: You can't increment a sample extract.
-
Expand the Incremental Refresh section, and check Incremental refresh.
-
Specify a column in the database that will be used to identify new rows.
For example, you can select a Date field or an ID column that monotonically increases as rows are added to the database.
-
(Optional) If you select a Date or Datetime column in Tableau, you can perform a partial refresh of the extract data by specifying a specific date range. Determine the number of days and specify the date/time scale to establish the minimum date range for the refresh (for example, 14 days from the refresh date).
-
(Optional) If no minimum refresh range is established, expand Advanced Settings to establish how new rows are retrieved. You can choose to replace the last rows added by refreshing values that are equal to or greater than the last recorded value, or you can opt to keep the last rows added by only refreshing the extract with values greater than the last recorded value.
-
To finish, select Save Settings.
Note: Incremental Refresh isn't available if you enable Aggregation.
Note: When filters are added, the Advanced Settings for Incremental Refresh will appear grayed out and the default setting of "Don't replace the last rows added" will be enforced. The purpose of this is to avoid any conflicts that may arise between certain filter criteria and the advanced incremental refresh logic that will "Replace the last rows added".
Considerations when refreshing extracts
Editing an extract
If you’re editing an existing extract, the last refresh is shown so you can be sure you’re updating the extract with the correct data.
Full Refresh
A Full Refresh replaces all of the rows with the data in the original data source every time you refresh the extract. A Full Refresh can take longer and be expensive on the database.
Time/Date precision and extracts
The data engine, which is the underlying mechanism that Tableau uses to create extracts, stores time values with a precision of up to 3 decimal places. If you specify a datetime or timestamp column for Identify new rows using column, and your database uses a higher precision than Tableau, you can end up with duplicate rows after an incremental refresh. For example, if the database has two rows, one with a datetime value of 2015-03-13 17:30:56.502352
and one with a datetime value of 2015-03-13 17:30:56.502852
, Tableau will store both rows using a datetime value of 2015-03-13 17:30:56.502
thereby creating duplicate rows.
See extract history (Tableau Desktop)
You can see a history of when the extract was refreshed by selecting a data source on the Data menu and then select .
The Extract History dialog box shows the date and time for each refresh, whether it was full or incremental, and the number of rows that were added. If the refresh was from a file, it also shows the source file name.