Refresh Flow Data Using Incremental Refresh
Note: Starting in version 2020.4.1, you can create and edit flows in Tableau Server and Tableau Cloud. The content in this topic applies to all platforms, unless specifically noted. For more information about authoring flows on the web, see Tableau Prep on the Web in the Tableau Server help.
Starting in Tableau Prep Builder version 2020.2.1 and on the web, you can configure your flow inputs and outputs to refresh incrementally so that only the new rows are retrieved and processed when the flow runs, saving you time and resources.
For example, if your flow includes transaction data that updates daily, you can set up incremental refresh to retrieve and process only the new transactions every day, then run a full refresh weekly or monthly to refresh all of your flow data.
Note: To run incremental refresh on flow inputs that use the Salesforce connector, you must be using Tableau Prep Builder version 2021.1.2 or later. Incremental refresh is not currently supported when writing flow outputs to Microsoft Excel or CRM Analytics.
To run your flow using incremental refresh, Tableau Prep needs the following information:
- The field that detects new rows in the input table.
- The field to use to compare the last processed values in the flow output with the values in the input to determine which rows are new. For more information, see Incremental refresh with Append.
- How you want to write the new data to your tables. You can add new data to your existing tables, overwrite your table data with the new data, or, starting in Tableau Prep Builder version 2020.3.1 and on the web, replace data in an existing table.
Tableau Prep enables you to select how your data is refreshed and how your tables are updated with the flow output. The following table describes the different options and their benefits.
|Refresh Combination||Data Processed||Table Update||Benefits|
|Full Refresh + Create Table||All||Create or overwrite the existing table with the full data set.||Refresh all the data on every flow run.|
|Full Refresh + Append to Table||All||Add new rows to the existing table.||Keep track of both new and existing data on every flow run. Append to table isn't available for .csv output types.|
|Full Refresh + Replace data||All||Replace rows in the existing table.||Maintain your existing table schema structure but replace all the data with every flow run.|
|Incremental Refresh + Create Table||New rows only||Create or overwrite the existing table with only the new rows.||Create a new table with only the new rows as the complete data set. If the output data source doesn't exist or can't be connected to when the flow runs, the flow will fail. A full refresh is necessary to create the output before it can be used incrementally.|
|Incremental Refresh + Append to Table||New rows only||Add the new rows to the existing table.||Add only the new rows to the existing table. Append to table isn't available for .csv output types. See Incremental refresh with Append.|
|Incremental Refresh + Replace data||New rows only||Replace all rows in the existing table with only the new rows.||Maintain your existing table schema structure, but replace all the data with only the new rows, making this your complete data set.|
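As a rough illustration of the combinations in the table above, the following Python sketch models what ends up in the table after one flow run. It is not Tableau Prep code: the function `apply_run`, its parameters, and the option names `"full"`, `"incremental"`, `"create"`, `"append"`, and `"replace"` are hypothetical and exist only for this example. Tables are modeled as plain lists of rows, so the schema-preserving behavior of Replace data is not represented.

```python
def apply_run(existing_rows, all_rows, new_rows, refresh, write_option):
    """Return the table contents after one flow run (illustrative model only).

    existing_rows: rows already in the output table
    all_rows:      every row currently in the flow input
    new_rows:      the subset of all_rows identified as new
    """
    # The refresh type decides which rows the flow processes.
    if refresh == "full":
        processed = list(all_rows)
    elif refresh == "incremental":
        processed = list(new_rows)
    else:
        raise ValueError(f"unknown refresh type: {refresh!r}")

    # The write option decides how the processed rows reach the table.
    if write_option == "append":
        # Existing rows are kept and the processed rows are added after them.
        return list(existing_rows) + processed
    if write_option in ("create", "replace"):
        # The table ends up holding only the processed rows.
        return processed
    raise ValueError(f"unknown write option: {write_option!r}")
```

In this model, a full refresh with append writes the existing rows a second time alongside the new ones, whereas an incremental refresh with append adds only the new rows, which is why that combination suits data that grows daily.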
Configure incremental refresh
To configure your flow to use incremental refresh, you need to specify settings on both the Input steps and the Output steps where you want to use this option. In the Input step, specify how Tableau Prep will find your new rows. In the Output step, specify how the new rows are written to your table. When you run the flow, you can select either a full or incremental refresh type.
Tip: After you configure your input and output steps for incremental refresh, you can preserve your configurations and reuse them. Copy and paste the steps to use them elsewhere in your current flow. In Tableau Prep Builder, you can also use Save Steps as Flow to save the selected steps to a local file or to your server so that you can reuse them in other flows. For more information about copying, pasting, or reusing steps, see Copy steps, actions and fields.
- In the flow pane, select the input step that you want to configure for incremental refresh.
- In the Input pane on the Settings tab, under the Incremental Refresh section (Set up Incremental Refresh in prior versions), set the following options:
Select Enable incremental refresh (Enable in prior versions).
Input field (Identify new rows using field in prior versions): Select the field in your input data that Tableau Prep should use to detect new rows. This field must be assigned a data type of Number (whole), Date, or Date & Time. Currently, you can only select a single field.
Note: You can remove or rename this field later in the flow, as long as the field you specify in the Output field (Field name in output in prior versions) can be used to compare this field with the latest output to find new rows.
Output: Select the output that is related to your input and that includes the field that will be used to compare rows.
Output field (Field name in output in prior versions): Select the field to use to compare the last processed values in the flow output with the values in the input to find new rows. This field must have the same data type as the field you specified in the Input field (Identify new rows using field in prior versions).
Incremental refresh first searches for the existing maximum value of the incremental field in the output. It then filters the rows from the input and adds only the rows whose value in the incremental field is larger than that maximum. For example:
Existing table:
|Col1||Col2|
|ID 5||Row 5|
Append new rows to the table based on col1:
|Col1||Col2|
|ID 1||NewRow1|
|ID 6||NewRow6|
- NewRow1 is not added.
- NewRow6 is added.
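The comparison described above can be sketched in a few lines of Python. This is only a model of the rule, not how Tableau Prep is implemented: the function `find_new_rows` and its parameter names are hypothetical, rows are represented as dictionaries, and the example values `ID 5`, `ID 1`, and `ID 6` are assumed to be the whole numbers 5, 1, and 6. Treating an empty output as "every row is new" is also an assumption made for this sketch.

```python
def find_new_rows(input_rows, output_rows, input_field, output_field):
    """Return the input rows whose incremental value exceeds the output maximum."""
    # Step 1: find the existing maximum of the incremental field in the output.
    last_value = max((row[output_field] for row in output_rows), default=None)

    # Assumption for this sketch: with no existing output rows, every row is new.
    if last_value is None:
        return list(input_rows)

    # Step 2: keep only input rows with a strictly larger value.
    return [row for row in input_rows if row[input_field] > last_value]


existing_output = [{"Col1": 5, "Col2": "Row 5"}]
incoming = [
    {"Col1": 1, "Col2": "NewRow1"},
    {"Col1": 6, "Col2": "NewRow6"},
]
new_rows = find_new_rows(incoming, existing_output, "Col1", "Col1")
```

Here `new_rows` contains only the `NewRow6` row, matching the example: 1 is not larger than the existing maximum of 5, so `NewRow1` is skipped. Because the comparison is strictly "larger than", a late-arriving row whose value equals or falls below the current maximum would not be picked up until the next full refresh.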
To finish setting up incremental refresh, set your output Write Options to specify how the new rows are written to your tables. All outputs that are related to the configured input step have a default write option selected, but you can change it to a supported option.
You can output your rows to a file (Tableau Prep Builder only), a published data source or a database. By default, outputs to local or published .hyper extracts are set to Append to table. Outputs to .csv file types are set to Create table.
In the flow pane, select the output step that you want to configure for incremental refresh.
In the Output pane, in the Write Options section, view the default write option and make any changes as needed.
- Create table: This option creates a new table or replaces the existing table with the new output.
- Append to table: This option adds the new data to your existing table. If the table doesn't already exist, a new table is created when the flow is first run, and subsequent runs add new rows to this table. Not available for .csv output types. For more information about supported refresh combinations, see Flow refresh options.
- Replace data (Tableau Prep Builder version 2020.3.1 and later and on the web): This option is available when you want to write your output back to an existing table in a database. It replaces the data in the database table with the flow data, but maintains the table schema structure.
You can run individual flows using incremental refresh in Tableau Prep Builder, on the web, or from the command line. For information about running your flow from the command line, see Run the flow with incremental refresh enabled.
If you have Data Management with Tableau Prep Conductor enabled, you can run your flow with incremental refresh on a schedule on Tableau Server or Tableau Cloud. For information about running your flow on a schedule, see Schedule Flow Tasks in the Tableau Server help.
Note: In prior versions, write options are set in Tableau Prep Builder and can't be changed when running your flow in Tableau Server or Tableau Cloud. Starting in Tableau Server and Tableau Cloud version 2020.4, you can edit the flow directly on the web. For more information about using Tableau Prep on the web, see Tableau Prep on the Web in the Tableau Server help.
Tableau Prep runs a full refresh for all outputs regardless of the run option you select if no existing output is found. Subsequent flow runs use the incremental refresh process and retrieve and process only the new rows unless incremental refresh configuration data is missing or the existing output is removed.
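The fallback rule in the previous paragraph can be summarized as a small decision function. This is a sketch under stated assumptions, not Tableau Prep's actual logic: the function `effective_refresh` and its two boolean parameters are hypothetical names chosen for this example.

```python
def effective_refresh(requested, output_exists, config_complete):
    """Return the refresh type that is actually used for a run.

    requested:       "full" or "incremental", as chosen when running the flow
    output_exists:   True if an existing output was found
    config_complete: True if the incremental refresh configuration is present
    """
    if requested not in ("full", "incremental"):
        raise ValueError(f"unknown refresh type: {requested!r}")

    # An incremental run needs both an existing output to compare against
    # and the incremental refresh configuration.
    if requested == "incremental" and output_exists and config_complete:
        return "incremental"

    # In every other case the run processes all of the data.
    return "full"
```

For example, the first run of a newly configured flow has no output yet, so `effective_refresh("incremental", False, True)` gives `"full"`. Later runs with the output in place give `"incremental"`.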
To choose the refresh type for a run, open the Run drop-down in one of the following places:
- From the top menu, click the drop-down option on the Run button.
- From the Output pane, click the drop-down option on the Run Flow button.
- From the Flow pane, click the drop-down on the Run button next to the Output step.
If one input with incremental refresh enabled is associated with multiple outputs, those outputs must be run together and must use the same refresh type. When you run your refresh in Tableau Prep, a dialog notifies you that the outputs must be run together.
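The constraint on outputs that share an incremental input can be expressed as a simple check. The following Python sketch is illustrative only: the function `check_run_plan`, the dictionary shapes, and the input and output names used in the usage example are all hypothetical and are not part of Tableau Prep.

```python
from collections import defaultdict


def check_run_plan(output_to_input, run_plan):
    """Return a list of problems with a planned run (an empty list means it is fine).

    output_to_input: maps each output name to the incremental input it depends on
    run_plan:        maps each output selected for this run to its refresh type
    """
    # Group the outputs by the incremental input they share.
    outputs_by_input = defaultdict(list)
    for output, input_name in output_to_input.items():
        outputs_by_input[input_name].append(output)

    problems = []
    for input_name, outputs in outputs_by_input.items():
        selected = [o for o in outputs if o in run_plan]
        if not selected:
            continue  # none of this input's outputs are part of the run

        # Rule 1: outputs that share an input must be run together.
        missing = [o for o in outputs if o not in run_plan]
        if missing:
            problems.append(f"{input_name}: also run {', '.join(missing)}")

        # Rule 2: those outputs must all use the same refresh type.
        if len({run_plan[o] for o in selected}) > 1:
            problems.append(f"{input_name}: outputs use different refresh types")
    return problems


# Hypothetical flow: two outputs fed by the same incremental input.
outputs = {"Sales.hyper": "Orders", "Summary.hyper": "Orders"}
only_one = check_run_plan(outputs, {"Sales.hyper": "incremental"})
```

In this usage example, `only_one` holds a single message saying that `Summary.hyper` must be run as well, which corresponds to the dialog that Tableau Prep shows.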