Refresh Flow Data Using Incremental Refresh
Starting in version 2020.2.1, you can configure your flow inputs and outputs to refresh incrementally so that only the new rows are retrieved and processed when the flow runs, saving you time and resources. For example, if your flow includes transaction data that updates daily, you can set up incremental refresh to retrieve and process only the new transactions every day, then run a full refresh weekly or monthly to refresh all of your flow data.
To run your flow using incremental refresh, Tableau Prep Builder needs the following information:
- The field that detects new rows in the input table.
- The field to use to compare the last processed values in the flow output with the values in the input to determine which rows are new.
- How you want to write the new data to your tables. You can add new data to your existing tables, overwrite your table data with the new data, or, starting in version 2020.3.1, replace data in an existing table.
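The comparison described in the list above can be pictured with a short Python sketch. This is an illustration only, not how Tableau Prep Builder is implemented. The function `find_new_rows` and the field names `order_id` and `txn_id` are hypothetical, and the sketch assumes that a row is "new" when its input field value is greater than the largest value already present in the output field.

```python
from typing import Any


def find_new_rows(input_rows: list[dict[str, Any]],
                  output_rows: list[dict[str, Any]],
                  input_field: str,
                  output_field: str) -> list[dict[str, Any]]:
    """Return the input rows that look new compared with the existing output.

    Assumed rule: a row is new when its input_field value is greater than
    the largest output_field value already written.
    """
    if not output_rows:
        # No existing output to compare against: treat every row as new.
        return list(input_rows)
    last_processed = max(row[output_field] for row in output_rows)
    return [row for row in input_rows if row[input_field] > last_processed]


# Hypothetical data: the output already holds transactions 1 and 2.
existing_output = [{"txn_id": 1}, {"txn_id": 2}]
daily_input = [{"order_id": 1}, {"order_id": 2}, {"order_id": 3}, {"order_id": 4}]

new_rows = find_new_rows(daily_input, existing_output, "order_id", "txn_id")
print(new_rows)  # [{'order_id': 3}, {'order_id': 4}]
```

The input and output fields have different names here to show why both must be specified: the field can be renamed in the flow as long as the two can still be compared.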
Tableau Prep Builder enables you to select how your data is refreshed and how your tables are updated with the flow output. The following table describes the different options and their benefits.
|Refresh Combination||Data Processed||Table Update||Benefits|
|Full Refresh + Create Table||All||Create or overwrite the existing table with the full data set.||Refresh all the data on every flow run.|
|Full Refresh + Append to Table||All||Add new rows to the existing table.||Keep track of both new and existing data on every flow run. Append to table isn't available for .csv output types.|
|Full Refresh + Replace data (version 2020.3.1 and later)||All||Replace rows in the existing table.||Maintain your existing table schema structure but replace all the data with every flow run.|
|Incremental Refresh + Create Table||New rows only||Create or overwrite the existing table with only the new rows.||Create a new table with only the new rows as the complete data set.|
|Incremental Refresh + Append to Table||New rows only||Add the new rows to the existing table.||Add only the new rows to the existing table. Append to table isn't available for .csv output types.|
|Incremental Refresh + Replace data (version 2020.3.1 and later)||New rows only||Replace all rows in the existing table with only the new rows.||Maintain your existing table schema structure, but replace all the data with only the new rows, making this your complete data set.|
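As a rough illustration of how the refresh type and the table update combine, the following Python sketch models a table as a list of rows. The function `update_table` and the option strings `"append"`, `"create"`, and `"replace"` are hypothetical names chosen for this example, not Tableau Prep Builder settings, and the sketch assumes that exactly the processed rows are what gets written.

```python
def update_table(existing_rows, processed_rows, write_option):
    """Return the table contents after a flow run.

    processed_rows holds every row for a full refresh, or only the new
    rows for an incremental refresh.
    """
    if write_option == "append":
        # Append to table: keep what is there and add the processed rows.
        return existing_rows + processed_rows
    if write_option in ("create", "replace"):
        # Create table and Replace data both leave only the processed rows.
        # Replace data also keeps the existing table schema, which a plain
        # list of rows can't show.
        return list(processed_rows)
    raise ValueError(f"unknown write option: {write_option}")


existing = [{"id": 1}, {"id": 2}]
new_only = [{"id": 3}]  # what an incremental refresh would process

appended = update_table(existing, new_only, "append")
recreated = update_table(existing, new_only, "create")
print(appended)   # [{'id': 1}, {'id': 2}, {'id': 3}]
print(recreated)  # [{'id': 3}]
```

The second result shows why Incremental Refresh + Create Table leaves the new rows as the complete data set.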
Configure incremental refresh
To configure your flow to use incremental refresh, you need to specify settings on both the Input steps and the Output steps where you want to use this option. In the Input step, specify how Tableau Prep Builder will find your new rows. In the Output step, specify how the new rows are written to your table. When you run the flow, you can select either a full or incremental refresh type.
Tip: After you configure your input and output steps for incremental refresh, you can preserve your configurations and reuse them. Copy and paste the steps to use them elsewhere in your current flow or use Save Steps as Flow to save the selected steps to a local file or to your server to reuse the steps in other flows. For more information about copying, pasting or reusing steps, see Copy steps, actions and fields.
- In the flow pane, select the input step that you want to configure for incremental refresh.
- In the Input pane on the Settings tab, in the Incremental Refresh section (Set up Incremental Refresh in prior versions), set the following options:
Select Enable incremental refresh (Enable in prior versions).
Input field (Identify new rows using field in prior versions): Select the field in your input data to use to identify new rows. This field must be assigned a data type of Number (whole), Date, or Date & Time. Currently, you can select only a single field.
Note: You can remove or rename this field later in the flow, as long as the field you specify in the Output field (Field name in output in prior versions) can be used to compare this field with the latest output to find new rows.
Output: Select the output that is related to your input and that includes the field that will be used to compare rows.
Output field (Field name in output in prior versions): Select the field to use to compare the last processed values in the flow output with the values in the input to find new rows. This field must have the same data type as the field you specified in the Input field (Identify new rows using field in prior versions).
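The data type rules for the two fields can be expressed as a simple check. This Python sketch is only an illustration of the rules stated above; the function `check_incremental_fields` is a hypothetical name and the type names are plain strings copied from this article.

```python
# Data types that the Input field may use, as listed in this article.
SUPPORTED_TYPES = {"Number (whole)", "Date", "Date & Time"}


def check_incremental_fields(input_type: str, output_type: str) -> list[str]:
    """Return a list of problems with the chosen field types (empty if none)."""
    problems = []
    if input_type not in SUPPORTED_TYPES:
        problems.append(f"input field type {input_type!r} is not supported")
    if input_type != output_type:
        problems.append("input and output fields must have the same data type")
    return problems


print(check_incremental_fields("Date", "Date"))         # []
print(check_incremental_fields("Date", "Date & Time"))  # one problem reported
```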
To finish setting up incremental refresh, set your output Write Options to specify how the new rows are written to your tables. All outputs that are related to the configured input step have a default write option selected, but you can change it to a supported option. Starting in version 2020.3.1, you can write your flow output to a supported database. For more information, see Save flow output data to external databases (version 2020.3.1 and later).
By default, outputs to local or published .hyper extracts are set to Append to table. Outputs to .csv file types are set to Create table.
In the flow pane, select the output step that you want to configure for incremental refresh.
In the Output pane, in the Write Options section, view the default write option and make any changes as needed.
- Create table: This option creates a new table or replaces the existing table with the new output.
- Append to table: This option adds the new data to your existing table. If the table doesn't already exist, a new table is created when the flow is first run, and subsequent runs add new rows to this table. Not available for .csv output types. For more information about supported refresh combinations, see Flow refresh options.
- Replace data (version 2020.3.1 and later): This option is available when you want to write your output back to an existing table in a database. It replaces the data in the database table with the flow data, but maintains the table schema structure.
You can run individual flows using incremental refresh in Tableau Prep Builder or from the command line. For information about running your flow from the command line, see Run the flow with incremental refresh enabled (version 2020.2.1 and later).
If you have Tableau Prep Conductor (part of the Data Management Add-on) enabled on your server, you can run your flow using incremental refresh by setting up a schedule in either Tableau Server or Tableau Online. For information about running your flow on a schedule, see Schedule a Flow Task.
Note: Write options are set in Tableau Prep Builder and can't be changed when running your flow in Tableau Server or Tableau Online.
If no existing output is found, Tableau Prep Builder runs a full refresh for all outputs, regardless of the run option you select. Subsequent flow runs use the incremental refresh process and retrieve and process only the new rows, unless incremental refresh configuration data is missing or the existing output is removed.
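The fallback behavior described above amounts to a small decision rule, sketched here in Python. The function `effective_refresh` and its parameters are hypothetical names used only to restate the rule; they don't correspond to any Tableau Prep Builder setting.

```python
def effective_refresh(requested: str, output_exists: bool, config_complete: bool) -> str:
    """Return the refresh type that actually runs.

    An incremental refresh runs only when it was requested, an existing
    output is found, and the incremental refresh configuration is present.
    In every other case a full refresh runs.
    """
    if requested == "incremental" and output_exists and config_complete:
        return "incremental"
    return "full"


# First run: no output exists yet, so a full refresh runs.
print(effective_refresh("incremental", output_exists=False, config_complete=True))  # full
# Later run with the output and configuration in place.
print(effective_refresh("incremental", output_exists=True, config_complete=True))   # incremental
```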
To run the flow in Tableau Prep Builder, do one of the following:
- From the top menu, click the drop-down option on the Run button.
- From the Output pane, click the drop-down option on the Run Flow button.
- From the Flow pane, click the drop-down option on the Run button next to the Output step.
If one input with incremental refresh enabled is associated with multiple outputs, those outputs must be run together and must use the same refresh type. When you run your refresh in Tableau Prep Builder, a dialog notifies you that you must run the outputs together.
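The constraint on outputs that share an input can be checked before a run, as in the following Python sketch. The function `check_run_plan`, the mapping names, and the sample output names are all hypothetical; the sketch only restates the two rules above (run together, same refresh type).

```python
from collections import defaultdict


def check_run_plan(output_to_input: dict[str, str], run_plan: dict[str, str]) -> list[str]:
    """Return problems found in a planned run (empty list if none).

    output_to_input maps each output to the incremental-enabled input it
    depends on. run_plan maps each output selected for this run to its
    refresh type ("full" or "incremental").
    """
    outputs_by_input = defaultdict(list)
    for output, source in output_to_input.items():
        outputs_by_input[source].append(output)

    problems = []
    for source, outputs in sorted(outputs_by_input.items()):
        selected = [o for o in outputs if o in run_plan]
        if not selected or len(outputs) == 1:
            continue  # nothing selected, or no sibling outputs to conflict with
        if len(selected) != len(outputs):
            problems.append(f"{source}: outputs must be run together")
        elif len({run_plan[o] for o in selected}) > 1:
            problems.append(f"{source}: outputs must use the same refresh type")
    return problems


# Hypothetical flow: two outputs share the "Orders" input.
flow = {"Sales.hyper": "Orders", "Sales.csv": "Orders", "Returns.hyper": "Returns"}

print(check_run_plan(flow, {"Sales.hyper": "incremental"}))
# ['Orders: outputs must be run together']
print(check_run_plan(flow, {"Sales.hyper": "incremental", "Sales.csv": "incremental"}))
# []
```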