Add More Data in the Input Step

After you connect to your data sources and begin to build your flow, you may want to refresh your data connections as new data comes in. You can also join or union data sets in the Input step to make working with larger data sources more efficient.

Refresh data in the Input step

If data changes in your input files or tables after you begin working with your flow, you can refresh the Input step to bring in the new data.

File input step types

To refresh file Input steps, do one of the following:

  • In the flow pane on the top menu, click the Refresh button to refresh all Input steps. To refresh a single Input step, click the drop-down arrow next to the Refresh button and select the Input step from the list.

  • In the flow pane, right-click the Input step you want to refresh and select Refresh from the menu.

File, database or Tableau extract input step types

To refresh database or Tableau extract Input steps, do one of the following:

  • Edit the connection.

    Note: To maintain performance, Tableau Prep Builder samples large data sets. If your data is sampled, you may or may not see your new data in the profile pane. You can change the settings for how your data is sampled in the Data Sample tab in the Input step, but it may impact performance. For more information about setting your data sample size, see Set your data sample size.

    1. In the Connections pane, right-click or Ctrl-click (macOS) the data source and select Edit.

    2. Re-establish your connection by signing into the database or re-selecting the file or Tableau extract.

  • Remove and re-add the Input step to the flow.

    1. In the flow pane, right-click the Input step you want to refresh and select Remove from the menu.

      This will temporarily put your flow in an error state.

    2. Connect to the updated file again.

    3. Drag the table to the flow pane on top of the second step in the flow where you want to add the Input step. Drop it on the Add option to reconnect it to the flow.

Union files and database tables in the Input step

When working with multiple files or database tables from a single data source, you can search for files or tables using a wildcard search and then union the data to include all of the file or table data in the Input step. To union files, the files must be in the same parent or child directory.

New files that are added to the same folder that match the pattern are automatically included in the union the next time you open the flow or run it from the command line. Packaged flow files (.tflx) won't automatically pick up new files because the files are already packaged with the flow. To include new files for packaged flows, open the flow file (.tfl) to pick up the new files, then repackage the flow to include the new file data.

To union database tables, the tables must be in the same database and the database connection must support using a wildcard search to union. The following databases support this type of union:

  • Amazon Redshift

  • Microsoft SQL Server

  • MySQL

  • Oracle

  • PostgreSQL

If you add or remove files or tables after you create the union, you can refresh the Input step to update your flow with the new or changed data.

Note: Currently, this feature applies only to Excel and .csv (text) files and data tables stored in the specific databases listed above. This option is not available for Tableau data extracts.

Wildcard union for files is available in Tableau Prep Builder version 2018.1.2 and later. Wildcard union for database tables is available in Tableau Prep Builder version 2018.3.1 and later. Editing a flow connection with this type of union in a prior version can result in errors.

If you need to union data from different data sources, you can do that using a Union step. For more information about creating Union steps, see Union your data.

Union files

By default, Tableau Prep Builder unions all .csv files in the same directory as the .csv file you connected to or all the sheets in the Excel file you connected to. If you use Data Interpreter to clean Excel files and are using Tableau Prep Builder version 2018.1.2 or later, you can use the wildcard search to union and add any sub-tables that Data Interpreter found.

If you want to change the default union, use the following criteria to find the files or sheets you want to include in the union:

  • Search in: Select the directory to use to search for files. Select the Include subfolders check box to include files in the sub-directory of the parent folder.

  • Files: Select whether to include or exclude the files that match the wildcard search criteria.

  • Matching Pattern (xxx*): Enter a wildcard search pattern to find files that have those characters in the file name. For example, if you enter ord*, all files that include "ord" in the file name are returned. Leave this field blank to include all of the files in the specified directory.
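
The three criteria above can be sketched in Python to show how they interact. This is only an illustration, not how Tableau Prep Builder is implemented: the function name find_union_files and its parameters are hypothetical, and the sketch uses Python's fnmatch rules, where ord* means "starts with ord". The product's own matching may be broader.

```python
from fnmatch import fnmatchcase
from pathlib import Path


def find_union_files(search_in, pattern="", include=True,
                     include_subfolders=False, suffix=".csv"):
    """Return the files under search_in that the wildcard criteria select.

    search_in          -- directory to search (the "Search in" option)
    pattern            -- wildcard pattern such as "ord*"; blank matches all
    include            -- True to keep matching files, False to exclude them
    include_subfolders -- also search directories below search_in
    """
    root = Path(search_in)
    glob = root.rglob if include_subfolders else root.glob
    selected = []
    for path in glob("*" + suffix):
        # A blank pattern selects every file in the directory.
        matches = pattern == "" or fnmatchcase(path.name.lower(), pattern.lower())
        if matches == include:
            selected.append(path.relative_to(root).as_posix())
    return sorted(selected)
```

For a folder that holds orders_east.csv, orders_west.csv, and returns.csv, calling find_union_files(folder, "ord*") returns the two orders files, and adding include=False returns only returns.csv.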

To union files in the Input step, do the following:

  1. Click the Add connection button and under Connect, click Text File for .csv files or Microsoft Excel for Excel files, and then select a file to open.

  2. In the Input pane, select the Multiple Files tab, and then select Wildcard union.

    The example below shows a wildcard union using a matching pattern. The plus sign on the file icon on the Orders_Central Input step in the Flow pane indicates that this step includes a wildcard union. The files in the union are listed under Included files.

  3. Use the Search in, Files, and Matching Pattern options to find the files that you want to union.

  4. Click Apply to union the files.

When you add a new step to the flow, you can see all the files added to the data set in the File Paths field in the Profile pane. This field is added automatically.
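
To make the effect of a file union concrete, here is a minimal Python sketch that stacks rows from several .csv files and records where each row came from. The function union_csv_files is hypothetical. Storing only the file name in File Paths and filling fields that a file lacks with None are assumptions made for the illustration, not a statement of what Tableau Prep Builder writes.

```python
import csv
from pathlib import Path


def union_csv_files(paths):
    """Stack the rows of several CSV files and tag each row with its source."""
    rows, fields = [], []
    for path in paths:
        with open(path, newline="", encoding="utf-8") as handle:
            reader = csv.DictReader(handle)
            # Collect every field name once, in first-seen order.
            for name in reader.fieldnames or []:
                if name not in fields:
                    fields.append(name)
            for row in reader:
                row["File Paths"] = Path(path).name  # assumed: file name only
                rows.append(row)
    fields.append("File Paths")
    # Fields that a file does not contain are filled with None.
    return [{name: row.get(name) for name in fields} for row in rows]
```

When the files do not share the same field names, the result contains gaps like these, which is the situation that merging fields after a union addresses.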

Union database tables (version 2018.3.1 and later)

  1. Click the Add connection button and under Connect, connect to a database that supports wildcard union.

  2. Drag a table to the flow pane.

  3. In the Input pane, select the Multiple Tables tab, and then select Wildcard union.

  4. Use the Search, Tables, and Matching Pattern options to find the tables that you want to union.

    Only tables that display in the Connections pane in the Tables section can be included in the union. Wildcard search doesn't search across schemas or across the database connection to find tables.

  5. Click Apply to union the table data.

    When you add a new step to the flow, you can see all the tables added to the data set in the Table Names field in the Profile pane. This field is added automatically.
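
A table union of this kind is conceptually close to a UNION ALL over every table whose name matches the pattern. The sketch below shows the idea in Python with an in-memory SQLite database, chosen only because it needs no server. SQLite is not one of the databases listed above, and the helper union_tables is hypothetical. It assumes the matching tables have the same columns.

```python
import sqlite3
from fnmatch import fnmatchcase


def union_tables(conn, pattern):
    """Union all rows from tables whose names match a wildcard pattern."""
    names = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")
    tables = [name for (name,) in names
              if fnmatchcase(name.lower(), pattern.lower())]
    if not tables:
        return []
    # One SELECT per table; the bound parameter records the source table.
    # Table names come from the database catalog, so quoting them is enough.
    selects = [f'SELECT *, ? AS "Table Names" FROM "{table}"' for table in tables]
    return conn.execute(" UNION ALL ".join(selects), tables).fetchall()
```

Only tables in the connected database are considered, which mirrors the restriction that the wildcard search does not look across schemas or connections.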

Merge fields after a union

After you create a union in the Input step, you might want to merge fields. You can do this in any subsequent step, except for the Input or Output steps. For more information, see Additional merge field options.

Join data in the Input step (version 2019.1.3 and later)

When you connect to databases whose tables include relationship data, Tableau Prep Builder can detect and show which fields in a table are unique identifiers and which are related fields, along with the names of the related tables.

A new column called Linked Keys appears in the Input pane and shows the following relationships if they exist:

  • Unique identifier. This field uniquely identifies each row in the table. There can be multiple unique identifiers in a table. The values in the fields must be unique and cannot be blank or null.

  • Related field. This field relates the table to another table in the database. There can be multiple related fields in a table.

  • Both unique identifier and related field. The field is a unique identifier in this table and also relates the table to another table in the database.

You can leverage these relationships to quickly find and add the related tables to your flow or create joins from the Input step. This feature is available for any supported database connector where table relationships are defined.
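
The three Linked Keys categories correspond to metadata that many databases expose. As a rough illustration, the Python sketch below reads that metadata from SQLite and classifies each key field. The function linked_keys is hypothetical, SQLite stands in for a supported connector, and treating primary key columns as the only unique identifiers is a simplifying assumption.

```python
import sqlite3


def linked_keys(conn, table):
    """Map each key field of a table to (role, related table or None)."""
    # PRAGMA table_info rows: (cid, name, type, notnull, dflt_value, pk)
    unique = {row[1] for row in conn.execute(f'PRAGMA table_info("{table}")')
              if row[5] > 0}
    # PRAGMA foreign_key_list rows: (id, seq, table, from, to, ...)
    related = {row[3]: row[2]
               for row in conn.execute(f'PRAGMA foreign_key_list("{table}")')}
    result = {}
    for field in sorted(unique | set(related)):
        if field in unique and field in related:
            role = "Both unique identifier and related field"
        elif field in unique:
            role = "Unique identifier"
        else:
            role = "Related field"
        result[field] = (role, related.get(field))
    return result
```

For an orders table with a primary key order_id and a customer_id column that references a customers table, the result marks order_id as a unique identifier and customer_id as a related field pointing to customers.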

  1. Connect to a database (such as Microsoft SQL Server) that contains relationship data for fields, such as unique identifiers or related fields (foreign key).

  2. In the Input pane, click on a field that is marked as a related field or as both a unique identifier and related field.

    A dialog opens that shows a list of related tables.

  3. Hover on the table that you want to add or join and click the plus button to add the table to your flow, or click the join button to create a join with the selected table.

    If you create a join, Tableau Prep Builder uses the defined field relationship to join the tables and shows you a preview of the join clauses that it will use to create the join.

  4. Alternatively, you can join related tables from the menu in the Flow pane. Hover over a step until the plus icon appears, then select Add Join to see a list of related tables. Tableau Prep Builder creates the join based on the fields that make up the relationship between the two tables.

    Note: If your table doesn't have table relationships defined, this option is not available.
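
The join clauses that come from a defined relationship can be pictured as one equality per foreign key column. The following Python sketch builds those clauses from SQLite metadata and runs the join. The helpers join_clauses and join_related are hypothetical. The sketch assumes an inner join and foreign keys that name their target column explicitly, and it does not claim to reproduce the join that Tableau Prep Builder generates.

```python
import sqlite3


def join_clauses(conn, table, related_table):
    """List the equality clauses implied by foreign keys to related_table."""
    # PRAGMA foreign_key_list rows: (id, seq, table, from, to, ...)
    return [
        f'"{table}"."{row[3]}" = "{related_table}"."{row[4]}"'
        for row in conn.execute(f'PRAGMA foreign_key_list("{table}")')
        if row[2] == related_table
    ]


def join_related(conn, table, related_table):
    """Inner-join two tables on their defined relationship."""
    clauses = join_clauses(conn, table, related_table)
    if not clauses:
        # Mirrors the note above: no defined relationship, no automatic join.
        raise ValueError("no relationship defined between the tables")
    query = (f'SELECT * FROM "{table}" JOIN "{related_table}" ON '
             + " AND ".join(clauses))
    return conn.execute(query).fetchall()
```

Calling join_clauses first gives a preview of the conditions before any rows are read, similar in spirit to reviewing the join clauses before you create the join.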

For more information about working with joins, see Join your data.
