Use Data Roles to Validate your Data

Note: Data source owners and Tableau administrators can add synonyms for specific data field names and values for Ask Data. For information about using data roles for Ask Data, see Add Synonyms for Ask Data in the Tableau Desktop help.

Use data roles to quickly identify whether the values in a field are valid or not. Tableau Prep delivers a standard set of data roles that you can select from or you can create your own using the unique field values in your data set.

When you assign a data role, Tableau Prep compares the standard values defined for the data role with the values in your field. Any values that don't match are marked with a red exclamation mark. You can filter your field to view only the valid or invalid values and take the appropriate actions to fix them. Once you've assigned a data role to your fields, you can use the Group Values option to group and match invalid values to valid ones based on spelling and pronunciation.

Note: Starting in version 2020.4.1, you can create and edit flows in Tableau Server and Tableau Online. The content in this topic applies to all platforms, unless specifically noted. For more information about authoring flows on the web, see Tableau Prep on the Web.

Assign standard data roles to your data

Assign data roles provided by Tableau Prep to your field the same way you assign a data type. The data role identifies what your data values represent so Tableau Prep can automatically validate values and highlight ones that aren't valid for that role.

For example, if you have field values for geographical data, you can assign a data role of City. Tableau Prep then compares the values in the field to a set of known domain values to identify values that don't match.

Note: Each field is analyzed independently so a City value of "Portland" in State "Washington" in Country "USA" might not be a valid city and state combination, but it won't be identified that way because it is a valid city name.
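
The per-field check described above can be sketched in Python. This is only an illustration, not Tableau Prep's implementation: the domain sets, the `validate_field` helper, and the case-insensitive comparison are all assumptions made for the example.

```python
# Hypothetical domain values; the real geographic data is far larger.
KNOWN_CITIES = {"portland", "seattle", "boston"}
KNOWN_STATES = {"washington", "oregon", "massachusetts"}

def validate_field(values, domain):
    """Return (value, is_valid) pairs for one field.

    Comparing case-insensitively is an assumption of this sketch.
    """
    return [(v, v.strip().lower() in domain) for v in values]

cities = ["Portland", "Seatle", "Boston"]
states = ["Washington", "Oregon", "Masachusetts"]

# Each field is validated on its own, so the pairing of
# "Portland" with "Washington" is never examined.
city_results = validate_field(cities, KNOWN_CITIES)
state_results = validate_field(states, KNOWN_STATES)

invalid_cities = [v for v, ok in city_results if not ok]
invalid_states = [v for v, ok in state_results if not ok]
print(invalid_cities)
print(invalid_states)
```

Because the two fields are checked separately, the first row passes both checks even though Portland and Washington may not be a valid combination.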

Tableau Prep Builder provides the following data roles:

  • Email

  • URL

  • Geographic roles (based on current geographic data, the same data used by Tableau Desktop)

    • Airport
    • Area code (U.S.)
    • CBSA/MSA
    • City
    • Congressional District (U.S.)
    • Country/Region
    • County
    • NUTS Europe
    • State/Province
    • Zip code/Postal code

Tip: In Tableau Prep Builder version 2019.1.4 and later and on the web, if you assign a geographic role to a field, you can also use that data role to match and group values with the standard value defined by your data role. For more information about grouping values using data roles, see Clean and Shape Data.

To assign a data role to a field, do the following:

  1. In the Profile pane, Results pane or data grid, click the data type for the field.

  2. Select the data role for the field.

    Tableau Prep compares the field's data values to known domain values or patterns (for email or URL) for the data role you select and marks any values that don't match with a red exclamation point.

  3. Click the drop-down arrow for the field and, in the Show Values section, select an option to show all values, only valid values, or only values that are not valid for the data role.

  4. Use the cleaning options on the More options menu for the field to correct any values that aren't valid. For more information about how to clean your field values, see About cleaning operations.
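
For the Email and URL roles, values are checked against a pattern rather than a list of known values. The following Python sketch shows the general idea of pattern-based validation and of splitting a field into valid and invalid values, as the Show Values filter does. The two regular expressions and the `split_by_validity` helper are deliberately simple, hypothetical stand-ins; the patterns Tableau Prep actually uses are not documented here.

```python
import re

# Hypothetical, simplified patterns for illustration only.
EMAIL_PATTERN = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")
URL_PATTERN = re.compile(r"https?://[^\s/]+\.[^\s/]+(/\S*)?")

def split_by_validity(values, pattern):
    """Split values into (valid, invalid) lists using a full-string match."""
    valid = [v for v in values if pattern.fullmatch(v)]
    invalid = [v for v in values if not pattern.fullmatch(v)]
    return valid, invalid

emails = ["ana@example.com", "bob@example", "carol.example.com"]
urls = ["https://www.tableau.com/products/prep", "www.tableau.com"]

valid_emails, invalid_emails = split_by_validity(emails, EMAIL_PATTERN)
valid_urls, invalid_urls = split_by_validity(urls, URL_PATTERN)
print(invalid_emails)
print(invalid_urls)
```

A stricter or looser pattern would change which values are flagged, so treat the results as an example of the mechanism rather than of the product's exact behavior.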

Create custom data roles

Starting in Tableau Prep Builder version 2019.3.1 and on the web, you can create your own custom data roles using the field values in your data sets to create a standard set of values that you or others can then use to validate fields when cleaning data. Select the field that you want to use, apply any cleaning operations to it if needed, then publish it to Tableau Server or Tableau Online to use it in your flow or share your data roles with others.

If you create custom data roles while editing flows on the web, you can publish the custom data role directly to the server you are signed in to.

Requirements

  • You can create custom data roles from single fields in your data set. Creating custom data roles from a combination of fields isn't supported.
  • You can create custom data roles only for fields assigned a data type of String or Number (whole).
  • When you create a custom data role, Tableau Prep creates an output step in your flow that is specific to publishing the data role.
  • Publishing custom data roles to multiple sites in the same flow isn't supported. If you publish the flow, you must publish the custom data role to the same site or server where the flow is published.
  • Custom data roles are specific to the site, server and project where you publish them. All users with permissions to the location can use the custom data role, but must be signed in to the site or server to select it or apply it. Custom data roles are assigned the default permission for the All Users group to the All Users permissions for new projects instead of None.
  • Custom data roles aren't version specific. When applying a custom data role, the most current version is applied.
  • Once a data role is published to Tableau Server or Tableau Online, users with access to the site, server, and project can view all data roles in that location.
  • To edit a data role, you must make your changes in Tableau Prep Builder or in the flow on the web, then republish the data role using the same name to overwrite it. This process is similar to editing a published data source.
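
Conceptually, a custom data role is the set of unique values from a single String or whole-number field, which other fields are then checked against. The Python sketch below illustrates that idea under stated assumptions: `build_custom_role` and `invalid_values` are hypothetical helpers, exact matching is assumed, and nothing here reflects how Tableau Prep stores or publishes data roles.

```python
def build_custom_role(field_values):
    """Collect the unique values of one field as the role's valid set.

    Only strings and whole numbers are accepted, mirroring the
    String and Number (whole) requirement.
    """
    for v in field_values:
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(v, bool) or not isinstance(v, (str, int)):
            raise TypeError(f"unsupported value type: {type(v).__name__}")
    return set(field_values)

def invalid_values(field_values, role):
    """Return the values that are not part of the role (exact match assumed)."""
    return [v for v in field_values if v not in role]

# Build the role from a cleaned field, then validate another field with it.
role = build_custom_role(["Gold", "Silver", "Bronze", "Gold"])
flagged = invalid_values(["Gold", "Slver", "Bronze"], role)
print(flagged)
```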

Create a custom data role

  1. In the Profile pane, data grid, or Results pane select the field you want to use to create a custom data role.

  2. Click More options for the field, and select Publish as Data Role.

  3. Select the server and project where you want to publish the data role.

  4. Click Run Flow to create the data role. After the publishing process completes successfully, you can view your data role in Tableau Server or Tableau Online. Processing the data role can take some time based on the load on your Tableau Server or Tableau Online site. If your data role isn't available right away, wait a few minutes, then try selecting it again.

Apply a custom data role

  1. In the Profile pane, Results pane or data grid, click the data type for the field where you want to apply the custom data role.

  2. Select Custom, then select the data role that you want to apply to the field.

    Important: In Tableau Prep Builder, make sure you are signed into the site or server where the data role was published or you won't see this option.

    Tableau Prep compares the field's data values to known domain values for the data role you select and marks any values that don't match with a red exclamation point.

  3. Click the drop-down arrow for the field and, in the Show Values section, select an option to show all values, only valid values, or only values that are not valid for the data role.

  4. Use the cleaning options on the More options menu for the field to correct any values that aren't valid. For more information about how to clean your field values, see About cleaning operations.

View and manage custom data roles

You can view and manage your published custom data roles on Tableau Server and Tableau Online. You can view all custom data roles published to your site or server. Click More actions for a selected data role to move it to a different project, change permissions or delete it.

Group similar values by data role

Note: In Tableau Prep Builder version 2019.1.4 and 2019.2.1 this option was labeled Data Role Matches.

If you assign a geographic data role to a field, you can use the values in the data role to group and match values in your data field based on spelling and pronunciation to standardize them. You can use either Spelling or Pronunciation + Spelling to group and match invalid values to valid ones.

These options use the standard value defined by the data role. If the standard value isn't in your data set sample, Tableau Prep adds it automatically and marks the value as not in the original data set. For more information about assigning data roles to fields, see Assign standard data roles to your data.
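
The grouping result can be pictured as follows. This Python sketch assumes the matches between invalid values and standard values are already known; `group_under_standard` and the `in_original` flag are hypothetical names used to show how a standard value that is missing from the data can still head a group while being marked as added.

```python
def group_under_standard(values, matches):
    """Group field values under their standard value.

    `matches` maps an invalid value to the standard value it belongs to.
    Values without a match stay in their own group.
    """
    original = set(values)
    groups = {}
    for v in values:
        standard = matches.get(v, v)
        entry = groups.setdefault(
            standard,
            # Flag whether the standard value appeared in the original data.
            {"members": [], "in_original": standard in original},
        )
        if v not in entry["members"]:
            entry["members"].append(v)
    return groups

values = ["Seatle", "Seattel", "Boston"]
matches = {"Seatle": "Seattle", "Seattel": "Seattle"}
groups = group_under_standard(values, matches)
print(groups)
```

Here "Seattle" never occurs in the field, so its group is flagged with `in_original` set to False, while "Boston" keeps its own group.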

To use data roles to group values, complete the following steps.

  1. In the Profile pane, Results pane or data grid, click the data type for the field.

  2. Select one of the following data roles for the field:

    • Airport
    • City
    • Country/Region
    • County
    • State/Province

    Starting in Tableau Prep Builder version 2019.3.2 and on the web, you can also select from your custom data roles.

    [Images: Standard data roles (version 2019.1.4 and later); Custom data roles (version 2019.3.2 and later)]

    Tableau Prep compares the field's data values to known domain values for the data role you select and marks any values that don't match with a red exclamation point.

  3. Click More options, select Group Values (Group and Replace in previous versions), then select one of the following options:

    • Spelling: Matches invalid values to the closest valid values that differ by adding, removing, or substituting characters.
    • Pronunciation + Spelling: Matches invalid values to the most similar valid value based on spelling and pronunciation.

    You can also click the Recommendations icon on the field to apply the recommendation to group and replace the invalid values with valid ones. This option uses the Pronunciation + Spelling Group Values option.

    Tableau Prep compares the values by spelling or spelling and pronunciation and then groups similar values under the standardized value for the data role. If the standardized value isn't in your data set, the value is added and marked with a red dot.
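
Matching by adding, removing, or substituting characters is commonly measured with edit (Levenshtein) distance. The Python sketch below uses that measure to suggest the closest valid value for each invalid one. It is an assumption that this resembles the Spelling option; the algorithm Tableau Prep uses is not specified here, and `closest_valid`, the `max_distance` threshold, and the case-insensitive comparison are hypothetical choices for the example.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum insertions, deletions, and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # remove a character
                            curr[j - 1] + 1,      # add a character
                            prev[j - 1] + cost))  # substitute (or keep)
        prev = curr
    return prev[-1]

def closest_valid(value, valid_values, max_distance=2):
    """Return the nearest valid value, or None if nothing is close enough."""
    best = min(valid_values,
               key=lambda v: edit_distance(value.lower(), v.lower()))
    if edit_distance(value.lower(), best.lower()) <= max_distance:
        return best
    return None

VALID = ["Seattle", "Boston", "Portland"]
suggestions = {v: closest_valid(v, VALID) for v in ["Seatle", "Bostn", "Xyzzy"]}
print(suggestions)
```

A pronunciation-aware variant would additionally compare a phonetic key for each value; that part is not shown because the phonetic method is not described in this topic.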
