Use Data Roles to Validate your Data

Use data roles to quickly identify whether the values in a field are valid. Tableau Prep Builder provides a standard set of data roles that you can select from, or you can create your own using the unique field values in your data set.

When you assign a data role, Tableau Prep Builder compares the standard values defined for the data role with the values in your field. Any values that don't match are marked with a red exclamation point. You can filter your field to view only the valid or invalid values and take the appropriate actions to fix them. Once you've assigned a data role to a field, you can use the Group and Replace option to group and match invalid values to valid ones based on spelling and pronunciation.

Assign standard data roles to your data

Assign data roles provided by Tableau Prep Builder to your field the same way you assign a data type. The data role identifies what your data values represent so Tableau Prep Builder can automatically validate values and highlight ones that aren't valid for that role.

For example, if you have field values for geographical data, you can assign a data role of City, and Tableau Prep Builder compares the values in the field to a set of known domain values to identify values that don't match.

Note: Each field is analyzed independently. A City value of "Portland" in State "Washington" in Country "USA" might not be a valid city and state combination, but it won't be flagged as invalid because "Portland" is a valid city name.
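The per-field comparison described above can be sketched in a few lines of Python. This is only an illustration of the idea, not Tableau Prep Builder code: the `KNOWN_VALUES` table, the `invalid_values` function, and the sample rows are all hypothetical, and the domain sets are tiny stand-ins for the real geographic data.

```python
# Illustrative only: not Tableau Prep Builder code. The role names map to
# tiny hypothetical samples of known domain values.
KNOWN_VALUES = {
    "City": {"Portland", "Seattle", "Spokane"},
    "State/Province": {"Oregon", "Washington"},
}


def invalid_values(role, values):
    """Return the values that are not in the role's known domain set."""
    domain = KNOWN_VALUES[role]
    return [v for v in values if v not in domain]


rows = [
    {"City": "Portland", "State": "Washington"},
    {"City": "Seatle", "State": "Washington"},
]

# Each field is checked on its own, so the Portland/Washington row passes
# both checks even though the combination may be wrong.
bad_cities = invalid_values("City", [r["City"] for r in rows])
bad_states = invalid_values("State/Province", [r["State"] for r in rows])
print(bad_cities)  # ['Seatle']
print(bad_states)  # []
```

Only the misspelled "Seatle" is reported. Catching an invalid city and state pair would need a cross-field check, which this kind of single-field validation does not perform.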

Tableau Prep Builder provides the following data roles:

  • Email

  • URL

  • Geographic roles (based on current geographic data, which is the same data used by Tableau Desktop)

    • Airport
    • Area code (U.S.)
    • CBSA/MSA
    • City
    • Congressional District (U.S.)
    • Country/Region
    • County
    • NUTS Europe
    • State/Province
    • Zip code/Postal code

Tip: In Tableau Prep Builder version 2019.1.4 and later, if you assign a geographic role to a field, you can also use that data role to match and group values with the standard value defined by your data role. For more information about grouping values using data roles, see Clean and Shape Data.

To assign a data role to a field, do the following:

  1. In the Profile pane, Results pane or data grid, click the data type for the field.

  2. Select the data role for the field.

    Tableau Prep Builder compares the field's data values to known domain values or patterns (for email or URL) for the data role you select and marks any values that don't match with a red exclamation point.

  3. Click the drop-down arrow for the field and, in the Show Values section, select an option to show all values or only the values that are valid or not valid for the data role.

  4. Use the cleaning options on the More options menu for the field to correct any values that aren't valid. For more information about how to clean your field values, see About cleaning operations.
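For the Email and URL roles, the steps above describe matching against a pattern rather than a list of known values, followed by filtering on validity. A minimal Python sketch of that idea follows. The two regular expressions are deliberately simple assumptions and are not the patterns Tableau Prep Builder uses; `show_values` is a hypothetical helper that mimics the Show Values filter.

```python
import re

# Hypothetical, simplified patterns. Real email and URL rules are stricter.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")
URL_PATTERN = re.compile(r"^https?://[^\s/]+(/\S*)?$")


def show_values(values, pattern, mode="all"):
    """Mimic the Show Values filter: 'all', 'valid', or 'not valid'."""
    if mode == "all":
        return list(values)
    want_valid = mode == "valid"
    return [v for v in values if bool(pattern.match(v)) == want_valid]


emails = ["ana@example.com", "bob(at)example.com", "cy@example"]
valid_emails = show_values(emails, EMAIL_PATTERN, "valid")
invalid_emails = show_values(emails, EMAIL_PATTERN, "not valid")

urls = ["https://example.com/path", "example.com"]
invalid_urls = show_values(urls, URL_PATTERN, "not valid")
```

Here `invalid_emails` holds the two values you would then correct with a cleaning operation, and `invalid_urls` holds the URL that lacks a scheme.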

Create custom data roles (version 2019.3.1 and later)

You can create your own custom data roles using the field values in your data sets to create a standard set of values that you or others can use to validate field values when cleaning data. Select the field that you want to use, apply any cleaning operations to it if needed, then publish it to Tableau Server or Tableau Online to use it in your flow or share it with others.

Before creating a custom data role, review the following:

  • You can create custom data roles from single fields in your data set. Creating custom data roles from a combination of fields isn't supported.
  • You can create custom data roles only for fields with a data type of String or Number (whole).
  • When you create a custom data role, Tableau Prep Builder creates an output step in your flow that is specific to publishing the data role.
  • Publishing custom data roles to multiple sites in the same flow isn't supported. If you publish the flow, you must publish the custom data role to the same site or server where the flow is published.
  • Custom data roles are specific to the site, server, and project where you publish them. All users with permissions to the location can use the custom data role, but they must be signed in to the site or server to select or apply it. For custom data roles, the default permission for the All Users group is set to the All Users permissions for new projects instead of None.
  • Custom data roles aren't version specific. When applying a custom data role, the most current version is applied.
  • Once a data role is published to Tableau Server or Tableau Online, users with access to the site, server, and project can view all data roles in that location.
    • Users with appropriate permissions can move, delete or edit permissions for the data roles.
    • The permissions you can set and actions you can take on a custom data role are similar to what you can do with a flow. For more information, see Manage a Flow. For more information on setting permissions, see Permission capabilities in the Tableau Server help.
  • To edit a data role, you must make your changes in Tableau Prep Builder, then republish the data role using the same name to overwrite it, as you would to edit a published data source.
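Several of the rules above (a single field, String or whole Number values only, and republishing under the same name to overwrite) can be modeled with a short Python sketch. Everything here is hypothetical: `published_roles` is an in-memory stand-in for a server, and `build_role_values` and `publish_role` are illustrative names, not part of any Tableau API.

```python
# Hypothetical in-memory stand-in for a server; not a Tableau API.
published_roles = {}


def build_role_values(field_values):
    """Collect the unique values of a single field as the role's valid set."""
    for v in field_values:
        # Only String and Number (whole) values qualify. bool is excluded
        # explicitly because Python treats it as a subclass of int.
        if isinstance(v, bool) or not isinstance(v, (str, int)):
            raise TypeError(f"unsupported value for a custom data role: {v!r}")
    return set(field_values)


def publish_role(site, project, name, field_values):
    """Publishing under an existing name overwrites the earlier role."""
    published_roles[(site, project, name)] = build_role_values(field_values)


publish_role("site-a", "Sales", "Region", ["East", "West", "East"])
# Republish with the same name to edit the role.
publish_role("site-a", "Sales", "Region", ["East", "West", "Central"])
current = published_roles[("site-a", "Sales", "Region")]
```

Because the role is keyed by site, project, and name, only one version exists at a time, which mirrors the point that applying a custom data role always uses the most current version.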

Create a custom data role

  1. In the Profile pane, data grid, or Results pane, select the field you want to use to create a custom data role.

  2. Click More options for the field, and select Publish as Data Role.

  3. Select the server and project where you want to publish the data role.

  4. Click Run Flow to create the data role. After the publishing process completes successfully, you can view your data role in Tableau Server or Tableau Online. Processing the data role can take some time based on the load on your Tableau Server or Tableau Online site. If your data role isn't available right away, wait a few minutes, then try selecting it again.

Apply a custom data role

  1. In the Profile pane, Results pane or data grid, click the data type for the field where you want to apply the custom data role.

  2. Select Custom, then select the data role that you want to apply to the field.

    Important: Make sure you are signed in to the site or server where the data role was published, or you won't see this option.

    Tableau Prep Builder compares the field's data values to known domain values for the data role you select and marks any values that don't match with a red exclamation point.

  3. Click the drop-down arrow for the field and, in the Show Values section, select an option to show all values or only the values that are valid or not valid for the data role.

  4. Use the cleaning options on the More options menu for the field to correct any values that aren't valid. For more information about how to clean your field values, see About cleaning operations.

View and manage custom data roles

You can view and manage your published custom data roles on Tableau Server and Tableau Online. You can view all custom data roles published to your site or server. Click More actions for a selected data role to move it to a different project, change permissions or delete it.

Group similar values by data role

Note: In Tableau Prep Builder version 2019.1.4 and 2019.2.1 this option was labeled Data Role Matches.

If you assign a geographic data role to a field, you can use the values in the data role to group and match values in your data field based on spelling and pronunciation to standardize them. In Tableau Prep Builder version 2019.2.3, you can use either Spelling or Pronunciation + Spelling to group and match invalid values to valid ones.

These options use the standard value defined by the data role. If the standard value isn't in your data set sample, Tableau Prep Builder adds it automatically and marks the value as not in the original data set. For more information about assigning data roles to fields, see Assign standard data roles to your data.

To use data roles to group values, complete the following steps.

  1. In the Profile pane, Results pane or data grid, click the data type for the field.

  2. Select one of the following data roles for the field:

    • Airport
    • City
    • Country/Region
    • County
    • State/Province

    In Tableau Prep Builder version 2019.3.2 and later, you can also select from your custom data roles.

    Availability: standard data roles (version 2019.1.4 and later); custom data roles (version 2019.3.2 and later).

    Tableau Prep Builder compares the field's data values to known domain values for the data role you select and marks any values that don't match with a red exclamation point.

  3. Click More options, select Group and Replace, then select one of the following options:

    • Spelling: Matches invalid values to the closest valid values that differ by adding, removing, or substituting characters.
    • Pronunciation + Spelling: Matches invalid values to the most similar valid value based on spelling and pronunciation.

      Note: In Tableau Prep Builder version 2019.1.4 or 2019.2.1, this option was called Data Role Matches.

    You can also click the Recommendations icon on the field to apply the recommendation to group and replace the invalid values with valid ones. This option uses the Pronunciation + Spelling Group and Replace option.

    Tableau Prep Builder compares the values by spelling or spelling and pronunciation and then groups similar values under the standardized value for the data role. If the standardized value isn't in your data set, the value is added and marked with a red dot.
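To make the two Group and Replace options more concrete, here is a rough Python sketch of the general idea. It is not Tableau Prep Builder's algorithm, which this article does not describe. As stand-ins, the sketch uses Levenshtein edit distance for spelling and the classic Soundex code for pronunciation. The function names, the `max_distance` threshold, and the sample values are all assumptions made for illustration.

```python
# Stand-in techniques, not Tableau's actual matching logic.
SOUNDEX_CODES = {
    letter: digit
    for digit, letters in {"1": "bfpv", "2": "cgjkqsxz", "3": "dt",
                           "4": "l", "5": "mn", "6": "r"}.items()
    for letter in letters
}


def edit_distance(a, b):
    """Levenshtein distance: fewest insertions, deletions, substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def soundex(word):
    """Classic four-character Soundex code, used here as a phonetic key."""
    letters = [c for c in word.lower() if c.isalpha()]
    if not letters:
        return ""
    result = letters[0].upper()
    prev = SOUNDEX_CODES.get(letters[0], "")
    for c in letters[1:]:
        code = SOUNDEX_CODES.get(c, "")
        if code and code != prev:
            result += code
        if c not in "hw":  # h and w do not separate repeated codes
            prev = code
    return (result + "000")[:4]


def group_and_replace(values, valid_values, use_pronunciation=False,
                      max_distance=2):
    """Map each invalid value to a valid one, or to itself if none is close."""
    mapping = {}
    for value in values:
        if value in valid_values:
            mapping[value] = value
            continue
        candidates, limit = valid_values, max_distance
        if use_pronunciation:
            same_sound = [v for v in valid_values
                          if soundex(v) == soundex(value)]
            if same_sound:  # a phonetic match is accepted at any distance
                candidates, limit = same_sound, None
        best = min(candidates,
                   key=lambda v: edit_distance(value.lower(), v.lower()))
        close = limit is None or edit_distance(value.lower(), best.lower()) <= limit
        mapping[value] = best if close else value
    return mapping


valid = ["Portland", "Seattle", "Spokane"]
values = ["Seatle", "Seeattel", "Spokane", "Xyz"]
by_spelling = group_and_replace(values, valid)
by_sound = group_and_replace(values, valid, use_pronunciation=True)
# Standard values that were not in the original data (the "red dot" case).
added = sorted(set(by_sound.values()) - set(values))
```

In this sample, spelling alone maps "Seatle" to "Seattle" but leaves "Seeattel" untouched because it is three edits away, while the pronunciation pass groups both under "Seattle". The `added` list shows that "Seattle" was not in the original values, which corresponds to the standardized value being added and marked.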
