Use Lineage for Impact Analysis
Knowing where your data comes from is key to trusting the data, and knowing who else uses it means you can analyze the impact of changes to data in your environment. The lineage feature in Tableau Catalog helps you do both these things.
Lineage requires the Data Management Add-on. Starting in 2019.3, Tableau Catalog is available in the Data Management Add-on to Tableau Online and Tableau Server. When Tableau Catalog is enabled in your environment, you have access to lineage for your data sources, metrics, flows, databases, and tables. For more information about Tableau Catalog, see "About Tableau Catalog" in the Tableau Server or Tableau Online Help.
How you navigate to the Lineage pane depends on what kind of asset you're working with.
To see the lineage for Tableau content such as data sources or flows, from Explore, navigate to and open the content asset, and then select the Lineage tab.
To see lineage for external assets, such as databases and tables, from External Assets, navigate to and select an asset from the list. When you select a table, for example, a page opens with information about that table, including the name, type, description, columns, and details about each column. To the right of the table information is the Lineage pane, which shows the lineage for that table.
Lineage shows dependencies relative to the lineage anchor, which is the selected asset. A lineage anchor can be a database, table, workbook, published data source, or flow. (In the first example, the anchor is the CurrentWorkItem data source, and in the second example, it's the TestResult table.) All the assets below the anchor depend on it, either directly or indirectly; these are the outputs, or downstream assets. The assets above the anchor are the ones the anchor depends on, either directly or indirectly; these are the inputs, or upstream assets.
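To make the upstream and downstream idea concrete, here is a minimal Python sketch that models lineage as a graph walk from the anchor. It is only an illustration of the concept, not how Tableau Catalog computes lineage. The asset names and the CONSUMERS mapping are hypothetical.

```python
from collections import deque

# Hypothetical edges: each asset maps to the assets that consume it directly.
CONSUMERS = {
    "sales_db": ["orders_table"],
    "orders_table": ["orders_datasource", "cleanup_flow"],
    "orders_datasource": ["sales_workbook"],
    "cleanup_flow": [],
    "sales_workbook": [],
}


def reachable(start, edges):
    """Return every asset reachable from start by following edges (breadth-first)."""
    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in edges.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def invert(edges):
    """Flip every edge so that each consumer points back at its direct inputs."""
    flipped = {node: [] for node in edges}
    for src, targets in edges.items():
        for dst in targets:
            flipped.setdefault(dst, []).append(src)
    return flipped


def lineage(anchor, consumers):
    """Split the graph into the anchor's upstream inputs and downstream outputs."""
    return {
        "upstream": reachable(anchor, invert(consumers)),
        "downstream": reachable(anchor, consumers),
    }
```

With "orders_table" as the anchor, the upstream set contains only "sales_db", and the downstream set contains the data source, the flow, and the workbook, including the workbook that depends on the table only indirectly.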
When you select a field in a data source or a column in a table, the lineage is filtered to show only downstream assets that depend on the field (or column) or upstream inputs to the field (or column) as in this 'superstore export' workbook example:
You can select an upstream or downstream asset in the Lineage pane to see its details. For example, when you select Metrics, the list of metrics depending on this workbook appears to the left of the Lineage pane.
From the Lineage pane, you can navigate to any asset related to your initial choice, in this case the workbook, by following the links that interest you.
When an external asset (database, table, or file) is embedded in published Tableau content (workbooks, data sources, and flows), the external asset is used by the content, but is not shareable with other users. That embedded external asset appears in the lineage upstream from its Tableau content asset and is listed in External Assets.
To see if an external asset is embedded, go to the external asset’s detail page and see if “Embedded Asset” is listed under Category.
For information about embedded data, see Publishing data separately or embedded in workbooks in Tableau Desktop and Web Authoring Help.
Lineage and custom SQL connections
When you view the lineage of a connection that uses custom SQL, Catalog doesn't support showing column information. For more information, see Tableau Catalog support for custom SQL in the Tableau Desktop and Web Authoring Help.
Mismatch between lineage count and tab count
You might notice a mismatch in the count of assets between the Tableau Catalog Lineage tool and the tabs in Tableau Server or Tableau Online.
The mismatch occurs because the lineage count and the tab count each count assets in a different way. For example, at any given point in time, Catalog can count only assets that are indexed, whereas Tableau Server or Tableau Online counts any assets that are published. Other reasons for count differences include whether:
- You have "View" permissions for the asset.
- An asset is hidden.
- Any fields are used in a workbook.
- The connection to an asset is direct or indirect.
Workbook count mismatch example
As an example, here's how the tab count vs. the lineage count is determined for workbooks.
The Connected Workbooks tab counts workbooks that meet both of these criteria:
- The workbook connects to the data source (whether or not any fields are actually used in the workbook).
- The user has permission to view the workbook (whether it's a worksheet, dashboard, or story).
Tableau Catalog Lineage counts workbooks that meet all of these criteria:
- The workbook has been indexed by Tableau Catalog.
- The workbook connects to the data source and uses at least one field in the data source.
- The workbook contains worksheets, including dashboards or stories that contain a worksheet, that use at least one field in the data source.
When metadata is blocked because of limited permissions, Catalog still counts the workbook. But instead of seeing some of the sensitive metadata, you see Permissions required. For more information, see Access lineage information.
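The two sets of criteria above can be compared side by side in a short Python sketch. It is a simplified model for illustration only, not Tableau's implementation: the Workbook record and its attributes are hypothetical, and the two field-usage criteria for lineage are collapsed into a single fields_used check.

```python
from dataclasses import dataclass


@dataclass
class Workbook:
    name: str
    connected: bool    # connects to the data source
    fields_used: int   # data source fields used by its worksheets (assumed attribute)
    indexed: bool      # already indexed by Tableau Catalog
    viewable: bool     # the current user has permission to view it


def tab_count(workbooks):
    """Connected Workbooks tab: connected and viewable; field usage is irrelevant."""
    return sum(1 for wb in workbooks if wb.connected and wb.viewable)


def lineage_count(workbooks):
    """Catalog lineage: indexed, connected, and using at least one field."""
    return sum(
        1 for wb in workbooks
        if wb.indexed and wb.connected and wb.fields_used > 0
    )


WORKBOOKS = [
    Workbook("A", connected=True, fields_used=3, indexed=True, viewable=True),   # both counts
    Workbook("B", connected=True, fields_used=0, indexed=True, viewable=True),   # tab only: no fields used
    Workbook("C", connected=True, fields_used=2, indexed=False, viewable=True),  # tab only: not yet indexed
    Workbook("D", connected=True, fields_used=5, indexed=True, viewable=False),  # lineage only: no view permission
]
```

For these four hypothetical workbooks, tab_count(WORKBOOKS) is 3 and lineage_count(WORKBOOKS) is 2, even though every workbook connects to the same data source. Workbook D is still counted in lineage because limited permissions hide metadata but do not remove the workbook from the count.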
At the end of the lineage is Owners. The list of owners includes anyone assigned as the owner or contact for any content downstream from the lineage anchor.
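As a rough illustration of how such a list could be assembled, the following Python sketch collects each distinct owner or contact from the content downstream of the anchor. The record layout, the asset names, and the user names are hypothetical, and the sketch does not reflect how Tableau Catalog stores this information.

```python
# Hypothetical downstream content, each item with an owner and an optional contact.
DOWNSTREAM = [
    {"name": "orders_datasource", "owner": "ana", "contact": None},
    {"name": "sales_workbook", "owner": "ben", "contact": "ana"},
    {"name": "cleanup_flow", "owner": "chen", "contact": "dee"},
]


def impacted_owners(downstream):
    """Collect each distinct owner or contact once, sorted for stable display."""
    people = set()
    for asset in downstream:
        for role in ("owner", "contact"):
            person = asset.get(role)
            if person:  # skip assets with no contact assigned
                people.add(person)
    return sorted(people)
```

In this example, "ana" appears once in the result even though she owns one asset and is the contact for another.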
You can email owners to let them know about changes to the data. (To email owners, you must have the 'Overwrite' (Save) capability on the lineage anchor content.)
1. Select Owners to see the list of people who are impacted by the data in this lineage.
2. Select the owners you want to send a message to.
3. Click Send Email to open the email message box.
4. Enter the Subject and your message in the text box, and click Send.