Enable Tableau Catalog

Tableau Catalog discovers and indexes all of the content on your Tableau Online site or Tableau Server, including workbooks, data sources, sheets, and flows. Indexing gathers information about the content (metadata), such as its schema and lineage. From that metadata, Catalog identifies all of the databases, files, and tables used by the content on your Tableau Online site or Tableau Server.

Catalog is available with the Data Management Add-on. For more information, see Use the Data Management Add-on.

In addition to Catalog, metadata about your content can also be accessed from both the Tableau Metadata API and the Tableau Server REST API using Metadata Methods.

Catalog on Tableau Online

Catalog is automatically enabled when Tableau Online is licensed with the Data Management Add-on.

After your Tableau Online site has been licensed with the Data Management Add-on, the content that already exists on your Tableau Online site is immediately indexed. The time it takes to index the content depends on the amount of content you have. After the content is initially indexed, Catalog monitors newly published content and other changes to assets and continues to index in the background.

Catalog on Tableau Server

As a Tableau Server admin, there are a few things that you need to consider before and while enabling Catalog to ensure optimal performance of Catalog in your Tableau Server environment.

Before enabling Catalog

When Catalog is turned on for the first time, the content that already exists on your Tableau Server is immediately indexed. Catalog uses the non-interactive microservices container to index all the content on Tableau Server. The indexing process consists of two primary parts: initial ingestion and event monitoring.

The time it takes Catalog to index the content depends on a couple of factors:

  • Amount of content on Tableau Server: The amount of content is measured by the total number of workbooks, published data sources, and flows published to Tableau Server.
  • Resources available to the non-interactive container: Specifically, the number of threads, processes, and memory available to support the non-interactive microservices container. For information about the non-interactive microservices container, see the Tableau Server Microservice Container topic.

After understanding the factors that impact Catalog ingestion, there might be some adjustments you need to make before enabling Catalog. The recommended process and adjustments to support enabling Catalog are described below.

Step 1: Determine the amount of content on Tableau Server

To determine the amount of content on Tableau Server, do the following:

  1. Sign in to Tableau Server using your admin credentials.
  2. Go to the Explore page.
  3. Click the Top-Level Project drop-down menu and add the numbers next to All Workbooks, All Data Sources, and All Flows together. This is the total amount of content on Tableau Server.

Step 2: Estimate how long ingestion will take

To estimate the time it will take Catalog to ingest content on Tableau Server for the first time, compare your Tableau Server setup to a baseline Tableau Server setup.

For a Tableau Server with the following setup, initial ingestion could take about 6 hours to complete.

Components Baseline values
Content 12,000 workbooks, published data sources, and flows
Threads 2
JVM heap size (memory) 64 MB (minimum) - 4 GB (maximum)
Ingestion ~6 hours

If you have roughly half the content in your Tableau Server environment, initial ingestion might take half the time to complete.

For example: 6,000 (workbooks, published data sources, and flows) with 2 (threads) = ~3 hours (ingestion)

If you have roughly double the content in your Tableau Server environment, initial ingestion might take double the time to complete.

For example: 24,000 (workbooks, published data sources, and flows) with 2 (threads) = ~12 hours (ingestion)

Step 3 (optional): Decrease the time of ingestion and increase memory

To speed up initial ingestion, you can increase the number of threads allocated to the non-interactive microservices container. When you increase the number of threads, you must also increase memory.

Threads

As a general rule, the time it takes for Catalog to perform initial ingestion is inversely proportional to the number of allocated threads. Therefore, if you want to cut the estimated ingestion time in half, double the allocated threads.
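The scaling rules described in Step 2 and in this section can be sketched as a small calculation. This is a rough illustration only: it assumes ingestion time scales linearly with content and inversely with threads, anchored to the baseline setup above (12,000 items, 2 threads, about 6 hours). The function and constant names are hypothetical and are not part of any Tableau tool.

```python
# Baseline figures taken from the table in Step 2.
BASELINE_ITEMS = 12_000   # workbooks + published data sources + flows
BASELINE_THREADS = 2
BASELINE_HOURS = 6.0


def estimate_ingestion_hours(items, threads=BASELINE_THREADS):
    """Rough estimate of initial ingestion time, in hours.

    Assumes time grows linearly with the amount of content and shrinks
    in proportion to the number of allocated threads. Real ingestion
    time also depends on memory and repository load.
    """
    if items < 0:
        raise ValueError("items must not be negative")
    if threads < 1:
        raise ValueError("threads must be at least 1")
    content_factor = items / BASELINE_ITEMS
    thread_factor = BASELINE_THREADS / threads
    return BASELINE_HOURS * content_factor * thread_factor


if __name__ == "__main__":
    # Total content is the sum of the three counts from Step 1
    # (the counts below are made-up sample values).
    total = 3_500 + 2_000 + 500
    print(estimate_ingestion_hours(total))             # 2 threads
    print(estimate_ingestion_hours(total, threads=4))  # doubled threads
```

Treat the result as a planning figure rather than a guarantee, and monitor actual progress as described later in this topic.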

To increase threads, use the tsm configuration set -k graphletingestor.providerEventIngestorClient.connectionPool.maxConnectionPerInstance command. For more information, see the tsm configuration and tsm configuration set Options topics.

Memory

When increasing the number of threads allocated to ingestion, you should also increase the JVM heap size (memory) to support the services associated with Catalog. Tableau recommends adding no more than 2 GB of memory per additional thread.

To increase memory, use the tsm configuration set -k noninteractive.vmopts command. For more information, see the tsm configuration and tsm configuration set Options topics.

Important: Before increasing threads and memory, consider the following:

  • The recommendations above are for total number of threads for all nodes, not per node or per instance of the non-interactive container.
  • Tableau recommends that you increase thread count progressively, by only 2 threads at a time, while closely monitoring your Tableau Server environment. This helps avoid issues with CPU utilization of the Tableau Server repository (PostgreSQL database).
  • Be aware that when too many threads are allocated to initial ingestion, CPU utilization of the PostgreSQL database might spike and cause a failover. Symptoms to watch for include SQLException errors in the vizportal logs. For more information, see the Repository Failover topic.
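The guidance on threads and memory can be turned into a simple ramp-up plan. The sketch below assumes you start from the baseline setup (2 threads, 4 GB maximum heap), raise the total thread count by at most 2 per step, and treat 2 GB per additional thread as an upper bound on added heap. The function name and defaults are hypothetical, and the output is a planning aid only, not a value to pass directly to tsm.

```python
def plan_thread_ramp(target_threads, start_threads=2, start_heap_gb=4.0):
    """Return a list of (total_threads, max_heap_gb) steps.

    Each step raises the total thread count by at most 2. The heap
    figure is a ceiling: no more than 2 GB added per additional thread
    relative to the starting configuration.
    """
    if target_threads < start_threads:
        raise ValueError("target_threads must be >= start_threads")
    steps = []
    threads = start_threads
    while threads < target_threads:
        # Step up by 2 threads, without overshooting the target.
        threads = min(threads + 2, target_threads)
        added_threads = threads - start_threads
        steps.append((threads, start_heap_gb + 2.0 * added_threads))
    return steps


if __name__ == "__main__":
    for threads, heap_gb in plan_thread_ramp(8):
        print(f"{threads} threads, heap ceiling {heap_gb} GB")
```

Between steps, check repository CPU utilization and the vizportal logs before moving on to the next increase.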

Step 4: Activate the Data Management Add-on

If not already done, after the necessary adjustments have been made to support optimal Catalog ingestion, you can activate the Data Management Add-on. For more information, see License the Data Management Add-on.

Step 5 (optional): Turn off Catalog capabilities

As part of the Data Management Add-on activation, Catalog capabilities are turned on by default. Because of the indexing process and the estimated time it takes to complete, you might consider temporarily turning off Catalog so that Tableau Server users can't access Catalog capabilities until they are ready and able to provide complete and accurate results.

To turn off Catalog, follow the procedure below.

  1. Sign in to Tableau Server using your Tableau Server admin credentials.
  2. From the left navigation pane, click Settings.
  3. On the General tab, under Tableau Catalog, clear the Turn on Tableau Catalog check box.

Enabling Catalog

To enable Catalog on Tableau Server, follow the procedures described below.

Step 1: Run the tsm maintenance metadata-services enable command

Run the tsm maintenance metadata-services enable command to enable the Tableau Metadata API. After the Metadata API is enabled, Catalog is automatically turned on when Tableau Server is licensed with the Data Management Add-on. For more information about running the tsm command, see tsm maintenance.

  1. Open a command prompt as an admin on the initial node (where TSM is installed) in the cluster.
  2. Run the command: tsm maintenance metadata-services enable

Notes: When running this command, keep the following points in mind:

  • This command stops and starts some services used by Tableau Server, which causes certain functionality, such as the Recommendations capability, to be temporarily unavailable.
  • A new index of metadata is created at this time. Each subsequent run of this command creates a new index that replaces the previous one.

Step 2: Monitor ingestion progress

To ensure that the initial ingestion process is going smoothly, you can monitor its progress through Tableau Server using the set of procedures below.

Step 1: Get the port number of the non-interactive microservices container

  1. Open a command prompt as admin on the initial node (where TSM is installed) in the cluster.
  2. Run the following command to get the port number for the non-interactive microservices container: tsm topology list-ports
  3. In the results, find noninteractive:primary and make note of the port number in the last column. The port number will be used in the Step 3 section below.

Step 2: Get authentication cookies from browser

  1. Open a browser like Google Chrome and sign in to Tableau Server using your Tableau Server admin credentials.
  2. Using the Developer tools option (or something similar), go to the Cookies section, and make note of the values for the following cookies: XSRF-TOKEN and workgroup_session_id.

    Do not close the browser window.

Step 3: Retrieve ingestion status

Using the same browser window from Step 2 above, copy the following URI and paste it into the browser's address bar:

http://<your-server>:<port>/relationship-service-war/control/secondaryIndexing/shortcutBackfillComplete

  • Replace <your-server> with your Tableau Server computer name
  • Replace <port> with the port number you noted at the end of Step 1 above

For example:

http://10.100.0.0/relationship-service-war/control/secondaryIndexing/shortcutBackfillComplete

Note: Alternatively, you can retrieve ingestion status using Postman by forming an HTTP GET request using the URI above and the following required keys in the request header:

Key Value
Cookie XSRF-TOKEN=<cookie-value>; workgroup_session_id=<cookie-value>
Content-Type application/json

Step 4: Review results for ingestion status

The endpoint from Step 3 above returns true or false.

  • true indicates ingestion is complete.
  • false indicates ingestion is still in progress.
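Steps 3 and 4 can also be scripted instead of using a browser or Postman. The sketch below builds the same GET request with the two cookies from Step 2 and interprets a true or false response body. It uses only the Python standard library; the function names, the server name, and the port shown in the usage comment are hypothetical placeholders, and the request is not sent unless you uncomment the last lines.

```python
from urllib.request import Request

STATUS_PATH = (
    "/relationship-service-war/control/"
    "secondaryIndexing/shortcutBackfillComplete"
)


def build_status_request(server, port, xsrf_token, session_id):
    """Build the GET request described in Step 3 (does not send it)."""
    url = f"http://{server}:{port}{STATUS_PATH}"
    headers = {
        # Cookie values are the ones noted in Step 2.
        "Cookie": f"XSRF-TOKEN={xsrf_token}; workgroup_session_id={session_id}",
        "Content-Type": "application/json",
    }
    return Request(url, headers=headers, method="GET")


def ingestion_complete(body):
    """Interpret the endpoint's response body: 'true' or 'false'."""
    text = body.strip().lower()
    if text == "true":
        return True
    if text == "false":
        return False
    raise ValueError(f"unexpected response body: {body!r}")


# Example usage (placeholder values; uncomment to send the request):
# from urllib.request import urlopen
# req = build_status_request("your-server", 8000, "<cookie-value>", "<cookie-value>")
# with urlopen(req) as resp:
#     print(ingestion_complete(resp.read().decode("utf-8")))
```

Because the cookies belong to an admin session, avoid storing them in scripts or logs.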

Step 5 (optional): Retrieve ingestion status by content and asset type

Using the same browser window from Step 2 above, copy the following URI and paste it into the browser's address bar:

http://<your-server>:<port>/relationship-service-war/control/backfill/status

  • Replace <your-server> with your Tableau Server computer name
  • Replace <port> with the port number you noted at the end of Step 1 above

For example: 

http://10.100.0.0/relationship-service-war/control/backfill/status

Step 6 (optional): Review results for ingestion status by content and asset type

The endpoint from Step 5 (optional) above returns a JSON blob that delineates ingestion status by content and asset type. When reviewing the results, note the following:

  • A backfillComplete status that shows true indicates initial ingestion is complete
  • A backfillComplete status that shows false indicates initial ingestion is still in progress

For example:

[
{"type":"PublishedDatasource","currentId":{"contentId":null,"pageToken":null},"processedCount":0,"durationSeconds":0,"backfillComplete":true},
{"type":"Database","currentId":{"contentId":null,"pageToken":null},"processedCount":0,"durationSeconds":0,"backfillComplete":true},
{"type":"DatabaseTable","currentId":{"contentId":null,"pageToken":null},"processedCount":0,"durationSeconds":0,"backfillComplete":true},
{"type":"Workbook","currentId":{"contentId":null,"pageToken":null},"processedCount":0,"durationSeconds":0,"backfillComplete":true},
{"type":"Flow","currentId":{"contentId":null,"pageToken":null},"processedCount":0,"durationSeconds":0,"backfillComplete":true}
]
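A response like the one above can be checked programmatically to see which content and asset types are still being ingested. The following is a minimal sketch that assumes the response has the same shape as the example (a JSON array of objects with type and backfillComplete fields). The function name is hypothetical.

```python
import json


def pending_backfill_types(status_json):
    """Return the types whose backfillComplete status is not true."""
    entries = json.loads(status_json)
    return [
        entry["type"]
        for entry in entries
        # A missing flag is treated as "still in progress".
        if not entry.get("backfillComplete", False)
    ]


if __name__ == "__main__":
    sample = (
        '[{"type":"Workbook","processedCount":0,"backfillComplete":false},'
        '{"type":"Flow","processedCount":0,"backfillComplete":true}]'
    )
    print(pending_backfill_types(sample))
```

An empty list means every type in the response reports that initial ingestion is complete.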

Step 3: Configure SMTP Setup

If not already set up for Tableau Server, configure SMTP Setup. SMTP supports sending email to owners who need to be contacted about changes to data. For more information about configuring SMTP, see Configure SMTP Setup.

Step 4 (optional): Turn on Catalog capabilities

If you turned off Catalog capabilities before enabling Catalog in one of the procedures above, you must turn on Catalog to make its capabilities accessible to your users.

To turn on Catalog, follow the procedure below.

  1. Sign in to Tableau Server using your Tableau Server admin credentials.
  2. From the left navigation pane, click Settings.
  3. On the General tab, under Tableau Catalog, select the Turn on Tableau Catalog check box.

Troubleshoot Catalog or the Metadata API

Timeout limit and node limit exceeded messages

To ensure that Catalog tasks or Metadata API queries that return a large number of results don’t take up all Tableau Server system resources, Catalog implements both timeout and node limits.

Timeout limit

When tasks in Catalog or queries in the Metadata API reach the timeout limit, you and your users see the following message:

“Showing partial results, Request time limit exceeded. Try again later.” or TIME_LIMIT_EXCEEDED

To resolve this issue, as a Tableau Server admin, you can increase the timeout limit using the tsm configuration set -k metadata.query.limits.time command. For more information, see the tsm configuration and tsm configuration set Options topics.

Important: Increasing the timeout limit can utilize more CPU for longer, which can affect the performance of other processes on Tableau Server.

Node limit

When tasks in Catalog or queries in the Metadata API reach the node limit, you and your users see the following message:

NODE_LIMIT_EXCEEDED

To resolve this issue, as a Tableau Server admin, you can increase the node limit using the tsm configuration set -k metadata.query.limits.count command. For more information, see the tsm configuration and tsm configuration set Options topics.

Important: Increasing the node limit can affect system memory.

Indexing performance

If Catalog is taking longer than expected to index the content on Tableau Server, consider increasing the number of threads allocated to the non-interactive microservices container. You can temporarily disable the Metadata API by running the tsm maintenance metadata-services disable command, and then follow the process described in the Enable Tableau Catalog section above.

Disable Tableau Catalog

As a Tableau Server admin or Tableau Online site admin, you can turn off Catalog capabilities at any time. When Catalog is turned off, the features of Catalog, such as adding data quality warnings or the ability to explicitly manage permissions to database and table assets, are not accessible through Tableau Online or Tableau Server itself. However, Catalog continues to index published content and the metadata is accessible from the Tableau Metadata API and metadata methods in the Tableau Server REST API.

  1. Sign in to Tableau Online as a site admin or Tableau Server as a server admin.

  2. From the left navigation pane, click Settings.

  3. On the General tab, under Tableau Catalog, clear the Turn on Tableau Catalog check box.

Stop indexing metadata on Tableau Server

To stop indexing the published content on Tableau Server, as a Tableau Server admin, you can disable the Tableau Metadata API. To disable the Metadata API, run the tsm maintenance metadata-services disable command. For more information, see tsm maintenance.

Note: Indexing cannot be stopped for a Tableau Online site.
