Understand Workbook Impact on Your Database Using Query Tagging
Query Tagging Overview
Query tagging enables administrators to trace the specific Tableau activity that sends queries to their remote SQL databases. Query tagging is enabled by default for data connections using Hyper, Snowflake, and Teradata, and can be enabled for most other activities that generate queries.
Common reasons for using query tagging start when a database administrator notices long-running or failed queries. On servers or sites with multiple workbooks and data sources, databases can often be accessed by different data sources in the same workbook, or different workbooks in a site. Prior to query tags, it was hard to understand which Tableau resource was responsible for which query. With query tagging enabled, events in the source database's logs will be tagged with the following attributes of the workbook, dashboard, or worksheet (view) that caused the query to be generated:
- Workbook, worksheet, or dashboard LUID
(Some queries are not associated with an identifiable workbook, worksheet, or dashboard and may not be tagged with these attributes.) - Currently authenticated user name (Tableau Server only)
- Currently authenticated user LUID
- Site LUID
Not all activities that generate queries are guaranteed to emit tags. For instance, tagging for queries generated by activities of Ask Data, accelerated workbooks, extract refresh, and Tableau Prep features is not currently supported.
In Tableau Server 2023.3 and later:
- Query tagging can be configured using the TSM configuration set command with the
native_api.UserInfoInGeneratedSQLEnabled
(Link opens in a new window)) key. The command enables or disables tagging for all content on a Tableau Server. - Metadata and command queries are tagged
For API 3.9 to API 3.22 (up to Tableau Cloud February 2024 and Server 2023.1):
-
For connections not enabled by default, you can enable query tagging by using the REST API version 3.22 or lower to update the data source connection by setting the
queryTaggingEnabled
attribute of the Update Data Source Connection adding Update Workbook Connection methods.Note: In API 3.23, the
queryTaggingEnabled
attribute is deprecated for those two methods. - The REST API does not allow modifying connections to data sources that require OAuth or multi-factor authentication (for example, BigQuery), so it is not possible to enable query tagging for these connections using the REST API. In Tableau Server 2023.3 and later, query tagging can be enabled for these connections using TSM as described in the following section.
Find the Tableau Origin of Queries to Your Database
Once an issue with SQL queries to your database is identified, enable query tagging to capture metadata for troubleshooting. At a high level the process for investigating an issue might look like:
- IDENTIFY THE ISSUE IN YOUR DATABASE: Your database administrator notices a recurring query event that regularly shows an unexpectedly long run time. They call you, as Tableau administrator, to understand where the query originates and discuss how to make things more efficient.
-
ENABLE QUERY TAGGING FOR YOUR SERVER: To understand how the problem query event is tied to your workbook(s), you first need to understand what connections are making queries to the database.
For Tableau Server 2023.3 and later: You can use TSM to enable query tagging for all content on a server by setting
native_api.UserInfoInGeneratedSQLEnabled
(Link opens in a new window) to true.For API 3.9 to API 3.21 (up to Tableau Cloud February 2024 and Server 2023.1): Use the Tableau REST API to enable query tagging for the right connections. The following section shows one way to do this using a Python script.
Note: This step is not required for data connections using Hyper, Snowflake, or Teradata as query tagging is enabled by default for those data stores.
On Tableau Server, you can choose to disable query tagging for Snowflake or Teradata by using TSM configuration set options to set the value ofResourceLuidsInSnowflakeSessionQueryTagsEnabled
(Link opens in a new window) orResourceLuidsInTeradataSessionQueryBandsEnabled
(Link opens in a new window) tofalse
. -
LOOK FOR THE TABLEAU QUERY ORIGINATOR IN YOUR DATABASE MONITORING: At this point, query tagging is enabled for the connections between your workbook or data source and your database. When you find an instance of your recurring problem query in your database log, history, or analytics, you will see that the related query has a comment string appended beginning with
/* "tableau-query-origins": "..." */
. That string includes the LUID and other attributes listed above from the originating workbook, dashboard, or view.Note, for Snowflake and Teradata the query originator data is not in the SQL text. For Snowflake connections, the query origins information is in the Session Query Tag(Link opens in a new window). For Teradata connections, the query origins information is in the Session Query Band(Link opens in a new window).
- EXAMINE THE ORIGINATING TABLEAU RESOURCE: Now that you know the LUID of the Tableau element that is making the problem query, you can find its name and properties with a request to the get workbook method.
Python Script Example (for API 3.9 through 3.22)
The queryTaggingEnabled
attribute of the
REST API methods described in this script are deprecated in API 3.23 (Tableau Cloud February 2024 / Server 2024.2).
For Tableau Server 2023.3 and later, use TSM to manage query tagging as described in the previous section.
The Python script in this section illustrates how to use the REST API to find and enable query tagging for the connections that use your database’s server address. To find each of these connections, using your Tableau server address, auth token, and site LUID, the script:
Iterates through each data source in the site using query data sources(Link opens in a new window). You could also choose to iterate through each workbook using query workbooks, instead.
Where workbooks on a site share data sources, choose using data source connections to find which resource originates a query. To include embedded data sources, use workbook connections.
- For each data source, iterates through each of its connections using query data source connections. If you queried workbooks, then use query workbook connections.
- For each connection that uses your database as its serverAddress, enable query tagging for those connections by
setting
queryTaggingEnabled=true
in a request to update data source connection or update workbook connection.
import requests # Contains methods used to make HTTP requests
import xml.etree.ElementTree as ET # Contains methods used to build and parse XML
import sys
import math
import getpass
VERSION = 3.9 # up to 3.22
xmlns = {'t': 'http://tableau.com/api'}
class ApiCallError(Exception):
pass
class UserDefinedFieldError(Exception):
pass
def _encode_for_display(text):
"""
Returns an ASCII-encoded version of the text.
"""
return text.encode('ascii', errors="backslashreplace").decode('utf-8')
def _check_status(server_response, success_code):
if server_response.status_code != success_code:
parsed_response = ET.fromstring(server_response.text)
error_element = parsed_response.find('t:error', namespaces=xmlns)
summary_element = parsed_response.find('.//t:summary', namespaces=xmlns)
detail_element = parsed_response.find('.//t:detail', namespaces=xmlns)
code = error_element.get('code', 'unknown') if error_element is not None else 'unknown code'
summary = summary_element.text if summary_element is not None else 'unknown summary'
detail = detail_element.text if detail_element is not None else 'unknown detail'
error_message = '{0}: {1} - {2}'.format(code, summary, detail)
raise ApiCallError(error_message)
return
def get_datasource_connections(server, auth_token, site_id, datasource_id):
url = server + "/api/{0}/sites/{1}/datasources/{2}/connections?fields=id,serverAddress".format(VERSION, site_id, datasource_id)
server_response = requests.get(url, headers={'x-tableau-auth': auth_token})
_check_status(server_response, 200)
xml_response = ET.fromstring(_encode_for_display(server_response.text))
return xml_response.findall('.//t:connection', namespaces=xmlns)
def get_datasource_and_connection_ids(server, auth_token, site_id):
connected_server_address = "your_database_.test.tsi.lan"
print("get ids")
url = server + "/api/{0}/sites/{1}/datasources?fields=id".format(VERSION, site_id)
server_response = requests.get(url, headers={'x-tableau-auth': auth_token})
_check_status(server_response, 200)
xml_response = ET.fromstring(_encode_for_display(server_response.text))
id_pairs = []
datasources = xml_response.findall('.//t:datasource', namespaces=xmlns)
for datasource in datasources:
connections = get_datasource_connections(server, auth_token, site_id, datasource.get('id'))
for connection in connections:
if connection.get('serverAddress') == connected_server_address:
id_pair = ( datasource.get('id'), connection.get('id') )
id_pairs.append(id_pair)
print(id_pair)
return id_pairs
def update_datasource_connection(server, auth_token, site_id, datasource_id, connection_id):
xml_request = ET.Element('tsRequest')
connection_element = ET.SubElement(xml_request, 'connection', queryTaggingEnabled=true)
xml_request = ET.tostring(xml_request)
url = server + "/api/{0}/sites/{1}/datasources/{2}/connections/{3}".format(VERSION, site_id, datasource_id, connection_id)
server_response = requests.put(url, data=xml_request, headers={ 'x-tableau-auth': auth_token })
_check_status(server_response, 200)
def main():
server = "http://your_tableau_server"
auth_token = "54RO8RefTAuU5LlfJ0dhyQ|siwMGk8Ad5woJK7Jsor2Z8Mj7CYmGTFO"
site_id = "b27009a5-7b4f-44d1-ba9e-4ad9ae22b97d"
datasource_and_connection_ids = get_datasource_and_connection_ids(server, auth_token, site_id)
for datasource_and_connection_id in datasource_and_connection_ids:
update_datasource_connection(datasource_and_connection_id[0], datasource_and_connection_id[1])