Manage Data Connect

Data Connect operates under a shared responsibility model. Customers supply the physical or virtual compute resources, and Tableau hosts and manages the Data Connect Kubernetes cluster on those resources. Tableau reduces administrative overhead by remotely managing, monitoring, and maintaining the Kubernetes cluster. Tableau is responsible for operating the Data Connect service securely, and customers are responsible for managing the infrastructure and networking layers.

Managing Data Connect nodes

As part of the shared responsibility model, Tableau troubleshoots the health of Bridge clients that are deployed on Data Connect nodes. You are responsible for the health of the nodes themselves: keeping the computers up to date and healthy, and managing network connectivity. To support node management, Tableau provides alerts in Tableau Cloud when node action is required.

Node licenses

Data Connect is licensed by the node and is available for purchase by customers who are on the Tableau Cloud Enterprise or Tableau+ editions. A node can belong to only one cluster at a time. Learn more at About Data Connect.

Node licenses are shared across all sites on a Tableau Cloud Manager tenant. This allows you to share clusters across different sites without having to pay for separate node licenses. If you see that Data Connect node licenses are consumed before you have set up a cluster, this means that another site on your tenant has already set up Data Connect nodes. Consider sharing Data Connect nodes with other sites on your tenant to reduce overall costs.
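The arithmetic behind shared licensing can be sketched as follows. This is an illustration only: the function name, the site names, and the license counts are hypothetical, and it does not call any Tableau API.

```python
def remaining_node_licenses(purchased: int, nodes_by_site: dict[str, int]) -> int:
    """Return the licenses left on a tenant after counting every site's nodes.

    Licenses are shared tenant-wide, so nodes set up by any site
    count against the same total.
    """
    used = sum(nodes_by_site.values())
    return purchased - used


# Hypothetical tenant with 4 licenses: the "sales" site already runs 3 nodes,
# so the "finance" site sees only 1 license left before it sets up anything.
print(remaining_node_licenses(4, {"sales": 3, "finance": 0}))
```

A site that finds licenses already consumed can ask the owning site to share its cluster instead of purchasing more nodes.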

Monitoring node health

If a node becomes unhealthy or loses connectivity with Tableau Cloud, details are provided to Tableau Cloud site admins.

Node utilization percentages are provided for near real-time monitoring of your deployment. Values are refreshed every 15 seconds; refresh your browser to see the updated values. When a node computer surpasses 90% CPU utilization or 80% memory load, an alert appears in Tableau Cloud.
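If you also monitor node computers with your own tooling, the same thresholds can be mirrored locally. The sketch below is a minimal illustration: the 90% and 80% values come from the text above, while the function and constant names are hypothetical and the alerting logic inside Tableau Cloud may differ.

```python
CPU_ALERT_THRESHOLD = 90.0     # percent CPU utilization, from the text above
MEMORY_ALERT_THRESHOLD = 80.0  # percent memory load, from the text above


def node_alerts(cpu_percent: float, memory_percent: float) -> list[str]:
    """Return which resources have surpassed their alert threshold."""
    alerts = []
    if cpu_percent > CPU_ALERT_THRESHOLD:
        alerts.append("cpu")
    if memory_percent > MEMORY_ALERT_THRESHOLD:
        alerts.append("memory")
    return alerts


# A node at 95% CPU and 50% memory load would raise only a CPU alert.
print(node_alerts(95.0, 50.0))
```

The comparison is strict because the text says alerts surface when utilization surpasses the threshold; this is an interpretation, not documented behavior.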

For more information on troubleshooting node issues, see Troubleshooting service initialization and health.

Adding and removing nodes from a cluster

Nodes can be added to and removed from clusters to accommodate different usage patterns. However, each node is intended to serve a single cluster at a time and is not intended to move dynamically between clusters. To add or remove a node, you must be a Site Admin on the cluster owner site. See Add a node from an existing cluster and Remove a node from an existing cluster.

Managing node capacity

As Data Connect workloads grow, you may run into the limits of the infrastructure you initially provided. There are two ways to expand Data Connect service capacity. The first is to increase the size of the computer you are using for your Data Connect node. The second is to increase the number of nodes in the cluster.

Increasing CPU on a machine will particularly benefit the throughput of live queries and embedded data source extracts on a node. Increasing memory will particularly benefit the performance of published data source extracts on a node.

Increasing the number of nodes within a cluster will increase the throughput of the system overall.
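The guidance above can be summarized as a small lookup. This is a sketch only: the workload labels and function name are hypothetical, and real capacity planning should follow the whitepaper rather than this mapping.

```python
# Hypothetical workload labels mapped to the scaling guidance in this section.
SCALING_GUIDANCE = {
    "live_query": "increase CPU on the node",
    "embedded_extract": "increase CPU on the node",
    "published_extract": "increase memory on the node",
}


def suggest_scaling(workload: str) -> str:
    """Suggest a scaling action for the workload that is under pressure.

    Anything not listed falls back to adding nodes, which raises
    the overall throughput of the cluster.
    """
    return SCALING_GUIDANCE.get(workload, "add nodes to the cluster")


print(suggest_scaling("published_extract"))
```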

For more information about capacity planning, download the whitepaper, Accessing Your Private Network Data with Tableau Cloud.

Making changes to existing nodes

While Data Connect is actively deployed, you may need to make operational changes to your node infrastructure. This can include adding memory or CPU to your infrastructure, rotating a node to a new machine, or performing maintenance on a node.

Changing the node identity (hostname or IP address) or otherwise impacting network availability will result in downtime of the Data Connect node.

If you are changing the node identity (hostname or IP address), you must delete the node within Tableau Cloud before making your changes, and then onboard it as a new node after the changes are complete.

Example Scenario #1: Add resources to an existing computer without requiring restart

Some cloud providers allow you to add resources (CPU or memory) to an existing machine without restarting the computer. In this scenario, no Data Connect changes are required; the service uses the additional resources without any action on your part.

Example Scenario #2: Making changes that require computer restart but do not impact networking or machine identity

Computer maintenance or adding resources (CPU or memory) to an existing computer often requires a restart of the computer itself but does not impact computer identity or networking. In these cases, the node becomes disconnected from Tableau Cloud during its restart. After the computer restarts, the Data Connect service on the computer also needs to start. After this process is complete (typically less than 1 hour), the node shows as “Available” within Tableau Cloud. No other action should be required to enable Data Connect.

If you unexpectedly change the computer identity, please see Example Scenario #3.

If you unexpectedly impact networking of the computer during maintenance, see Troubleshooting service initialization and health.

Example Scenario #3: Making changes that require a change to node identity (hostname or IP address)

In this scenario, you must first disconnect the node from Tableau Cloud by deleting the node before making any changes. After the node is disconnected, conduct the required maintenance, and add the node back to the cluster.

Example Scenario #4: Making changes that impact networking

This scenario is most common if another team at your organization or a third party hosts the computer for you. If they conduct maintenance on the computer and it impacts the computer's networking, the node becomes unavailable within Tableau Cloud. To restore the node, fix the networking issue and, in some cases, restart the computer. For more details on the appropriate networking configurations, see Troubleshooting service initialization and health.
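The four example scenarios can be read as a decision procedure keyed on what a planned change affects. The sketch below is one way to express it; the class, field names, and step wording are hypothetical, and it only restates the scenarios rather than automating anything in Tableau Cloud.

```python
from dataclasses import dataclass


@dataclass
class PlannedChange:
    changes_identity: bool    # hostname or IP address will change
    impacts_networking: bool  # network availability will be affected
    requires_restart: bool    # the computer must be restarted


def required_steps(change: PlannedChange) -> list[str]:
    """Map a planned change to the steps described in the scenarios above."""
    if change.changes_identity:  # Scenario 3
        return [
            "delete the node in Tableau Cloud",
            "perform the maintenance",
            "add the node back to the cluster as a new node",
        ]
    if change.impacts_networking:  # Scenario 4
        return [
            "resolve the networking issue",
            "restart the computer if needed",
        ]
    if change.requires_restart:  # Scenario 2
        return [
            "restart the computer",
            "wait for the Data Connect service to start",
            "confirm the node shows as Available in Tableau Cloud",
        ]
    return []  # Scenario 1: no Data Connect action required


print(required_steps(PlannedChange(False, False, True)))
```

Identity changes are checked first because they require deleting the node before any work begins, regardless of whether a restart is also involved.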

Sharing Data Connect clusters

Data Connect clusters and their corresponding nodes can be used by all sites on a Tableau Cloud tenant (Tableau Cloud Manager). Sharing infrastructure reduces the cost of running Data Connect across your Tableau Cloud deployment. Sharing clusters allows queries on different sites within the tenant to use the same node infrastructure.

After a cluster has been shared, site admins on the shared sites are responsible for setting up the pools in their site. After the pools have been established, queries sent to the domains specified in the pools begin to use Data Connect.

See (Optional) Step 4: Share clusters across sites.

Roles and responsibilities for shared clusters

Cluster owners and cluster recipients work together to manage shared clusters.

Cluster owners

The site admins of the site that originally configured the Data Connect cluster are referred to as the cluster owners. Cluster owners are responsible for ensuring the health of the Data Connect nodes on behalf of all sites that use the cluster. Cluster owners are also responsible for configuring the pools on the site that owns the cluster, as well as monitoring the health of individual queries of that site. However, configuring pools and monitoring the health of individual queries on all other sites will be the responsibility of the sites those pools are configured on.

Cluster recipients

The site admins of the sites that did not originally configure the Data Connect cluster are referred to as the cluster recipients. Cluster recipients do not have visibility into node health and are unable to take action on node health from their site. Cluster recipient site admins are only responsible for establishing pools and monitoring individual query health.

Any communication about node health for cross-site purposes must take place outside of Tableau Cloud.

Shared cluster responsibility summary

Responsibility | Cluster owner | Cluster recipient
Setup | Cluster setup; networking configuration; install drivers for all sites; add base images for all recipient sites | Provide pool ID and driver requirements to cluster owner
Enable | Share the cluster |
Pool configuration | Managed by site admins on owning site | Managed by site admins on receiving site
Monitoring | Cluster owner monitors Data Connect health at cluster level |
Query health | Monitors on own site | Monitors on own site

Troubleshooting service initialization and health

Data Connect uses a shared responsibility model to deploy Tableau Bridge within your environment and keep private network data up-to-date. This troubleshooting section covers issues that may arise while setting up Data Connect and issues that cause existing clusters and nodes to become unavailable.

If you experience initialization issues, verify the following connectivity and access:

  • Data Connect infrastructure, cluster, and container require networking access to the orchestration provider services (#2 in the image above) and to Tableau Cloud (outbound only, #5).

  • Data Connect infrastructure, cluster, container, and Agent require networking access to your database (#6).

See Networking specifications.
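A basic outbound TCP check run from the node computer can help confirm the connectivity listed above. The sketch below uses only the Python standard library; the function names are hypothetical, and the hosts and ports you pass in must come from Networking specifications and your own database configuration. A successful TCP connection does not prove that proxies, TLS inspection, or authentication are configured correctly.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if an outbound TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False


def check_targets(targets: dict[str, tuple[str, int]]) -> dict[str, bool]:
    """Check each named (host, port) target and report reachability."""
    return {name: can_connect(host, port) for name, (host, port) in targets.items()}
```

For example, calling check_targets with an entry such as "database" mapped to your database host and port returns a dictionary showing which targets were reachable from the node.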

Monitoring Data Connect query health

Data Connect deploys and manages Bridge clients. Individual Bridge clients are responsible for the queries sent to your private network data. To learn more, see Architecture.

There are several monitoring solutions available in Tableau Cloud to help you manage different monitoring scenarios for individual queries handled by Data Connect. All of these solutions are available to all admins. Admins can make Admin Insights and Activity Log available to non-admin users.

Real-time monitoring

Historical analysis

  • Admin Insights: Monitor job status and performance using Tableau Cloud data sources that are published in your environment. This data is updated daily and shows historical data for longer term analysis. It can be shared with non-admin users if desired. See Use Admin Insights to Create Custom Views.

  • Activity Log: Monitor job status and performance using log data that can be stored as long as necessary for your analysis. This data can be shared with non-admin users if desired. Available for Tableau Cloud Enterprise and Tableau+ customers only. See Activity Log.

For shared Data Connect clusters, monitoring of Data Connect queries is managed at the site level. Site admins can only monitor queries that use the pools in their own sites.

For a full description of monitoring individual jobs through various user scenarios, download the whitepaper, Accessing Your Private Network Data with Tableau Cloud.

For information related to troubleshooting individual errors, see Troubleshooting Errors from Individual Queries.