Configure Tableau Server for High Availability with Coordination Service-Only Nodes

The Coordination Service is built on Apache ZooKeeper, an open-source project. It coordinates activities on the server, guarantees a quorum in the event of a failure, and serves as the source of "truth" for the server topology, configuration, and state. The service is installed automatically on the initial Tableau Server node, but no further instances are installed when you add nodes. Because Tableau Server depends on a properly functioning Coordination Service, we recommend that for installations of three or more nodes you add instances of the Coordination Service by deploying a new Coordination Service ensemble. This provides redundancy and improves availability if one instance of the Coordination Service has problems.
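
To make the redundancy argument concrete: ZooKeeper stays available only while a strict majority of the ensemble can be reached. The sketch below works out that arithmetic. The function names are hypothetical and chosen for illustration; nothing here calls TSM.

```shell
# Majority-quorum arithmetic for an ensemble of N Coordination Service
# instances. Illustrative helpers only; these are not TSM commands.

# Smallest number of instances that forms a strict majority of N.
quorum_size() {
  echo $(( $1 / 2 + 1 ))
}

# Number of instances that can fail while a majority still remains.
tolerated_failures() {
  echo $(( ($1 - 1) / 2 ))
}
```

With these helpers, tolerated_failures 1 prints 0 and tolerated_failures 3 prints 1, which is why a single instance on the initial node offers no redundancy while a three-instance ensemble survives the loss of one instance.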

The Coordination Service can generate a large amount of I/O as it communicates with other components of the server. If you are running Tableau Server on computers that meet or only just exceed the minimum hardware requirements, you may want to install Tableau Server in a configuration that uses Coordination Service-only nodes. This means installing the Coordination Service on nodes that run no other server processes, and removing the Coordination Service from any nodes that are running other server processes. This topic explains how to set up that configuration. You can also run the Coordination Service ensemble on the same nodes that run other Tableau Server processes. For details on how to do that, see Deploy a Coordination Service Ensemble.

Prerequisite

Before proceeding with the procedures in this topic, complete the following prerequisites:

Note: This operation includes steps that you may need to perform using the TSM command line. To use the TSM CLI you need administrator access to the command line on one of the nodes in your installation and TSM administrator credentials to run TSM commands.

Deploy an ensemble on Coordination Service-only nodes

One way to accommodate the high I/O impact of the Coordination Service is to deploy an ensemble on nodes that only run the Coordination Service and the Cluster Controller. The following steps illustrate how to deploy a Coordination Service ensemble on an existing multi-node Tableau Server cluster.

Note: For a core-based Tableau Server license, Coordination Service-only nodes do not count against the total count of licensed cores.

  1. Add additional nodes to your cluster.

    See Install and Configure Additional Nodes.

  2. If you added the new nodes using the TSM CLI, you need to configure them with Cluster Controller. This step is not necessary if you added the nodes using the TSM Web UI, because Cluster Controller is added automatically when you add a node with the Web UI.

    On the initial node, open a command prompt as administrator.

  3. Type this command to sign in to Tableau Server as a TSM administrator:

    tsm login -u <username>

    You will be prompted for your password.

  4. From the initial node of the cluster, configure the new nodes with an instance of the Cluster Controller:

    tsm topology set-process -pr clustercontroller -n <node4> -c 1

    tsm topology set-process -pr clustercontroller -n <node5> -c 1

    tsm topology set-process -pr clustercontroller -n <node6> -c 1

  5. Apply the configuration changes. The pending-changes apply command displays a prompt to let you know this will restart Tableau Server if the server is running. The prompt displays even if the server is stopped, but in that case there is no restart. You can suppress the prompt using the --ignore-prompt option, but this does not change the restart behavior. For more information, see tsm pending-changes apply.

    tsm pending-changes apply

    A warning about deploying a Coordination Service ensemble displays because you have deployed a multi-node cluster. If this is the only warning, you can safely override it using the --ignore-warnings option to apply the configuration changes in spite of the warning.

    tsm pending-changes apply --ignore-warnings

  6. Confirm that all nodes are up and running:

    tsm status -v

  7. On the initial node of the cluster, open a terminal session and type this command to stop Tableau Server:

    tsm stop

  8. Get the node IDs for each node in the cluster:

    tsm topology list-nodes -v

  9. Use the tsm topology deploy-coordination-service command to add a new Coordination Service ensemble by adding the Coordination Service to specified nodes. You must specify the node(s) that the Coordination Service should be added to. The command also switches Tableau Server to use the new ensemble.

    For example, deploy the Coordination Service to three nodes of a six-node cluster:

    tsm topology deploy-coordination-service -n <node4,node5,node6>

  10. Wait until the new Coordination Service ensemble is running and the server is ready for the next step.

    Important: If you attempt to clean up the old Coordination Service ensemble before the server is in the proper state, you can put the server into an unrecoverable state and may need to completely reinstall Tableau Server.

    1. Check the status of the server:

      tsm status -v

      If the deployment is not complete, you may see processes showing as running when they are not, and the Coordination Service showing a status of "unavailable" while the service is synchronizing between nodes on the cluster. Tableau Server may show as being in an error state while this is happening. You may also get an error message: "Could not connect to TSM Controller at '<host>:8850'." This is normal when the server is returning to a valid state.

    2. Check the status of the server periodically until you are prompted to sign in again.

    3. When you are prompted, sign in to TSM and continue to check the server status until you see a status of "STOPPED" for each node. If the status of a node shows as "ERROR" you need to wait. When each node status is "STOPPED" you should also see the following services running:

      On the initial node:

      • Two instances of the Coordination Service, both with a status of "running".

      • The Administration Controller with a status of "running". (The Administration Controller is only installed on the initial node.)

      • The Administration Agent with a status of "running".

      • Additional services on the initial node, all with a status of "running": Service Manager, License Manager, Client File Service.

      On the additional nodes:

      • One or more instances of the Coordination Service on each additional node you specified when you deployed the new ensemble, all with a status of "running". If you are deploying a new ensemble to nodes that already had Coordination Service running, you will see two instances of the service.

      • The Administration Agent on every node, with a status of "running".

      If you do not see a status of "running" for all of the above, wait a few minutes and run the status command again.

      Note: If there is a problem with an instance of the Coordination Service (for example, if it shows as stopped), you can toggle back to your previous Coordination Service ensemble using the tsm topology toggle-coordination-service command. To do this, the rest of the services must be in the state described above, including the Administration Controller and Administration Agent. You can toggle back to the previous ensemble only if you have not run the cleanup-coordination-service command. Tableau Server must be stopped when you use this command.

  11. When the new ensemble is running properly, remove the old ensemble. This step is required. You cannot run Tableau Server with multiple Coordination Service ensembles configured.

    tsm topology cleanup-coordination-service

    Tableau Server must be stopped when you use this command.

  12. Start Tableau Server:

    tsm start
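
The wait in step 10 can be partly scripted. The following is a minimal sketch, not an official tool. It assumes that tsm status -v prints one "Status: <STATE>" line per node; the function names and the STATUS_CMD override are hypothetical names chosen for illustration. It covers only the per-node "STOPPED" check, so you still need to confirm the individual services listed in step 10 before you run the cleanup command.

```shell
# Sketch: wait until every node reports STOPPED before moving on to cleanup.
# ASSUMPTION: the status output contains one line per node of the form
# "Status: <STATE>". Adjust the pattern if your version prints it differently.

# Reads status text on stdin. Succeeds only if at least one "Status:" line
# is present and every such line reports STOPPED.
all_nodes_stopped() {
  awk '
    /^[[:space:]]*Status:/ {
      seen++
      if ($2 != "STOPPED") bad++
    }
    END {
      if (seen > 0 && bad == 0) exit 0
      exit 1
    }
  '
}

# Hypothetical helper: poll the status command until all nodes are STOPPED
# or the attempts run out.
#   $1 = maximum number of attempts (default 30)
#   $2 = seconds to sleep between attempts (default 60)
# STATUS_CMD may name another command for a dry run; it defaults to the
# real "tsm status -v". stdin is redirected from /dev/null so that an
# interactive sign-in prompt fails instead of blocking (assumed behavior).
wait_for_stopped() {
  attempts="${1:-30}"
  delay="${2:-60}"
  i=1
  while [ "$i" -le "$attempts" ]; do
    if ${STATUS_CMD:-tsm status -v} </dev/null 2>/dev/null | all_nodes_stopped; then
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  return 1
}
```

For example, after sourcing these functions, wait_for_stopped 30 60 checks once a minute for up to 30 attempts and returns a nonzero exit code if the nodes never all reach "STOPPED". A failed status command, such as one run after your TSM session has expired, counts as "not ready", so sign in again with tsm login if the loop keeps failing. Any node that reports "ERROR" also counts as "not ready", which matches the guidance to keep waiting.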
