Recover from an Initial Node Failure

The first computer you install Tableau on, the "initial node," has some unique characteristics. Two processes run only on the initial node and cannot be moved to any other node except in a failure situation: the License Service (License Manager) and the TSM Controller (Administration Controller). Tableau Server includes a script that automates moving these two processes to one of your other existing nodes so you can regain full access to TSM and keep Tableau Server running.

Two other processes are initially installed on the initial node but can be added or moved to additional nodes: the Client File Service (CFS) and the Coordination Service. Depending on how CFS and the Coordination Service were configured in your installation, you may also need to take steps to redeploy them.

If there is a problem with the initial node and you have redundant processes on your other nodes, Tableau Server can continue to run for up to 72 hours before the lack of the licensing service impacts other processes. Your users can continue to sign in and see and use their content after the initial node fails, but you will not be able to reconfigure Tableau Server because you won't have access to the Administration Controller. This means you should make a point of moving the two unique processes to another of your running nodes as soon as possible. If your initial node fails for reasons that are recoverable in a relatively short amount of time (for example, a hardware failure you can correct), you should first attempt to bring the node back up without using the procedure below.
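
As a rough planning aid, the 72-hour window described above can be turned into a concrete deadline. The following is a minimal sketch, not a Tableau tool: the function names are hypothetical, and because the text says "up to 72 hours," the result should be read as the latest possible time, not a guaranteed one.

```python
from datetime import datetime, timedelta

# Upper bound taken from the paragraph above ("up to 72 hours").
LICENSE_GRACE = timedelta(hours=72)


def recovery_deadline(failed_at: datetime) -> datetime:
    """Latest time by which the License Service should be running elsewhere."""
    return failed_at + LICENSE_GRACE


def hours_remaining(failed_at: datetime, now: datetime) -> float:
    """Hours left in the window, never negative."""
    remaining = recovery_deadline(failed_at) - now
    return max(remaining.total_seconds() / 3600, 0.0)


if __name__ == "__main__":
    failed_at = datetime(2024, 1, 1, 8, 0)   # example failure time
    now = datetime(2024, 1, 2, 8, 0)         # one day later
    print("Deadline:", recovery_deadline(failed_at))
    print("Hours remaining:", hours_remaining(failed_at, now))
```

Record the failure time as soon as you notice the outage so the estimate is based on when the initial node actually stopped, not on when recovery began.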

Note: The steps in this article require server downtime, can be disruptive, and should only be used in the event of a catastrophic failure of the initial node. If you are unable to get your initial node running again, use the following steps to move key TSM processes to another node in your cluster.

General requirements

  • If the initial node was running the only instance of the Client File Service (CFS), you need to add that process to another node. Tableau Server requires at least one instance of CFS. For more information, see Configure Client File Service.
  • As part of setting up a multi-node Tableau Server installation, you should have deployed a Coordination Service ensemble. The procedure below assumes that a Coordination Service ensemble was deployed before the problem with the initial node occurred. For more information about deploying a Coordination Service ensemble, see Deploy a Coordination Service Ensemble.

Note: This operation includes steps that you may need to perform using the TSM command line. To use the TSM CLI you need administrator access to the command line on one of the nodes in your installation and TSM administrator credentials to run TSM commands.

Move the TSM Controller and License Service to another node

If there is a problem with the initial node, the TSM Controller and the Licensing Service need to be started on another node. Follow these steps to use the provided move-tsm-controller script and get the Controller and Licensing Service working on another node.

  1. On a node that is still working, run the Controller recovery script. To do this, open a command prompt, navigate to the Tableau Server scripts directory (by default, C:\Program Files\Tableau\Tableau Server\packages\scripts.<version_code>\), and type the following command:

    move-tsm-controller -n <nodeID>

    where "nodeID" is the ID for the node you want the TSM Controller to run on. For example:

    move-tsm-controller -n node2

  2. Close and reopen the command window and verify the Administration Controller is running on the node by typing this command:

    tsm status -v

  3. Stop Tableau Server:

    tsm stop

  4. Add the License Service to the node:

    tsm topology set-process -pr licenseservice -n <nodeID> -c 1

  5. Remove the old License Service from the original node, where "nodeID" is the initial node that has failed:

    tsm topology set-process -pr licenseservice -n <nodeID> -c 0

  6. If the initial node had been running the only instance of CFS, add CFS to this node:

    tsm topology set-process -pr clientfileservice -n node2 -c 1

  7. (Optional) You can also add other processes that had been running on the initial node but are not running on this node. For instance, to add a cache server:

    tsm topology set-process -pr cacheserver -n node2 -c 1

  8. Apply the changes:

    tsm pending-changes apply

    The pending-changes apply command displays a prompt to let you know this will restart Tableau Server if the server is running. The prompt displays even if the server is stopped, but in that case there is no restart. You can suppress the prompt using the --ignore-prompt option, but this does not change the restart behavior. For more information, see tsm pending-changes apply.

  9. Restart the TSM Administration Controller:

    net stop tabadmincontroller_0

    net start tabadmincontroller_0

    Note: You must run these commands as an administrator from a command prompt. Depending on how your computer is configured, you may need to run them in the C:\Windows\System32 folder.

    Note: It may take a few minutes for tabadmincontroller to restart. If you attempt to apply pending changes in the next step before the controller has fully restarted, TSM will not be able to connect to the controller. You can verify that the controller is running by using the tsm status -v command. Tableau Server Administration Controller should be listed as "is running".

  10. Apply pending changes (there may not appear to be any, but this step is required):

    tsm pending-changes apply

  11. Activate the Tableau Server license on the new Controller node:

    tsm licenses activate -k <product-key>

  12. Verify the license is properly activated:

    tsm licenses list

  13. If the initial node was running the Coordination Service, you need to deploy a new Coordination Service ensemble that does not include that node. If you have a three-node cluster and the initial node was running the Coordination Service, you must deploy a new, single-instance Coordination Service ensemble on a different node and clean up the old ensemble. In this example, a single instance of the Coordination Service is deployed to the second node:

    tsm topology deploy-coordination-service -n <nodeID2>

    Wait until the server is completely switched over to the new ensemble.

  14. When the server has switched over to the new ensemble, clean up the old ensemble.

    Important: Do not do this too soon. You must wait until the server has completely switched to the new ensemble before running the cleanup command, or you can permanently break Tableau. For more information about deploying a Coordination Service ensemble, including detailed instructions for determining that the server is ready to clean up the old ensemble, see Deploy a Coordination Service Ensemble.

    tsm topology cleanup-coordination-service

  15. If the initial node was running a File Store instance, you need to remove that instance:

    tsm topology filestore decommission -n <nodeID> --delete-filestore

    Where nodeID is the initial node that has failed.

  16. Apply pending changes, using the --ignore-warnings flag if the new Coordination Service ensemble you deployed above is a single-node ensemble:

    tsm pending-changes apply --ignore-warnings

  17. Remove the initial node, where nodeID is the initial node that has failed:

    tsm topology remove-nodes -n <nodeID>

  18. Apply pending changes, using the --ignore-warnings flag if the new Coordination Service ensemble you deployed above is a single-node ensemble:

    tsm pending-changes apply --ignore-warnings

  19. Start Tableau Server:

    tsm start

    At this point your server should start, and you will be able to use TSM to configure it. The next step is to replace your initial node so your cluster has the original number of nodes. How you do this depends on whether or not you want to reuse the node that failed. We recommend that you only reuse that node if you are able to identify the reason it failed, and take steps to keep the failure from recurring.

  20. If you plan to reuse the original node, you first need to completely remove Tableau from it. Do this by running the tableau-server-obliterate script. For details on doing this, see Remove Tableau Server from Your Computer.

  21. On a fresh computer, or on your original computer after completely removing Tableau, install Tableau using your original Setup program and a bootstrap file generated from the node that is now running the Administration Controller and Licensing Service. This creates an additional node you can configure as part of your cluster. For details on how to add the node, see Install and Configure Additional Nodes.

    A best practice is to configure any processes you lost when the original node failed, to make sure your cluster is fully redundant. You may want to move processes from your new initial node to the newly added additional node to duplicate your original configuration. For example, if your initial node was only running gateway and File Store, you may want to configure the new initial node the same way.

  22. Once your nodes are up and running the way you want, you should also deploy a new Coordination Service ensemble. For details, see Deploy a Coordination Service Ensemble.
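
Because the order of steps 1 through 19 matters, it can help to write out the exact commands for your cluster before you start. The following is a minimal sketch that only builds and prints the command lines from the procedure above. It does not run anything. The names build_recovery_plan and Step and all of the parameters are hypothetical and are not part of TSM. The sketch assumes the new Coordination Service ensemble is a single instance on the new Controller node, as in the example in step 13, and it leaves out the optional processes from step 7.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Step:
    command: str     # command line as it appears in the procedure
    note: str = ""   # reminder to read before running the command


def build_recovery_plan(
    new_node: str,
    failed_node: str,
    product_key: str,
    add_cfs: bool = False,                 # step 6: initial node had the only CFS
    redeploy_coordination: bool = False,   # steps 13-14: single-instance ensemble
    remove_filestore: bool = False,        # step 15: initial node ran File Store
) -> List[Step]:
    """List the commands from steps 1-19 in order. Nothing is executed."""
    plan = [
        Step(f"move-tsm-controller -n {new_node}", "run from the scripts directory"),
        Step("tsm status -v", "close and reopen the command window first"),
        Step("tsm stop"),
        Step(f"tsm topology set-process -pr licenseservice -n {new_node} -c 1"),
        Step(f"tsm topology set-process -pr licenseservice -n {failed_node} -c 0"),
    ]
    if add_cfs:
        plan.append(
            Step(f"tsm topology set-process -pr clientfileservice -n {new_node} -c 1")
        )
    plan += [
        Step("tsm pending-changes apply"),
        Step("net stop tabadmincontroller_0", "run as administrator"),
        Step("net start tabadmincontroller_0", "run as administrator"),
        Step("tsm status -v", "repeat until the Administration Controller is running"),
        Step("tsm pending-changes apply", "required even if nothing appears pending"),
        Step(f"tsm licenses activate -k {product_key}"),
        Step("tsm licenses list"),
    ]
    apply_cmd = "tsm pending-changes apply"
    if redeploy_coordination:
        plan += [
            Step(f"tsm topology deploy-coordination-service -n {new_node}"),
            Step(
                "tsm topology cleanup-coordination-service",
                "only after the server has fully switched to the new ensemble",
            ),
        ]
        # Assumption: the new ensemble is single-node, so warnings are ignored.
        apply_cmd += " --ignore-warnings"
    if remove_filestore:
        plan.append(
            Step(f"tsm topology filestore decommission -n {failed_node} --delete-filestore")
        )
    plan += [
        Step(apply_cmd),
        Step(f"tsm topology remove-nodes -n {failed_node}"),
        Step(apply_cmd),
        Step("tsm start"),
    ]
    return plan


if __name__ == "__main__":
    for step in build_recovery_plan("node2", "node1", "<product-key>", add_cfs=True):
        print(step.command + (f"   [{step.note}]" if step.note else ""))
```

Printing the plan instead of executing it is deliberate: several steps depend on a manual check, especially the wait before cleaning up the old ensemble, so the output is meant as a checklist to follow by hand.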

 
