Troubleshoot Errors After Upgrade
The following sections detail issues frequently identified by post-upgrade checks or observed immediately following an upgrade.
For information about pre-upgrade and post-upgrade checks, see the Upgrade Reference section.
Services not running
The RMT web user interface is inaccessible, and the `rmtadmin status` command indicates that the required services have stopped.
It is detected by post-upgrade checks #1, #2, and #16.
To fix this, use the following steps:
For Linux, use the following command:
sudo -u tabrmt-master XDG_RUNTIME_DIR=/run/user/$(id -u tabrmt-master) systemctl --user start tabrmt-master
For Windows, use the following command:
Start-Service TableauResourceMonitoringTool
Check the service logs in the log directory for startup errors.
PostgreSQL not accepting connections
The web UI shows database errors and data is not being stored.
It is detected by post-upgrade checks #5c and #6.
To fix this, check the following:
- Verify the PostgreSQL service is running
- Check PostgreSQL logs in the data directory for startup errors
- Ensure the database port (default 5555) is not blocked by another process
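The port check can be sketched in Python as follows. This is a minimal illustration, not part of RMT: the helper name `port_is_free` is hypothetical, and the probe only covers the loopback address. Run it while the PostgreSQL service is stopped; if the port is still reported as in use, another process is holding it.

```python
import errno
import socket


def port_is_free(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if nothing is currently bound to host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            # Binding succeeds only when no other socket owns the address.
            probe.bind((host, port))
        except OSError as exc:
            if exc.errno == errno.EADDRINUSE:
                return False
            raise  # permission problems and similar errors are not "in use"
    return True
```

For example, `port_is_free(5555)` returning `False` while PostgreSQL is stopped suggests a port conflict.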
RabbitMQ not responding
Agents show as disconnected and no new monitoring data appears.
It is detected by post-upgrade checks #5b, #7, and #9.
To fix this, check the following:
- RabbitMQ may take 30–60 seconds to fully initialize after an upgrade. Wait, then re-check
- Verify the Erlang/beam process is running
- Check for RabbitMQ resource alarms (less than 10 GB of free disk space blocks all message publishing)
- Verify the Erlang cookie file exists and has the correct permissions
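To see whether the disk alarm threshold mentioned above is likely to be in play, free space on the volume that holds the RabbitMQ data can be measured with the Python standard library. This is a rough sketch: the function names are hypothetical, the path to check is left to you, and the threshold is interpreted here as 10 GiB, which is an assumption.

```python
import shutil

GIB = 1024 ** 3  # bytes per GiB (assumed unit for the 10 GB threshold)


def free_space_gib(path: str) -> float:
    """Free space, in GiB, on the volume that contains `path`."""
    return shutil.disk_usage(path).free / GIB


def below_threshold(path: str, threshold_gib: float = 10.0) -> bool:
    """True when free space on the volume is under the threshold."""
    return free_space_gib(path) < threshold_gib
```

If `below_threshold(...)` returns `True` for the RabbitMQ data volume, free up space before investigating further.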
Message queue backlog
Data appears delayed in the RMT web UI and dashboards show stale metrics.
It is detected by post-upgrade check #8.
A backlog immediately after an upgrade is normal as agents reconnect and resend buffered data. To fix this, do the following:
- Monitor with `rmtadmin status`. The backlog should clear within 15–30 minutes
- If the backlog persists, check that all agent services are running and connected
RMT Server workers not consuming (data not ingested after upgrade)
All RMT services report Running, the post-upgrade check passes services/ports/connectivity, but the web UI shows no new data and agents appear as "Connection Issue". `rmtadmin status` shows one or more queues with messages accumulating but a consumer count of **0**.
This occurs when the RMT Server's background workers start but fail to re-bind their RabbitMQ consumer channels after an in-place upgrade. Agents publish successfully, messages pile up in RabbitMQ, and nothing lands in the database.
It is detected by post-upgrade check #8a
To fix this, do the following:
- Restart the master: `rmtadmin restart`
- Wait ~60 seconds, then re-run `rmtadmin status`. The stranded queues should now show `Consumers = 1` and the backlog should drain within a few minutes
- If the queues still show `Consumers = 0` after the restart, inspect the background log for consumer-binding exceptions:
  - Windows: `C:\Program Files\Tableau\Tableau Resource Monitoring Tool\master\logs\background\*.log`
  - Linux: `/var/opt/tableau/tabrmt/master/logs/background/*.log`
- Search the log for `exception`, `channel`, or `binding failed`
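The log search can be scripted. The following sketch scans every file that matches a glob pattern for the three search terms listed above, case-insensitively. The function name `find_binding_errors` is hypothetical; pass it the background log path for your platform.

```python
import glob
import re

# Search terms from the troubleshooting step, matched case-insensitively.
PATTERN = re.compile(r"exception|channel|binding failed", re.IGNORECASE)


def find_binding_errors(log_glob: str):
    """Return (path, line_number, text) for each matching log line."""
    hits = []
    for path in sorted(glob.glob(log_glob)):
        # errors="replace" keeps the scan going if a log has odd bytes.
        with open(path, encoding="utf-8", errors="replace") as handle:
            for number, line in enumerate(handle, start=1):
                if PATTERN.search(line):
                    hits.append((path, number, line.rstrip()))
    return hits
```

On Linux, for example, call `find_binding_errors("/var/opt/tableau/tabrmt/master/logs/background/*.log")` and review the returned lines.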
FIPS mode mismatch
Services fail to start with cryptographic or TLS errors.
It is detected by post-upgrade check #11
To fix this, do the following:
- Ensure the OS-level FIPS setting matches the RMT configuration (`isFIPSEnabled` in setup.json)
  - Linux: check `/proc/sys/crypto/fips_enabled`
  - Windows: check `HKLM:\SYSTEM\CurrentControlSet\Control\Lsa\FIPSAlgorithmPolicy\Enabled`
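On Linux, the comparison can be sketched as follows. This example assumes that `isFIPSEnabled` is a top-level boolean in setup.json, which the source does not confirm, and the function names are hypothetical. The location of setup.json is not given here, so it is passed in as a parameter.

```python
import json


def os_fips_enabled(flag_path: str = "/proc/sys/crypto/fips_enabled") -> bool:
    """True when the kernel flag file contains 1."""
    try:
        with open(flag_path, encoding="ascii") as handle:
            return handle.read().strip() == "1"
    except FileNotFoundError:
        # No flag file: treat the OS as not running in FIPS mode.
        return False


def rmt_fips_enabled(setup_json_path: str) -> bool:
    """Read isFIPSEnabled, assumed to be a top-level key in setup.json."""
    with open(setup_json_path, encoding="utf-8") as handle:
        return bool(json.load(handle).get("isFIPSEnabled", False))


def fips_mismatch(setup_json_path: str,
                  flag_path: str = "/proc/sys/crypto/fips_enabled") -> bool:
    """True when the OS setting and the RMT setting disagree."""
    return os_fips_enabled(flag_path) != rmt_fips_enabled(setup_json_path)
```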
Configuration file corruption
Services fail to start with JSON parsing errors.
It is detected by post-upgrade check #3
To fix this, do the following:
- Validate the config file: `python3 -c "import json; json.load(open('config.json'))"`
- Restore from the pre-upgrade backup (taking a backup before every upgrade is recommended)
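The one-line validation prints a full traceback on failure. A slightly longer sketch reports only the position of the first syntax error, which makes the corrupted spot easier to find. The function name `validate_json` is hypothetical.

```python
import json


def validate_json(path: str):
    """Return None if the file parses, else a message with the error position."""
    try:
        with open(path, encoding="utf-8") as handle:
            json.load(handle)
    except json.JSONDecodeError as exc:
        # lineno and colno point at the first character the parser rejected.
        return f"{path}: line {exc.lineno}, column {exc.colno}: {exc.msg}"
    return None
```

For example, `print(validate_json("config.json") or "config.json is valid JSON")`.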
SSL certificate expired or expiring
The web browser shows certificate warnings and agents fail to connect with TLS errors.
It is detected by post-upgrade checks #10, #13
To fix this, do the following:
- Renew the HTTPS certificate and update it using `rmtadmin set-ssl-certificate`
- For RabbitMQ certificates, replace the PEM files in the certificates directory and restart the service
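Before and after renewing, it can help to see how many days a certificate has left. The sketch below uses only the Python standard library; the function names are hypothetical, and the host and port of your RMT web interface are assumptions you must supply. Because the connection uses default verification, a certificate that is already expired or untrusted raises `ssl.SSLCertVerificationError` instead of returning a date.

```python
import socket
import ssl
import time


def days_remaining(not_after: str, now: float = None) -> float:
    """Days until a certificate's notAfter timestamp (negative if expired)."""
    expires = ssl.cert_time_to_seconds(not_after)
    current = time.time() if now is None else now
    return (expires - current) / 86400


def fetch_not_after(host: str, port: int = 443, timeout: float = 5.0) -> str:
    """Return the notAfter field of the certificate served at host:port."""
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]
```

For example, `days_remaining(fetch_not_after("rmt.example.com"))`, where the host name is a placeholder.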
Disk space exhaustion
Services crash or fail to write data, and database errors appear in the logs.
It is detected by post-upgrade check #14 and pre-upgrade checks #2 and #2b.
To fix this, do the following:
- Free disk space or expand the volume
- Run database cleanup: ensure `db:cleanup:type` is set to `Purge` (not `None`) in config.json
- Archive or delete old log files
FATAL or CRITICAL errors in logs
Unexpected behavior or service instability after upgrade.
It is detected by post-upgrade check #15
To fix this, do the following:
- "Broker unreachable" errors immediately after an upgrade are typically transient. Verify that RabbitMQ is now running
- For other FATAL errors, review the full log entry and contact Tableau Support with the log file
Agent disconnections after RMT Server upgrade
Agents show as disconnected in the RMT web UI; no monitoring data from specific Tableau Server nodes.
It is detected by pre-upgrade checks #36b, #36c, #36d
Agents automatically reconnect after the RMT Server upgrade completes. Allow 5–10 minutes for them to reconnect. If they remain disconnected, check the following:
- Verify the Agent service is running on each Agent host
- If agents are running a much older version, upgrade them to match the RMT Server version
- Check network connectivity (AMQP port 5672) between Agent and RMT Server hosts
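The connectivity check can be run from an Agent host with a short Python sketch like the one below. The function name `can_connect` is hypothetical. A successful TCP connection only shows that the port is reachable; it does not prove that the AMQP or TLS handshake will succeed.

```python
import socket


def can_connect(host: str, port: int = 5672, timeout: float = 3.0) -> bool:
    """True if a TCP connection to host:port opens within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

Call it with the RMT Server host name, for example `can_connect("rmt-server.example.com")`, where the host name is a placeholder.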
Known-bad version regression
Specific features are broken due to a known defect in the installed version.
It is detected by pre-upgrade check #36a
To fix this, do the following:
- Upgrade to the next patch release (for example, from 2025.3.0 to 2025.3.1 or later)
- Contact Tableau Support for version-specific patches
Hangfire cleanup broken (W-19380780)
Database grows continuously and the hangfire schema is abnormally large (>1 GB).
It is detected by pre-upgrade checks #21d, #21g
To fix this, do the following:
- Upgrading to a fixed version corrects the `delete_hash()` function volatility
- Orphaned rows in the hangfire schema may require manual cleanup. Contact Tableau Support.
Database cleanup disabled
The database grows without bound over time and upgrades become slow.
It is detected by pre-upgrade check #21c
To fix this, do the following:
- Set `db:cleanup:type` to `Purge` in `config.json`
- Reduce `db:cleanup:afterDays` to 30–90 days (default: 14)
- Restart the RMT Director service for the change to take effect
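To confirm the current cleanup settings before editing, the values can be read with a short Python sketch. It assumes that the colon-separated names map to nested JSON objects in config.json (so `db:cleanup:type` is stored as `{"db": {"cleanup": {"type": ...}}}`). That layout is an assumption, not something this section confirms, and the function names are hypothetical.

```python
import json


def get_setting(config: dict, key: str, default=None):
    """Walk a colon-separated key through nested dictionaries."""
    node = config
    for part in key.split(":"):
        if not isinstance(node, dict) or part not in node:
            return default
        node = node[part]
    return node


def cleanup_summary(config: dict) -> dict:
    """Report the cleanup settings and whether purging is enabled."""
    cleanup_type = get_setting(config, "db:cleanup:type")
    return {
        "type": cleanup_type,
        "afterDays": get_setting(config, "db:cleanup:afterDays"),
        "purges": cleanup_type == "Purge",
    }


def load_config(path: str) -> dict:
    """Load config.json from the given path."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)
```

For example, `cleanup_summary(load_config("config.json"))` returns a dictionary whose `purges` entry is `False` when cleanup is disabled or the key is missing.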
File permission issues
Services fail to start or cannot read configuration / write logs.
It is detected by post-upgrade check #4, pre-upgrade checks #11–#16
To fix this, check the following:
- Linux: Verify ownership and group membership of config, log, and data directories
- Windows: Verify the service account has read/write access to the installation directories
- Contact Tableau Support if permission issues persist after a fresh install
Stranded RMT processes after rmtadmin stop
After running `rmtadmin stop` per the documented upgrade procedure, one or more `tabrmt-master`, `tabrmt-master-background`, `tabrmt-agent`, or `tabrmt-agent-background` processes are still running. If the upgrade proceeds in this state, the installer can fail mid-run with file-in-use errors.
It is detected by pre-upgrade check #38a
To fix this, do the following:
- Wait ~30 seconds for graceful shutdown, then re-run the pre-upgrade check
- If processes persist:
  - Linux: `pkill -u tabrmt-master -f tabrmt-master ; pkill -u tabrmt-agent -f tabrmt-agent` (skip the ones that do not apply on this node)
  - Windows: `Stop-Process -Name tabrmt-master,tabrmt-master-background,tabrmt-agent,tabrmt-agent-background -Force`
- Re-run the pre-upgrade check to confirm a clean state before starting the upgrade
False failures during co-located upgrades
If your RMT Master and Agent are co-located on the same machine, the Agent’s pre-upgrade check (specifically check #38a) may incorrectly detect Master processes. Because the script scans for all tabrmt-* processes instead of filtering strictly for tabrmt-agent-*, it may trigger a false FAIL status.
Most environments use separate machines and will not encounter this. However, if you are using a co-located setup and receive this specific failure, you can safely bypass the check and proceed with the installation by using the following option:
--skip-upgrade-checks
Technical Details:
Root Cause: Check #38a matches overly broad process names (tabrmt-*).
Scope: Only affects in-place upgrades where Master and Agent share the same OS instance.
Risk Level: Low. The Master processes are out of scope for the Agent upgrade, so the failure does not indicate an actual system conflict.
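The root cause can be illustrated with a small Python sketch. This is not the actual check script; it only shows, using shell-style patterns from the standard library, why a broad `tabrmt-*` pattern picks up Master processes on a co-located host while an Agent-specific pattern does not. The pattern `tabrmt-agent*` is chosen here so that it covers both Agent process names.

```python
from fnmatch import fnmatchcase

# The four process names listed in the stranded-process check.
PROCESS_NAMES = [
    "tabrmt-master",
    "tabrmt-master-background",
    "tabrmt-agent",
    "tabrmt-agent-background",
]


def matching(names, pattern):
    """Return the names that match a shell-style pattern, case-sensitively."""
    return [name for name in names if fnmatchcase(name, pattern)]


# Broad pattern: matches all four names, including the Master processes.
broad = matching(PROCESS_NAMES, "tabrmt-*")

# Agent-only pattern: matches just the two Agent processes.
agent_only = matching(PROCESS_NAMES, "tabrmt-agent*")
```

On a co-located host where only the Master processes are still running, the broad pattern reports a match and the Agent check fails, even though no Agent process is present.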
