Use R (Rserve) scripts in your flow

Disclaimer: This topic includes information about a third-party product. Please note that while we make every effort to keep references to third-party content accurate, the information we provide here might change without notice as R and Rserve change. For the most up-to-date information, please consult the R and Rserve documentation and support.

R is an open source software programming language and a software environment for statistical computing and graphics. To extend the functionality of Tableau Prep Builder, you can create scripts in R to use in your flow that run through an Rserve server to produce output that you can further work with in your flow.

For example, you might want to add statistical modeling data or forecasting data to the data that you already have in your flow using a script in R, then use the power of Tableau Prep Builder to clean the resulting data set for analysis.

To include R scripts in your flow, you need to configure a connection between Tableau Prep Builder and an Rserve server. Then you can use R scripts to apply supported functions to data from your flow using R expressions. After you enter the configuration details and point Tableau Prep Builder to the file and function that you want to use, data is securely passed to the Rserve server, the expressions are applied, and the results are returned as a table (R data.frame) that you can clean or output as needed.

Note: To run flows that include script steps on Tableau Server (version 2019.3 and later), Tableau Server must also have a connection to an Rserve server.


To include R script steps in your flow, install R and configure a connection to an Rserve server.


Configure Rserve Server for Tableau Server

If you intend to publish flows that include script steps and run them on Tableau Server, you will need to configure a connection between your Rserve server and Tableau Server version 2019.3 or later. Running flows with script steps in Tableau Online isn't currently supported.

  1. Open the TSM command line.
  2. Enter the following commands to set the host address, port values, and connect timeout:

    tsm security maestro-rserve-ssl enable --connection-type {maestro-rserve-secure/maestro-rserve} --rserve-host <Rserve IP address or host name> --rserve-port <Rserve port> --rserve-username <Rserve username> --rserve-password <Rserve password> --rserve-connect-timeout-ms <RServe connect timeout>

    • Select {maestro-rserve-secure} to enable a secure connection or {maestro-rserve} to enable an unsecured connection.
    • If you select {maestro-rserve-secure}, specify the certificate file -cf<certificate file path> in the command line.
    • Specify the --rserve-connect-timeout-ms <RServe connect timeout> in milliseconds. For example, --rserve-connect-timeout-ms 900000.
  3. To disable the Rserve connection, enter the following command:

    tsm security maestro-rserve-ssl disable

Additional Rserve configuration (optional)

To customize Rserve, you can create a file named Rserv.cfg that sets default configuration values and place it in the /etc/Rserve.conf installation location. Adding extra values to this configuration can improve stability between the Rserve server and Tableau Prep Builder. When you launch Rserve, you can refer to this file to apply your configuration options. For example:

  • Windows: Rserve(args="--RS-conf C:\\folder\\Rserv.cfg")
  • macOS and Linux: Rserve(args=" --no-save --RS-conf ~/Documents/Rserv.cfg")

The following example shows some additional options you can include in your Rserve.conf configuration file:

# If your data includes characters other than ASCII, make it explicit that data should be UTF8 encoded.
encoding utf8 
# Disable interactive behavior for Rserve or Tableau Prep Builder will stall when trying to run the script as it waits for an input response.
interactive no
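
The options and launch commands above can be combined in a short R session. The following is a minimal sketch, assuming the Rserve package is installed. The helper name write_rserve_config and the temporary file location are illustrative choices, not part of Rserve or Tableau Prep Builder, and the argument string mirrors the macOS and Linux launch example.

```r
# Hypothetical helper: writes the two options shown above to a config file.
write_rserve_config <- function(path) {
  writeLines(c("encoding utf8", "interactive no"), path)
  invisible(path)
}

# Assumed location for illustration; use the folder where you keep Rserv.cfg.
cfg_path <- file.path(tempdir(), "Rserv.cfg")
write_rserve_config(cfg_path)

# Build the argument string in the same form as the launch examples above.
# Paths that contain spaces may need additional quoting.
rserve_args <- paste("--no-save --RS-conf", cfg_path)
print(rserve_args)

# Start the server only when you are ready (requires the Rserve package):
# library(Rserve)
# Rserve(args = rserve_args)
```

The launch call is left commented out so that the sketch can be run without starting a server.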

For information about setting up an Rserve.conf file, see the Advanced Rserve configuration section in the R Implementation notes (community post).

Create your R script

When you create your script, include a function that takes a data frame as its argument. Tableau Prep Builder passes the data from your flow to the function through this argument. The function must also return its results in a data frame using supported data types.

For example:

postal_cluster <- function(df) {
  # Cluster on a two-column matrix (one row per record), not one long vector.
  out <- kmeans(cbind(df$Latitude, df$Longitude), 3, iter.max=10)
  return(data.frame(Latitude=df$Latitude, Longitude=df$Longitude, Cluster=out$cluster))
}
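
Before pointing Tableau Prep Builder at a script, it can help to call the function directly in an R session. The sketch below repeats the function so that it runs on its own and feeds it a small, made-up data frame. The coordinates and the set.seed call are assumptions added for a repeatable local check; in a flow, the data frame comes from the step before the script step.

```r
postal_cluster <- function(df) {
  # Cluster on a two-column matrix so that each row is one location.
  out <- kmeans(cbind(df$Latitude, df$Longitude), 3, iter.max = 10)
  data.frame(Latitude = df$Latitude, Longitude = df$Longitude,
             Cluster = as.integer(out$cluster))
}

# Made-up sample input standing in for the data passed from the flow.
sample_df <- data.frame(
  Latitude  = c(47.61, 47.62, 40.71, 40.73, 34.05, 34.10),
  Longitude = c(-122.33, -122.35, -74.01, -74.00, -118.24, -118.30)
)

set.seed(42)  # k-means starts from random centers; fix the seed for repeatability
result <- postal_cluster(sample_df)
print(result)
```

The result should contain one row per input row, with a Cluster value between 1 and 3.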

The following data types are supported:

Data type in Tableau Prep Builder    Data type in R
String                               Standard UTF-8 string
Decimal                              Double
Int                                  Integer
Bool                                 Logical
Date                                 String in ISO_DATE format “YYYY-MM-DD” with optional zone offset. For example, “2011-12-03+01:00” is a valid date.
DateTime                             String in ISO_DATE_TIME format “YYYY-MM-DDTHH:mm:ss” with optional zone offset. For example, “2011-12-03T10:15:30+01:00” is a valid date.

Note: Date and DateTime must always be returned as a valid string. Native Date (DateTime) types in R aren't supported as returned values but can be used in the script.
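
As an illustration of the note above, the following sketch parses date strings into native R dates for the calculation and converts them back to ISO strings before returning. The function name add_due_date, the OrderDate and DueDate fields, and the 30-day offset are hypothetical, and the input strings are assumed to carry no zone offset.

```r
add_due_date <- function(df) {
  # Parse the incoming ISO_DATE strings into native Date values for arithmetic.
  order_date <- as.Date(df$OrderDate, format = "%Y-%m-%d")
  due_date <- order_date + 30

  # Convert back to "YYYY-MM-DD" strings before returning.
  data.frame(
    OrderDate = format(order_date, "%Y-%m-%d"),
    DueDate = format(due_date, "%Y-%m-%d"),
    stringsAsFactors = FALSE
  )
}

example <- add_due_date(data.frame(OrderDate = c("2011-12-03", "2012-02-28"),
                                   stringsAsFactors = FALSE))
print(example)

# A DateTime value can be turned into a string in the same way.
stamp <- as.POSIXct("2011-12-03 10:15:30", tz = "UTC")
stamp_string <- format(stamp, "%Y-%m-%dT%H:%M:%S")
print(stamp_string)
```

Because DueDate is a field that isn't in the input, a script like this would also need a getOutputSchema function that declares it.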

If you want to return different fields than what you input, you'll need to include a getOutputSchema function in your script that defines the output and data types. Otherwise, the output will use the fields from the input data, which are taken from the step that is prior to the script step in the flow.

Use the following syntax when specifying the data types for your fields in the getOutputSchema function:

Function in R       Resulting data type
prep_string ()      String
prep_decimal ()     Decimal
prep_int ()         Integer
prep_bool ()        Boolean
prep_date ()        Date
prep_datetime ()    DateTime

The following example shows the getOutputSchema function for the postal_cluster script:

getOutputSchema <- function() {
  return (data.frame (
    Latitude = prep_decimal (),
    Longitude = prep_decimal (),
    Cluster = prep_int ()));
}
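
To illustrate a script whose output fields differ from its input, the sketch below returns one row per cluster instead of one row per record. The function name cluster_centers and the CenterLatitude and CenterLongitude fields are hypothetical. The prep_ functions are supplied when the script runs through Tableau Prep Builder; the stand-ins defined at the top are an assumption for trying the script in a plain R session only and should be left out of the script file used in a flow.

```r
# Local stand-ins ONLY (assumption): Tableau Prep Builder provides the real
# prep_* functions at run time. Remove these lines from the script you deploy.
if (!exists("prep_int")) prep_int <- function() integer(0)
if (!exists("prep_decimal")) prep_decimal <- function() numeric(0)

# Hypothetical function: returns the center of each cluster.
cluster_centers <- function(df) {
  out <- kmeans(cbind(df$Latitude, df$Longitude), 3, iter.max = 10)
  data.frame(
    Cluster = seq_len(nrow(out$centers)),
    CenterLatitude = out$centers[, 1],
    CenterLongitude = out$centers[, 2],
    row.names = NULL
  )
}

# Declares the fields and data types that cluster_centers returns.
getOutputSchema <- function() {
  return(data.frame(
    Cluster = prep_int(),
    CenterLatitude = prep_decimal(),
    CenterLongitude = prep_decimal()
  ))
}

# Local check with made-up coordinates.
sample_df <- data.frame(
  Latitude  = c(47.61, 47.62, 40.71, 40.73, 34.05, 34.10),
  Longitude = c(-122.33, -122.35, -74.01, -74.00, -118.24, -118.30)
)
set.seed(1)
centers <- cluster_centers(sample_df)
print(centers)
```

Keeping the field names in getOutputSchema identical to the names in the returned data frame makes a mismatch easy to spot during a local check.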

Connect to your Rserve server

Important: Starting in version 2020.3.3, configure your server connection once from the top Help menu instead of setting up your connection per flow in the Script step by clicking Connect to Rserve Server and entering your connection details. You will need to reconfigure your connection using this new menu for any flows that were created in an older version of Tableau Prep Builder that you open in version 2020.3.3.

  1. Select Help > Settings and Performance > Manage Analytics Extension Connection.
  2. In the Select an Analytics Extension drop-down list, select Rserve.

  3. Enter your credentials:
    • Port 6311 is the default port for plaintext Rserve servers.
    • Port 4912 is the default port for SSL-encrypted Rserve servers.
    • If the server requires credentials, enter a Username and Password.
    • If the server uses SSL encryption, select the Require SSL check box, then click the Custom configuration file link to specify a certificate for the connection.

      Note: Tableau Prep Builder doesn't provide a way to test the connection. If there is a problem with the connection, an error message appears when you try to run the flow.
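
Because the connection can't be tested from within Tableau Prep Builder, one option is to check from an R session that the Rserve port accepts connections before you run the flow. The following sketch uses only base R. The function name rserve_port_open is hypothetical, and the check confirms only that something is listening on the port, not that the credentials or SSL settings are correct.

```r
# Hypothetical helper: TRUE if a TCP connection to host:port can be opened.
rserve_port_open <- function(host = "localhost", port = 6311, timeout_s = 2) {
  tryCatch({
    con <- suppressWarnings(
      socketConnection(host = host, port = port, blocking = TRUE,
                       open = "r+b", timeout = timeout_s)
    )
    close(con)
    TRUE
  }, error = function(e) FALSE)
}

# 6311 is the default plaintext port listed above; adjust it for your server.
if (rserve_port_open()) {
  message("Port 6311 is accepting connections.")
} else {
  message("No response on port 6311; check that Rserve is running.")
}
```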

Add a script to your flow

Start your Rserve server, then complete the following steps:

  1. Open Tableau Prep Builder and click the Add connection button.

  2. From the list of connectors, select the file type or server that hosts your data. If prompted, enter the information needed to sign in and access your data.

  3. Click the plus icon, and select Add Script from the context menu.

  4. In the Script pane, under Connection type, select Rserve.

  5. In the File Name section, click Browse to select your script file.
  6. Enter the Function Name, then press Enter to run your script.
