Choosing a Predictive Model

Predictive modelling functions support linear regression, regularised linear regression and Gaussian process regression. These models support different use cases and prediction types, as well as having different limitations.

Supported models

Linear regression

Linear regression (also known as ordinary least squares regression, or OLS) is best used when one or more predictors have a linear relationship with the prediction target, aren't affected by the same underlying conditions, and don't represent two instances of the same data (for example, sales expressed in both dollars and euros). Linear regression is the default model for predictive modelling functions in Tableau; if you don't specify a model, linear regression will be used. You can explicitly specify this model by including "model=linear" as the first argument in your table calculation.

Example:

MODEL_QUANTILE(
"model=linear",
0.5,
SUM([Sales]),
ATTR(DATETRUNC('month',([Order Date])))
)
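
Because linear regression is the default, omitting the model argument should select the same model. A minimal sketch, assuming the same [Sales] and [Order Date] fields as the example above:

MODEL_QUANTILE(
0.5,
SUM([Sales]),
ATTR(DATETRUNC('month',([Order Date])))
)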

Regularised linear regression

Regularised linear regression is best used when there's an approximate linear relationship between two or more independent variables, also known as multicollinearity. This is frequently observed in real-world data sets. To use this model instead of the default linear regression, include "model=rl" as the first argument in your table calculation.

Example:

MODEL_QUANTILE(
"model=rl",
0.5,
SUM([Sales]),
ATTR(DATETRUNC('month',([Order Date])))
)
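
Regularised linear regression is aimed at cases with several related predictors, so a calculation with more than one predictor may be more representative. The following is a sketch only; the [Profit] and [Quantity] fields are assumed for illustration, and whether [Sales] and [Quantity] are actually correlated depends on your data:

MODEL_QUANTILE(
"model=rl",
0.5,
SUM([Profit]),
SUM([Sales]),
SUM([Quantity])
)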

Gaussian process regression

Gaussian process regression is best used when generating predictions across a continuous domain, such as time or space, or when there is a nonlinear relationship between the predictor and the prediction target. Gaussian process regression in Tableau must have exactly one ordered dimension as a predictor, and may also include multiple unordered dimensions as predictors. Note that measures can't be used as predictors in Gaussian process regression in Tableau. To use this model instead of the default linear regression, include "model=gp" as the first argument in your table calculation.

Note: An ordered dimension is any dimension whose values can be sequenced, such as MONTH. An unordered dimension is any dimension whose values don't have an inherent sequence, such as gender or colour.

Example:

MODEL_PERCENTILE(
"model=gp",
AVG([Days to Ship Actual]),
ATTR(DATETRUNC('month',([Order Date])))
)
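
As a sketch of combining an ordered dimension with an unordered one, the calculation below uses the month of [Order Date] as the single ordered dimension and adds a hypothetical [Category] dimension as a second, unordered predictor. The [Category] field is an assumption made for illustration:

MODEL_QUANTILE(
"model=gp",
0.5,
SUM([Sales]),
ATTR(DATETRUNC('month',([Order Date]))),
ATTR([Category])
)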


As a simple heuristic, you can use the following criteria to select your model:

  • Linear regression (default): Use when you have only one predictor which has a linear relationship with your target metric.

  • Regularised linear regression: Use when you have multiple predictors, especially when those predictors have a linear relationship to the target metric and those predictors are likely affected by similar underlying relationships or trends.

  • Gaussian process regression: Use when you have time or space predictors, or when you're using predictors that might not have a linear relationship with the target metric.
