Predictive Modeling Functions
This article introduces predictive modeling functions and their uses in Tableau. It also demonstrates with an example how to create table calculations using the predictive modeling functions.
Why use predictive modeling functions
Predictive modeling functions can help you quickly generate predictions that can be manipulated, visualized, and exported like data using table calculations.
Before, you may have had to integrate Tableau with R and Python in order to perform advanced statistical calculations and visualize them in Tableau. Now, you can select targets and predictors by updating the variables and visualizing multiple models with different combinations of predictors. The data can be filtered, aggregated, and transformed at all levels of detail, with inputs and predictions automatically recalculated to match the data in the view.
For more information about predictive modeling functions in Tableau, see How Predictive Modeling Functions Work in Tableau
Predictive modeling functions available in Tableau
MODEL_PERCENTILE
Syntax | MODEL_PERCENTILE(
|
Definition | Returns the probability (between 0 and 1) of the expected value being less than or equal to the observed mark, defined by the target expression and other predictors. This is the Posterior Predictive Distribution Function, also known as the Cumulative Distribution Function (CDF). |
Example | MODEL_PERCENTILE( SUM([Sales]),COUNT([Orders])) |
MODEL_QUANTILE
Syntax | MODEL_QUANTILE(
|
Definition | Returns a target numeric value within the probable range defined by the target expression and other predictors, at a specified quantile. This is the Posterior Predictive Quantile. |
Example | MODEL_QUANTILE(0.5, SUM([Sales]), COUNT([Orders])) |
Create a prediction calculation
Follow along with the steps below to learn how to create a simple prediction calculation using the MODEL_QUANTILE function. For a more detailed example, see Example - Explore Female Life Expectancy with Predictive Modeling Functions
Step 1: Create a Visualization
- In Tableau Desktop, connect to the Sample - Superstore saved data source, which comes with Tableau.
- Navigate to a worksheet.
- From the Data pane, drag the Order Date dimension to the Columns shelf.
- Open the measure’s context menu to change its list level to Month and Year:
- Drag Sales to the Rows shelf.
Step 2: Create the Calculated Field
- Click to open the Analysis menu at the top, and then select Create Calculated Field.
- In the Calculation Editor, do the following:
- Name the calculation: Predict Median Sales.
- Enter the following formula:
MODEL_QUANTILE(0.5, SUM([Sales]),ATTR(DATETRUNC('month', [Order Date])))
Remember: The MODEL_QUANTILE function takes a given quantile and predicts values based on the predictors you input.
Let's break this down:
- In this case, the quantile = 0.5, which predicts the median.
- We want to predict sales, so the target expression is SUM([Sales]).
- We want to base the prediction on past performance, so we include date as a predictor, which is the last argument in the calculation.
- When finished, click OK.
The prediction calculation is now added as a calculated field in the Data pane.
Step 3: Add the Prediction Calculation to the View
- Drag the prediction calculation to the Rows shelf, to the right of SUM(Sales).
- Right-click (Control-click on Mac) the measure and select Dual Axis.
- To align the two axes in a dual axes chart to use the same scale, right-click (Control-click on Mac) the secondary axis, in this case Predict Median Sales, and select Synchronize Axis. This aligns the scale of the two axes.
That's all there is to it. To find out how you can extend a date axis and predict the future, see Predictive Modeling Functions in Time Series Visualizations.
Rules for Prediction Calculations
- You can’t mix aggregate and non-aggregate arguments. If the target expression is an aggregate, so must the predictor.
- The functions are best used to predict values for individual records, on vizzes where each mark represents a discrete entity, such as a person, a product, a sale, etc.
- The functions are best used to predict values for aggregated target expressions using SUM and COUNT.
- The functions aren’t recommended to predict values for aggregated target expressions using AVG, MEDIAN, MIN, or MAX.
- The functions should use predictors that are at the same level of detail or higher than the viz.