Analyzed Fields in Explain Data

Explain Data runs a statistical analysis on a dashboard or sheet to find marks that are outliers, or on a specific mark that you select. The analysis also considers possibly related data points from the data source that aren't represented in the current view.

Explain Data might not include every column from the data source in the analysis. In many cases, certain types of fields will be automatically excluded from the analysis. For more information, see Fields excluded by default.

Explain Data view analyzed fields.

Note: Dimensions with more than 500 unique values won't be considered for analysis (unless allowed by the author in Explain Data Settings).
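
To anticipate which dimensions this limit affects, you can count distinct values per column in an extract of your data before running Explain Data. The following is a minimal sketch, not part of Tableau: the function name `high_cardinality_columns`, the sample rows, and the column names are hypothetical. Only the threshold of 500 comes from the note above.

```python
from collections import defaultdict

# Threshold taken from the note above: more than 500 unique values.
MAX_UNIQUE_VALUES = 500


def high_cardinality_columns(rows, limit=MAX_UNIQUE_VALUES):
    """Return {column: distinct_count} for columns with more than `limit` distinct values.

    `rows` is an iterable of dicts, for example the output of csv.DictReader.
    """
    seen = defaultdict(set)
    for row in rows:
        for column, value in row.items():
            seen[column].add(value)
    return {column: len(values) for column, values in seen.items() if len(values) > limit}


# Hypothetical sample data: 600 distinct order IDs spread over 4 regions.
REGIONS = ["East", "West", "North", "South"]
rows = [{"Order ID": f"O-{i}", "Region": REGIONS[i % 4]} for i in range(600)]

print(high_cardinality_columns(rows))  # {'Order ID': 600}
```

In this sample, "Order ID" exceeds the limit and "Region" does not, so only "Order ID" would be left out of the analysis unless the author allows it in Explain Data Settings.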

All users can view information on which fields are included in or excluded from the current analysis. Creators and Explorers who have editing permissions can edit the fields used by Explain Data for statistical analysis.

View fields analyzed by Explain Data

When you expand an explanation for a measure that is contributing to the value of the mark, a link that indicates the number of fields considered in the analysis is displayed at the bottom of the Data Guide pane.

Explain Data details with a link to view the 14 of 23 fields that are relevant to the higher sum value.

Click the link to see the list of fields included in or excluded from the current statistical analysis.

When a data source contains more than 1,000 unvisualized dimensions or measures, you might see an alert asking if you want Explain Data to consider more fields. Click Explain All to run an analysis that includes more fields. This analysis may take longer to complete.

To view fields used by Explain Data for statistical analysis

  1. Run Explain Data on a dashboard, sheet, or mark.
  2. In the Data Guide pane, under Contributing to the value of, click a measure name.

    Explain Data details about an analyzed mark, with text highlighted that describes a higher than expected trip distance sum value of 8,800.

  3. Click the number-of-fields link at the bottom of the pane.

    Explain Data details about an analyzed mark, with text highlighted that reads "12 of 23 fields".

Change fields used for statistical analysis

Creators and Explorers who have editing permissions can select fields to be included in or excluded from the statistical analysis in the Fields tab of the Explain Data Settings dialog box.

Explain Data Settings window that provides information about the data settings, including fields and data types.

When a data source contains dimensions with a large number of unique values (more than 500), those fields won't be considered for analysis.

To edit the fields used by Explain Data for statistical analysis

Settings for analyzed fields are applied at the data source level.

  1. Run Explain Data on a mark when editing a view.
  2. In the Data Guide pane, click the settings icon at the bottom of the pane. Or, click the Edit button in the Analyzed Fields view (see View fields analyzed by Explain Data).

    Edit button on the Analyzed Fields pane.
  3. In the Explain Data Settings dialog box, click the Fields tab.
  4. Click a drop-down arrow next to a field name, select Automatic or Never Include, and then click OK.

    Note that fields must have 500 or fewer unique values to be included in the analysis.

    Explain Data Settings window with a measure selected and the dropdown menu expanded to reveal options for including or excluding the field.

Fields excluded by default

Fields excluded by default, each followed by the reason for exclusion:

All unvisualized measures when there are more than 1,000 measures in the data source; all unvisualized dimensions when there are more than 1,000 dimensions in the data source
    Computing explanations for more than 1,000 unvisualized measures or dimensions can take longer, sometimes several minutes. These fields are excluded from the initial analysis by default, but you can choose to include them for further analysis.
    In this situation, you might see an alert asking if you want Explain Data to consider more fields. Click the alert link to get more information. Click Explain All to run an analysis that includes more fields.

Fields that use geometry, latitude, or longitude
    Geometry, latitude, or longitude by themselves can never be explanations. An explanation that calls out latitude or longitude is most likely due to a spurious correlation rather than a probable cause.

Dimensions with high cardinality (more than 500 members)
    High-cardinality dimensions take longer to compute. Dimensions with more than 500 unique values are not considered for analysis.

Groups, bins, or sets
    Not currently supported.

Table calculations
    Table calculations cannot be analyzed when they are at a different level of detail than the view.

Unvisualized measures that can't be averaged
    These include measures that are calculated fields whose calculation expression includes aggregations (they display as AGG() fields when added to the sheet).

Discrete measures and continuous dimensions
    Not currently supported.

Hidden fields
    Not available.

Calculated fields with errors
    No values present to analyze.
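
The default exclusion rules above can be read as a series of checks applied to each field. The following is a rough sketch of that reading, not Tableau's implementation: the `Field` class, its attribute names, the `exclusion_reason` function, and the order of the checks are all assumptions made for illustration. Only the thresholds (500 unique values, 1,000 fields) and the categories come from the table. Table calculations and discrete measures or continuous dimensions are left out to keep the example short.

```python
from dataclasses import dataclass
from typing import Optional

MAX_UNIQUE_VALUES = 500         # cardinality limit for dimensions (from the table)
MAX_FIELDS_PER_ROLE = 1000      # measure or dimension count limit (from the table)


@dataclass
class Field:
    """Hypothetical description of a data source field."""
    name: str
    role: str                       # "dimension" or "measure"
    visualized: bool = False        # True if the field is used in the current view
    unique_values: int = 0          # distinct members (meaningful for dimensions)
    geographic: bool = False        # geometry, latitude, or longitude
    grouping: bool = False          # group, bin, or set
    hidden: bool = False
    has_error: bool = False         # calculated field with errors
    aggregated_calc: bool = False   # calculation that includes aggregations (AGG)


def exclusion_reason(field: Field, role_count: int) -> Optional[str]:
    """Return why `field` would be excluded by default, or None if it is kept.

    `role_count` is the number of fields in the data source that share the
    field's role (all measures, or all dimensions).
    """
    if field.hidden:
        return "hidden field"
    if field.has_error:
        return "calculated field with errors"
    if field.geographic:
        return "geometry, latitude, or longitude"
    if field.grouping:
        return "group, bin, or set"
    if field.role == "dimension" and field.unique_values > MAX_UNIQUE_VALUES:
        return "high cardinality"
    if not field.visualized:
        if field.role == "measure" and field.aggregated_calc:
            return "unvisualized measure that can't be averaged"
        if role_count > MAX_FIELDS_PER_ROLE:
            return "more than 1,000 fields of this role in the data source"
    return None


# Hypothetical fields, used only to exercise the checks.
fields = [
    Field("Customer ID", "dimension", unique_values=12000),
    Field("Region", "dimension", unique_values=4),
    Field("Latitude", "measure", geographic=True),
    Field("Profit Ratio", "measure", aggregated_calc=True),
    Field("Sales", "measure", visualized=True),
]
counts = {"dimension": 2, "measure": 3}

for f in fields:
    print(f.name, "->", exclusion_reason(f, counts[f.role]) or "included")
```

A field that passes every check is only a candidate for analysis. An author can still set it to Never Include in the Fields tab of the Explain Data Settings dialog box.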