Analyzed Fields in Explain Data

When you run Explain Data on a mark, it runs a statistical analysis on the aggregated mark, and then on potentially related fields from the data source that aren't represented in the current view.

Explain Data might not include every column from the data source in the analysis. In many cases, certain types of fields will be automatically excluded from the analysis. For more information, see Fields excluded by default.

Note: Dimensions with more than 500 unique values won't be considered for analysis (unless allowed by the author in Explain Data Settings).
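To see in advance whether a dimension would exceed the 500-value limit, you can count its distinct values outside of Tableau. The following is a minimal Python sketch, not part of Tableau or any Tableau API. The function name, the constant, and the sample data are hypothetical and used only for illustration.

```python
# Hypothetical helper for illustration; not a Tableau function.
CARDINALITY_LIMIT = 500  # threshold taken from the note above


def exceeds_cardinality_limit(values, limit=CARDINALITY_LIMIT):
    """Return True when the column has more than `limit` distinct values."""
    return len(set(values)) > limit


# Sample columns (made-up data).
customer_ids = [f"C{i}" for i in range(750)]          # 750 distinct values
regions = ["East", "West", "South", "Central"] * 100  # 4 distinct values

print(exceeds_cardinality_limit(customer_ids))  # True
print(exceeds_cardinality_limit(regions))       # False
```

A dimension with exactly 500 distinct values does not exceed the limit in this sketch, which matches the "more than 500" wording of the note.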

All users can view information on which fields are included or excluded in the current analysis. Creators and Explorers who have editing permissions can edit the fields used by Explain Data for statistical analysis.

View fields analyzed by Explain Data

When you expand an explanation for a measure that is contributing to the value of the mark, a link that indicates the number of fields considered in the analysis is displayed at the bottom of the Explain Data pane.

Click the link to see the list of fields included in or excluded from the current statistical analysis.

When a data source contains more than 1,000 unvisualized dimensions or measures, you might see an alert asking if you want Explain Data to consider more fields. Click Explain All to run an analysis that includes more fields. This analysis might take longer to complete.

To view fields used by Explain Data for statistical analysis

  1. Run Explain Data on a mark.
  2. In the Explain Data pane, under Contributing to the value of, click a measure name.
  3. Click the number-of-fields link at the bottom of the pane.



Change fields used for statistical analysis

Creators and Explorers who have editing permissions can select fields to be included or excluded from the statistical analysis in the Fields tab of the Explain Data Settings dialog box.

When a data source contains dimensions with a large number of unique values (more than 500), those fields won't be considered for analysis.

To edit the fields used by Explain Data for statistical analysis

Settings for analyzed fields are applied at the data source level.

  1. Run Explain Data on a mark when editing a view.
  2. In the Explain Data pane, click the settings icon at the bottom of the pane. Alternatively, click the Edit button in the Analyzed Fields view (see View fields analyzed by Explain Data).
  3. In the Explain Data Settings dialog box, click the Fields tab.
  4. Click a drop-down arrow next to a field name, select Automatic or Never Include, and then click OK.

    Note that dimensions must have no more than 500 unique values to be included in the analysis.


Fields excluded by default

The following fields are excluded by default. The reason for each exclusion is indented beneath it.

All unvisualized measures when there are more than 1,000 measures in the data source. All unvisualized dimensions when there are more than 1,000 dimensions in the data source.
    Explanations for more than 1,000 unvisualized measures or dimensions can take longer to compute, sometimes several minutes. These fields are excluded by default for initial analysis, but you can choose to include them for further analysis.
    In this situation, you might see an alert asking if you want Explain Data to consider more fields. Click the alert link to get more information. Click Explain All to run an analysis that includes more fields.

Fields that use geometry, latitude, or longitude
    Geometry, latitude, or longitude by themselves can never be explanations. An explanation that calls out latitude or longitude is highly likely to reflect a spurious correlation rather than a probable explanation.

Dimensions with high cardinality (dimensions with more than 500 members)
    High-cardinality dimensions take longer to compute. Dimensions with more than 500 unique values will not be considered for analysis.

Groups, bins, or sets
    Not currently supported.

Table calculations
    Table calculations cannot be analyzed when they are at a different level of detail than the view.

Unvisualized measures that can't be averaged
    These include measures that are calculated fields whose expression includes aggregations (they display as AGG() fields when added to the sheet).

Discrete measures and continuous dimensions
    Not currently supported.

Hidden fields
    Not available.

Calculated fields with errors
    No values present to analyze.
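The default exclusions listed above can be read as an ordered set of checks on each unvisualized field. The following Python sketch illustrates that reading only. It is not Tableau's implementation, and the `Field` class, its attributes, and the function name are hypothetical. The order of the checks is also an assumption, and the table calculation rule is simplified to always exclude.

```python
from dataclasses import dataclass
from typing import Optional

CARDINALITY_LIMIT = 500    # dimensions with more members are skipped
FIELD_COUNT_LIMIT = 1000   # per-role limit on fields in the data source


@dataclass
class Field:
    """Hypothetical description of an unvisualized field (not a Tableau type)."""
    name: str
    role: str                       # "dimension" or "measure"
    kind: str = "regular"           # "regular", "geo", "group", "bin", "set", "table_calc"
    continuous: bool = False
    unique_values: int = 0
    hidden: bool = False
    has_error: bool = False
    uses_aggregation: bool = False  # calculated field displayed as AGG()


def default_exclusion_reason(field: Field, role_count_in_source: int) -> Optional[str]:
    """Return a reason string if the field would be skipped by default, else None."""
    if field.hidden:
        return "hidden field"
    if field.has_error:
        return "calculated field with errors"
    if field.kind == "geo":
        return "geometry, latitude, or longitude"
    if field.kind in ("group", "bin", "set"):
        return "groups, bins, and sets are not supported"
    if field.kind == "table_calc":
        # Simplification: the list above ties this to a level-of-detail mismatch.
        return "table calculation"
    if field.role == "measure" and not field.continuous:
        return "discrete measure"
    if field.role == "dimension" and field.continuous:
        return "continuous dimension"
    if field.role == "measure" and field.uses_aggregation:
        return "measure that can't be averaged"
    if field.role == "dimension" and field.unique_values > CARDINALITY_LIMIT:
        return "high-cardinality dimension"
    if role_count_in_source > FIELD_COUNT_LIMIT:
        return "more than 1,000 fields of this role in the data source"
    return None


# Made-up fields for illustration.
fields = [
    Field("Customer ID", "dimension", unique_values=12000),
    Field("Segment", "dimension", unique_values=3),
    Field("Latitude", "measure", kind="geo", continuous=True),
    Field("Profit Ratio", "measure", continuous=True, uses_aggregation=True),
    Field("Discount", "measure", continuous=True),
]
reasons = {f.name: default_exclusion_reason(f, role_count_in_source=20) for f in fields}
for name, reason in reasons.items():
    print(f"{name}: {reason or 'included'}")
```

Returning the first matching reason keeps the sketch short. A field that matches several rows of the list, such as a hidden high-cardinality dimension, reports only one of them.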