When to Use a Multi-fact Relationship Model
A multi-fact relationship model is a data model that lets you add unrelated tables in a single data source and then uses related fields during visual analysis to essentially stitch the tables together based on the context. Unlike blending, the data exists within a single data source – the concepts of primary and secondary data sources don't apply and no data is dropped from left joins. Unlike a single table data model, multiple base tables maintain their own context regarding tables shared between them. A multi-fact relationship data model gives you more options for performing multi-fact analysis in Tableau.
Imagine you want to analyse how weather and ice cream sales trend together. Weather and ice cream sales both happen at specific times and specific places, but there's no direct connection between ice cream sales and weather. These are unrelated pieces of data that both relate to the shared concepts of date and location.
This question lends itself to creating a multi-fact relationship model. Ice Cream Sales and Weather can each be added as a base table and related on Date and Location, which are shared tables.
Why did we build the capability to model unrelated tables?
Analysis often involves bringing together tables of data that don't even have a direct relationship to each other but that both relate to the same, common information (such as date or location). A multi-fact relationship model supports loose semantic coupling by introducing the concept of degrees of relatedness and the ability to build a data model with multiple, unrelated base tables.
Semantic coupling is a term used to describe how tightly combined data is. A join or a union is a tight semantic coupling; they bring multiple tables together into a new physical table that then acts as a single table. A relationship is a looser coupling between tables that ties tables together logically, maintaining their distinct status as separate tables. Even further along the semantic coupling spectrum is data blending, where results from separate data sources are visually combined based on elements shared between both of them. A multi-fact relationship model is closer to the blending end of the spectrum, but within a single data source instead of across data sources.
A multi-fact relationship model – a data model with multiple base tables – permits unrelated tables in the model as long as shared tables exist in the model, too. During analysis, fields from a shared table "stitch" together otherwise unrelated tables of data based on the shared dimensions they have in common (such as happening in the same place or at the same time). All the benefits of relationships are maintained, including the retention of each table's grain or native level of detail.
Similar to a single base table data model, Tableau determines the best join type to use behind the scenes based on the structure of the viz. But in a multi-fact relationship model, the join options are expanded to include outer and cross joins to handle different levels of relatedness. For more information, see About Multi-fact Relationship Data Models.
Where did the name come from?
Multi-fact relationships get their name from multi-fact analysis. In a data warehouse model, data is stored in a central fact table surrounded by dimension tables. In this context, fact refers to measurements or metrics, which are numeric fields of data that capture facts about the data – Tableau's measures. Dimension tables contain attributes about these facts.
Schemas based on fact tables are often structured as a star or snowflake, depending on how the dimension tables are organised. When analysis needs to be performed across fact tables, this is called multi-fact analysis. Analysis is done in the context of the common dimension tables, known as shared dimensions or conformed dimensions. In Tableau, you build these data models using relationships, so we've named this suite of capabilities multi-fact relationships.
When to use multi-fact relationship data models
If your data consists of tables that are all related to each other, you can stick with single base table data sources built with relationships. A multi-fact relationship model is called for when your data spans different concepts, either in the form of multiple fact tables or different unrelated contexts.
Whenever possible, build your data sources with a single base table. In a single base table data model, every table is related and there is no need to consider degrees of relatedness. Only use multi-fact relationships when that data model structure is called for.
Multi-fact analysis
Multi-fact analysis is a core use case for multi-fact relationships in Tableau. In this example, Fact A and Fact B share a table Date.
To model this in Tableau, the fact tables become base tables and multiple incoming relationships are established for their shared dimension table.
Other scenarios
Multi-fact relationship data models aren't just for multi-fact analysis, however. Tableau doesn't require a strict definition of fact or dimension tables. Any table can be a base table (although it should suit the characteristics of base tables). Some scenarios that indicate that a multiple base table data source might be helpful include:
- Moving through stages, such as base tables for applications, transcripts and alumni events for a shared student table.
- Different contexts for the same events, such as base tables for the events of medical appointments and billing invoices, with shared tables to set the context to doctors or patients.
- Different domains that may correlate, such as scenarios that would previously be best handled with data blending, like ice cream sales and weather correlated through the shared tables of date and location.
Learn more about when multi-fact relationships are useful in this Tableau blog post: When and How to Use Multi-fact Relationships in Tableau.
Identify the base tables
In a multi-fact relationship model, directionality matters. That is to say, which tables are the base tables along the left side of the model and which tables are shared downstream impacts how the relationships are evaluated to return the analytical results.
Consider a conceptual bowtie of invoices, appointments, doctors and patients:
The correct way to build the data model in Tableau is with Invoices and Appointments as the base tables, and with Doctors and Patients as the shared tables (not with Doctors and Patients as the base tables).
Correct: Invoices and Appointments as the base tables | Incorrect: Doctors and Patients as the base tables |
Conceptually, a patient (or doctor) is the entity that stitches together the event of an appointment and the event of an invoice.
If your data model is backwards (such as with Doctors and Patients as the base tables instead of Appointments and Invoices), the outer join stitching behaviour won't be as useful. Your analysis might show a lot of table scoped measures and ambiguity. If you find yourself with ambiguously related fields that you weren't expecting, re-evaluate the tables you are using as base tables and see if your data model needs to be reversed.
Characteristics of base tables and shared tables
If you're performing multi-fact analysis, the fact tables become the base tables and any shared dimension tables are shared tables. Tableau doesn't require a strict adherence to fact and dimension table characteristics. However, there are certain attributes that can help you identify which tables should be base tables and which should be shared tables.
Base table | Shared table |
Fact tables in a data warehouse schema | Shared or conformed dimension tables in a data warehouse schema |
Specific to the context or analysis (flight information, energy usage) | Consistent concept across various contexts (date, location) |
Measure heavy | Primarily dimensions |
More frequently updated/transactional (medical appointments, prescriptions, vitals) | More stable/durable (doctor, patient) |
Has foreign key fields | Has primary key fields |
Event based (class schedule, grade on an assignment) | Entity based (student, classroom) |
Note that if there are intermediate tables between a base table and a shared table, you can swap which one is the base table without fundamentally altering the data model. (Such as Parlour Info and Ice Cream Sales in the first example.) What matters is which tables are upstream of the shared tables and which are shared.
Try an additional base table instead
There are various scenarios that may indicate you should build a multi-fact relationship model with multiple base tables rather than a single base table data source:
- If you're trying to build a data source with a cycle, the downstream table should be another base table instead.
- If you have a series of tables that are related on the same sets of relationship clauses (such as date and location), those dimensions should be pulled out and made into shared tables instead.
- This is especially useful because multiple relationship clauses must all be true (logically, an AND) for the tables to be related for those records.
- If, instead, you want to analyse records where one may be true at a time (a contextual OR), this flexibility is provided by setting up a data model with shared dimension tables instead.
- If you're using a blend but want to have an equivalent blend without primary and secondary data sources, build a data model that combines the data sources from the blend with their linking fields in a shared table or tables.