Don’t Be Scared of Relationships

The new data modeling capabilities create flexible data sources built around relationships.

Relationships combine data from different tables by matching the columns (fields) those tables have in common, then using those matches to bring data from each table together in the analysis.

Unlike joins or unions, relationships form a data source without flattening multiple tables into a single table. Because of this, related data sources know which table each field is from. That means each field keeps its context, or level of detail. Related data sources can therefore handle tables with different granularity without issues of duplication or data loss.
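To make the duplication issue concrete, here is a minimal Python sketch of what flattening does to a measure stored at a coarser level of detail. The tables, field names, and values are hypothetical and are not taken from the Bookshop data set. The sketch only illustrates the join side of the comparison and does not model how Tableau builds its queries.

```python
# Hypothetical data: one row per book, and one row per edition (finer grain).
books = [
    {"book_id": 1, "title": "Book A", "awards": 2},
    {"book_id": 2, "title": "Book B", "awards": 1},
]
editions = [
    {"book_id": 1, "format": "Hardcover"},
    {"book_id": 1, "format": "Paperback"},
    {"book_id": 2, "format": "Paperback"},
]

# Flattening with a join repeats each book row once per matching edition.
joined = [
    {**book, **edition}
    for book in books
    for edition in editions
    if book["book_id"] == edition["book_id"]
]

# Summing the book-level measure over the flattened rows double counts Book A.
joined_total = sum(row["awards"] for row in joined)

# Summing at the measure's native level of detail (one row per book) does not.
native_total = sum(book["awards"] for book in books)

print(len(joined), joined_total, native_total)
```

The flattened table has three rows, so the award count for the book with two editions is counted twice. Keeping the measure attached to its own table avoids that.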

In a related data source, the joins aren’t fixed up front. Instead of merging all the data (and having to work with all the data regardless of what each viz requires), only the relevant data is combined—per sheet and as necessary. As you drag and drop, Tableau evaluates the relationships of the relevant fields and tables. Those relationships are used to write queries with the correct join types, aggregations, and null handling.

You can think about how the data fits together and what questions you want to answer, rather than how to combine the data or compensate for artifacts from the data source.

Relationships don’t replace the previous ways of combining data: joins, unions, and blends. Rather, relationships are the new, flexible way to bring data together from multiple sources. The existing methods aren’t going away, and are still useful in specific scenarios.

Note: For related information on how relationship queries work, see these Tableau blog posts:

Also see video podcasts on relationships from Action Analytics, such as Why did Tableau Invent Relationships? Click "Video Podcast" in the Library to see more.

Use Relationships

This topic walks through building a related data source and using one for analysis. If you would like to follow along, you can download the Bookshop data set. Choose Bookshop.xlsx for the raw data to start from scratch, or MinimalBookshop.tdsx to start with the related data source basics configured for you.

Note: Relationships are available in Tableau 2020.2 and later.

Video: Migrated Data

If you open an existing workbook from a previous version of Tableau (version 2020.1 or earlier) in Tableau 2020.2 or later, your data source tab might show a Migrated Data table.

There are multiple videos embedded in this topic. For your convenience, a transcript is provided in the expandable section underneath each video.

Migrated Data video transcript

Video: Performance Options

Note: The interface for editing relationships shown in this video differs slightly from the current release but has the same functionality.

Performance Options video transcript

For more information, see Cardinality and Referential Integrity.

Example: Bookshop Data

We’ll do some analysis with this data source in a moment, so let’s talk about the data.

The data set is about (fictitious) books, and it’s important to consider the distinction between a book and an edition. A book is the conceptual work itself, with attributes such as the title, author, and genre. There are also editions of a book, with attributes such as a price and format (hardcover or paperback), that are identified with an ISBN. An edition of a book has a publisher and a page count, whereas a book may have won an award or be in a series.

You can follow along by downloading MinimalBookshop.tdsx, or be adventurous and build the model yourself from Bookshop.xlsx. Note that you can hide many of the ID fields once the tables are combined.

A database schema for book data.

Video: Work with multiple tables

Note: The interface for editing relationships shown in this video differs slightly from the current release but has the same functionality.

Work with multiple tables video transcript

Video: Basic calculations

Basic calculations video transcript

Video: Sets and groups

Sets and groups video transcript

Let’s do a comparison of how analysis looks between relationships and joins. We recommend that you download the companion workbook rather than continue with your own data source from above.

Question: For authors with books in a series, who has the most book tour events?

Video: Work with different levels of details

Video transcript

Let’s look at which authors we’ll be working with. Using the related data source, we’ll bring Author Name and Series Name to Rows.

Because related data sources pare down unmatched data in tables that only contain dimensions, we’re focusing on the data that is relevant to us. We can see that there are seven authors, two of whom have written for two series. If you don’t want to see only matched values, you can restore those unmatched values (which is more like the legacy behavior) by going to the Analysis menu > Table Layout and checking Show Empty Rows.
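The difference between showing only matched values and restoring unmatched ones is roughly analogous to the difference between an inner and a left outer combination of two lists. The Python sketch below uses hypothetical author and series names to show that analogy. It is not a description of how the Show Empty Rows option is implemented.

```python
# Hypothetical data: three authors, only two of whom have written for a series.
authors = ["Author A", "Author B", "Author C"]
author_series = [
    ("Author A", "Series X"),
    ("Author A", "Series Y"),
    ("Author C", "Series X"),
]

# Matched values only: authors without a series drop out of the result.
matched_rows = [
    (author, series)
    for author in authors
    for (name, series) in author_series
    if name == author
]

# Unmatched values restored: every author is kept, with None where no series exists.
all_rows = []
for author in authors:
    series_for_author = [s for (name, s) in author_series if name == author]
    for series in series_for_author or [None]:
        all_rows.append((author, series))

print(matched_rows)
print(all_rows)
```

In this sketch, Author B appears only in the second result, paired with an empty series value.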

Joined

Using the related data source:

Bring the set In a series from the Book table to the filter shelf. The default is to filter only to members in the set.
Bring Author Name to Rows.
Bring Book Tour Events to Columns.

Using the joined data source:

Bring the set In a series to the filter shelf.
Bring Author Name to Rows.
Bring Book Tour Events to Columns.

Our numbers here look a bit off.

For the joined data, we know there’s duplication because of the join, and we also know some authors span series. Because of this, we can’t just change the aggregation to, say, MIN or MAX, because we lose information for authors with works in multiple series.

What we really want is the number of events per series, visualized by author. This is a classic case for Level of Detail (LOD) expressions. We’ll create a calculation Series Events:

{FIXED [Series Name] : MIN ([Book Tour Events])}

Note that the MIN is to handle duplication of events for a single series.

Now if we bring this new field to Columns instead of the original events field, we’ll get the correct values.
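The logic of the Series Events calculation can be sketched outside Tableau. The Python below uses hypothetical joined rows in which a series-level event count repeats on every book row. It first shows the inflated per-author sum, then mimics the FIXED expression by taking one MIN value per series and counting each series once per author. The names and numbers are illustrative, and the once-per-author roll-up is an assumption about the intended result rather than a trace of the query Tableau generates.

```python
from collections import defaultdict

# Hypothetical joined rows: the series-level event count repeats on every book row.
rows = [
    {"author": "Author A", "series": "Series X", "book": "X1", "events": 10},
    {"author": "Author A", "series": "Series X", "book": "X2", "events": 10},
    {"author": "Author A", "series": "Series Y", "book": "Y1", "events": 4},
    {"author": "Author B", "series": "Series Z", "book": "Z1", "events": 7},
    {"author": "Author B", "series": "Series Z", "book": "Z2", "events": 7},
]

# Naive SUM per author over the joined rows: inflated by the duplication.
naive_sum = defaultdict(int)
for row in rows:
    naive_sum[row["author"]] += row["events"]

# Equivalent of {FIXED [Series Name] : MIN([Book Tour Events])}: one value per series.
series_events = {}
for row in rows:
    name = row["series"]
    series_events[name] = min(series_events.get(name, row["events"]), row["events"])

# Collect the distinct series each author has written for.
author_series = defaultdict(set)
for row in rows:
    author_series[row["author"]].add(row["series"])

# Roll the per-series values up to each author, counting each series once.
events_by_author = {
    author: sum(series_events[name] for name in names)
    for author, names in author_series.items()
}

print(dict(naive_sum))
print(events_by_author)
```

MIN works here only because every duplicated row for a series carries the same event count, so any of MIN, MAX, or AVG would pick out that single value.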

For the related data, we didn’t need to do any of that. Relationships understand each table’s native level of detail and how the Author table relates through the Book and Info tables to the Series table, so the Events measure is joined and aggregated correctly back to Author Name, all without having to write LOD calculations.

So don’t be scared of relating your data. See for yourself! You can download the Bookshop data or use some of your own. Try filtering, using table calculations, building a variety of chart types, configuring performance options, and pushing relationships as far as you can.

View Underlying Data often to verify what data a mark represents.
If you aren’t required to join—and there are reasons you may need to—using a relationship provides more flexibility.
If you don’t want to see only matched values, you can restore those unmatched values by going to the Analysis menu > Table Layout > Show Empty Rows.

Ready to tackle calculations with Relationships? Check out Don’t Be Scared of Calculations in Relationships.

Ready to keep exploring how to do complex analysis with Relationships? Check out Don’t Be Scared of Deeper Relationships.

For more information on the technical underpinnings of relationships straight from the Product Management team, check out the series on relationships on the Tableau Blog.

Tableau Desktop and Web Authoring Help