Analysis with the Second Date in Tableau Desktop

This is the second stage of the tutorial and assumes the first stage, Finding the Second Date with Tableau Prep, has been completed.

In the first stage, we took our original data set and shaped it to answer the following questions:

  1. What was the length of time in days between the first and second infraction for each driver?

  2. Compare the fine amounts for the first and second infractions. Are they correlated?

  3. Which driver paid the most overall? Who paid the least?

  4. How many drivers had multiple infraction types?

  5. What was the average fine amount for drivers who never attended traffic school?

As we now explore these questions, it becomes clear that there are some pros and cons to the first data structure we created. We'll go back into Tableau Prep Builder and do some additional reshaping, then see how that impacts the same analysis in Tableau Desktop. Finally, we'll look at a Tableau Desktop-only approach to the analysis using Level of Detail (LOD) expressions with the original data.

The goal of this tutorial is to present various concepts in the context of a real-life scenario and work through options—not prescriptively establishing which is best. At the end, you should have a better sense of how data structure impacts calculations and analysis, as well as greater familiarity with various aspects of Tableau Prep and calculations in Tableau Desktop.

Note: To complete the tasks in this tutorial, you need Tableau Prep Builder and optionally Tableau Desktop installed and the data downloaded.

To install Tableau Prep and Tableau Desktop before continuing with this tutorial, see the Tableau Desktop and Tableau Prep Deployment guide(Link opens in a new window). Otherwise you can download the Tableau Prep(Link opens in a new window) and Tableau Desktop(Link opens in a new window) free trials.

The data set is the output from Driver Infractions.tflx, as built in the first stage.

Analysis in Tableau Desktop

Now that we have our data configured, we'll bring it into Tableau Desktop. We can easily answer some questions, but others involve a few (or a lot of) calculations. Try your hand at the questions below; you can expand each one for basic information about how to proceed if you get stuck.

Note: You can download the workbook Driver Infractions.twbx(Link opens in a new window) to look at the solutions in context. Remember that there may be alternative ways to interpret the analysis or pursue answers.

1. What was the length of time in days between the first and second infraction for each driver?
2. Compare the fine amounts for the first and second infractions. Are they correlated?
3. Which driver paid the most overall? Who paid the least?
4. How many drivers had multiple infraction types?
5. What was the average fine amount for drivers who never attended traffic school?

Go Further—Pivoted Data

While the data we've been working with is well structured to address questions specifically around first and second infractions, it isn't the standard structure recommended for use with Tableau Desktop. The more our analysis diverges from basic questions around the infraction dates, the more complicated our calculations become to combine the relevant information into useable form.

Usually, when data is stored with multiple columns for the same type of data (such as two columns for date, two columns for fine amount, etc.) and unique information is stored in the field name (such as whether it's the first or second infraction), this is an indication the data should be pivoted.

Performing a multiple pivot in Tableau Prep Builder can handle this nicely. We can work from the end of the Driver Infraction Tableau Prep flow created in the previous tutorial Finding the Second Date with Tableau Prep.

Tip: Make sure you're back in Tableau Prep for these next steps.

  1. From the final clean step, add a Pivot step that pivots by every duplicated field. Use the plus icon in the upper right corner of the Pivoted Fields area to add more Pivot Values. Each set of fields (such as 1st and 2nd Fine Amounts) should be pivoted together.

    For more information about pivoting, see Clean and Shape Data.

  2. In the Pivoted Fields area, under the Pivot1 Names column, double click each value and rename them to 1st and 2nd.

    Pivot 1 configuration pane in Tableau Prep showing the five pivots

The results can be tidied by removing null dates as well as renaming and reordering fields.

  1. Add a cleaning step after the pivot. In the Infraction Date column, right-click on the null bar and choose Exclude.

  2. Double-click the field name Pivot1 Names and rename it Infraction Number.

  3. Drag fields as appropriate to reorder them as below:

    Preview of the pivoted and cleaned up data

  1. From the new, pivoted data, create an output named Pivoted Driver Infractions and bring it into Tableau Desktop. (Don't forget to run the flow after adding the Output step.)

Now we can look at our five questions again with this pivoted data structure; you can expand each one for basic information about how to proceed if you get stuck.

Note: You can download the completed flow file Pivoted Driver Infractions.tflx to check your work, or download the workbook Pivoted Driver Infractions.twbx to look at the solutions in context. Remember that there may be alternative ways to interpret the analysis or pursue answers.

1. What was the length of time in days between the first and second infraction for each driver?
2. Compare the fine amounts for the first and second infractions. Are they correlated?
3. Which driver paid the most overall? Who paid the least?
4. How many drivers had multiple infraction types?
5. What was the average fine amount for drivers who never attended traffic school?

The benefits of pivoted data

We could stick with the original data structure from the tutorial if we know we'd only need to answer questions that are easy to answer with that structure. However, the pivoted data format is more flexible. Even though it requires some calculations, once they're in place the resulting data set is well suited to answer broader questions.

Go Further Still—Calculations Only

What if you don't have access to Tableau Prep Builder? Are you out of luck entirely if you're stuck with the original data? Not at all!

Tableau Desktop and LOD expressions can answer all of our analytical questions. If we connect to the original Traffic Violations.xlsx(Link opens in a new window), it looks very similar to the pivoted data set—just without the crucial Infraction Number field. We'll need to mimic the outcome of the aggregation steps via LOD expressions.

Note: You can download the workbook LOD Driver Infractions.twbx(Link opens in a new window) to look at the solutions in context. Remember that there may be alternative ways to interpret the analysis or pursue answers.

1. What was the length of time in days between the first and second infraction for each driver?
2. Compare the fine amounts for the first and second infractions. Are they correlated?
3. Which driver paid the most overall? Who paid the least?
4. How many drivers had multiple infraction types?
5. What was the average fine amount for drivers who never attended traffic school?

It's important to remember that this solution has a lot of nested calculations and LOD expressions. Depending on the size of the data set and the complexity of the data, performance could be an issue.

Reflection on Methods

So which route should you go? That's entirely up to you and the tools at your disposal.

  • If you want to steer clear of LODs, there's a data-shaping solution, though calculations might be necessary for some analysis (Analysis in Tableau Desktop).

  • If you can shape the data and are comfortable with calculations—including LODs—the middle-of-the-road option provides the best flexibility (Go Further—Pivoted Data).

  • If you're comfortable with LODs, there's minimal impact on performance, and/or you don't have access to Tableau Prep, solving this with LODs alone is a viable option (Go Further Still—Calculations Only).

At the very least, it's valuable to understand how aggregation in Tableau Prep and Level of Detail expressions in Tableau Desktop are interrelated and impact data analysis. As with most things in Tableau, there's more than one way to do anything. Exploring all the various options can help bring concepts together and let you pick the best solution for you.

Calculations used:

Note: Special Thanks to Ann Jackson's Workout Wednesday topic Do Customers Spend More on Their First or Second Purchase?(Link opens in a new window) and Andy Kriebel's Tableau Prep Tip Returning the First and Second Purchase Dates(Link opens in a new window) that provided the initial inspiration for this tutorial. Clicking these links will take you away from the Tableau website. Tableau cannot take responsibility for the accuracy or freshness of pages maintained by external providers. Contact the owners if you have questions regarding their content.