Learning Tableau 2020
上QQ阅读APP看书,第一时间看更新

Visualizing dates and times

In your analysis, you will often want to understand when something happened. You'll ask questions like the following:

  • When did we gain the most new customers?
  • Is profit trending up or down?
  • What times of day have the highest call volume?
  • What kinds of seasonal trends do we see in sales?

Fortunately, Tableau makes this kind of visual discovery and analysis easy. In this section, we'll look at how Tableau works with dates and some various ways you might visualize time.

Date parts, date values, and exact dates

When you are connected to a flat file, relational, or extracted data source, Tableau provides a robust built-in date hierarchy for any date field.

Cubes/OLAP connections do not allow Tableau hierarchies. You will want to ensure that all date hierarchies and date values you need are defined in the cube.

To see this in action, continue with the Chapter 3 workbook, navigate to the Built-in Date Hierarchy sheet, and create a view similar to the one that was shown by dragging and dropping Number of Patient Visits to Rows and Date of Admit to Columns. The YEAR(Date of Admit) field on Columns will have a plus sign indicator, like this:

Figure 3.20: The field representing the Year part of the date hierarchy

You'll also find a plus or minus indicator as you hover over headers, like this:

Figure 3.21: A plus icon on the column headers that could be used to expand the hierarchy

When you click it, the hierarchy expands by adding QUARTER(Date of Admit) to the right of the YEAR(Date of Admit) on Columns, and the view is expanded to the new level of the hierarchy:

Figure 3.22: The expanded hierarchy with the year and quarter shown

The YEAR(Date of Admit) field now has a minus sign indicator that allows you to collapse the hierarchy back to the year level. The QUARTER field also has a plus sign, indicating that you can expand the hierarchy further. Starting with Year, the hierarchy flows as follows: Year | Quarter | Month | Day. When the field is a date and time, you can further drill down into Hour | Minute | Second. Any of the parts of the hierarchy can be moved within the view or removed from the view completely.

The hierarchy is made up of Date Parts, which is one of the three ways a date field can be used. When you right-click the date field in the view or by using the drop-down menu, you'll see multiple date options, as follows:

Figure 3.23: The drop-down menu on an active date field demonstrates the various aspects of dates in Tableau

The three major date types are evident, though not explicitly labeled, in the menu:

  • Date part: This field will represent a specific part of the date, such as the quarter or month. The part of the date is used by itself and without reference to any other part of the date. This means that a date of November 8, 1980, when used as a month date part, is simply November in the view. The November that's selected in the view here represents all of the Novembers in the dataset, while the number of patient visits is the total for both 2018 and 2019:

    Figure 3.24: This view uses Month as a date part. The number of patient visits is the total for the month, without regard to the year

  • Date value: This field will represent a date value, but rolled up or truncated to the level you select. For example, if you select a date value of Month, then November 8, 2019 gets truncated to the month and year, and is November 2019. You'll notice that November 2018 and November 2019 each have a separate value in the header and a distinct bar:

    Figure 3.25: This view uses Month as a date value. The number of patient visits is the total for the month with regard to the year

  • Exact date: This field represents the exact date value (including time, if applicable) in the data. This means that November 8, 1980, 2:01 am is treated as distinct from November 8, 1980, 3:08 pm.

It is important to note that nearly any of these options can be used as discrete or continuous fields. Date parts are discrete by default. Date values and exact dates are continuous by default. However, you can switch between discrete and continuous as required to allow flexibility in the visualization.

For example, you must have an axis (requiring a continuous field) to create a reference line. Also, Tableau will only connect lines at the lowest level of row or column headers. Using a continuous date value instead of multiple discrete date parts will allow you to connect lines across multiple years, quarters, and months.

As a shortcut, you can right-click and then drag and drop a date field into the view to get a menu of options for how the date field should be used prior to the view being drawn.

Let's next consider some various ways we might visualize dates and times.

Variations of date and time visualizations

The ability to use various parts and values of dates and even mix and match them gives you a lot of flexibility in creating unique and useful visualizations.

For example, using the month date part for columns and the year date part for color gives a time series that makes a visual year-over-year comparison quite easy. The year date part has been copied to the label so that the lines can be labeled:

Figure 3.26: The comparison of two years, month-by-month

This kind of view allows for easy year-over-year comparison.

Clicking on any of the shelves on the Marks card will give you a menu of options. Here, Label has been clicked, and the label was adjusted to show only at the end of each line.

The following heat map is another example of using date parts on different shelves to achieve useful analysis. This kind of visualization can be quite useful when looking at patterns across different parts of time, such as hours in a day, or weeks in a month. Here, we are looking at how many patients were admitted by month and day:

Figure 3.27: A heat map showing the intensity of patient visits by day and month

The year has not been included in the view, so this is an analysis of all years in the data and allows us to see whether there are any seasonal patterns or hotspots. We might notice patterns related to epidemics, doctors' schedules, or the timing of insurance benefits. Perhaps the increased intensity of patient admissions in February corresponds to the flu season.

Observe that placing a continuous field on the Color shelf resulted in Tableau completely filling each intersection of Row and Column with the shade of color that encoded the sum of patient visits. Clicking on the Color shelf gives us some fine-tuning options, including the option to add borders to marks. In this view, a black border has been added to help distinguish each cell.

Gantt charts

Gantt charts can be incredibly useful for understanding any series of events with a duration, especially if those events have some kind of relationship. Visually, they are very useful for determining whether certain events overlap, have dependency, or take more or less time than other events.

As an example (not included in the workbook), the following Gantt chart shows a series of processes that run when an application is started. Some of these processes run in parallel, and some are clearly dependent on others. The Gantt chart makes these dependencies clear:

Figure 3.28: A Gantt chart showing the time each process started and how long each took

Gantt charts use the Gantt mark type on the Marks card drop-down. A Gantt bar mark starts at the value that was specified by the field on Rows that defines the axis. The length of the Gantt bar is then determined by the field on the size card, with positive values stretching to the right and negative values to the left.

At the hospital, you might want to see each patient visit to the ER in 2019 and understand how long each visit lasted, whether any patients returned to the hospital, and how much time there was between visits. The following steps give an example of how you might create a Gantt chart with steps like these:

  1. Place Department on Filters and keep only ER.
  2. Place Date of Admit on Filters, select Years as the option for filtering, and keep only 2019.
  3. Place Date of Admit on Columns as a continuous Exact Date or as a Day value (not Day part). Notice that Tableau's automatic default for the mark type is Gantt bars:

    Figure 3.29: In this case, Gantt bars are the automatic mark type.

  4. Place Doctor and Patient Name on Rows. The result is a row for each patient grouped by each doctor. A Gantt bar represents a stay in the hospital.

    In most cases, we'd also want to add a unique identifier to the view, such as Patient ID, to ensure that patients who happen to share the same name are distinguished in the visualization. This is not necessary with this dataset, as all names happen to be unique, but it may be vitally important when you work with your data.

  5. The length of the Gantt bar is set by placing a field with a value of duration on the Size shelf. There is no such field in this dataset. However, we have the Date of Discharge, and we can create a calculated field for the duration. We'll cover calculations in more detail in the next chapter. For now, select Analysis from the menu and click Create Calculated Field.... Name the field Days in the Hospital and enter the following code:
    DATEDIFF('day', [Date of Admit], [Date of Discharge]) 
    
  1. The new calculated field will appear under Measures in the data pane. Drag and drop the field onto the Size shelf. You now have a Gantt chart showing when patients were admitted and how long each visit lasted.

Consider sorting the Patient Name dimension in the view. For example, sorting by field and selecting Date of Admit as minimum would enable you to see patients who were admitted earlier towards the top and patients who were admitted later towards the bottom. It is based on the earliest (minimum) date of admission for the patient, even if they were admitted multiple times. Sorting can be a very useful technique for seeing patterns in the Gantt chart.

Your final view should look something like this:

Figure 3.30: The final Gantt chart, showing each patient, when they were admitted, how long they stayed, and whether they ever returned

This type of chart can be very useful in seeing patterns and relationships between entities over time.

When plotted on a date axis, the field defining the length of Gantt bars always needs to be in terms of days. If you want to visualize events with durations that are measured in hours or seconds, avoid using the day argument for DATEDIFF because it computes whole days and loses precision in terms of hours and seconds.

Instead, calculate the difference in hours or seconds and then convert back to days. The following code converts the number of seconds between a start and end date, and then divides by 86,400 to convert the result into days, including fractional parts of the day: DATEDIFF('second', [Start Date], [End Date]) / 86400.

With a good understanding of how Tableau works with dates and times, we've considered some different options for visualization. Let's turn next to focus on how to visualize parts-to-whole relationships.