Analyzing Data with Power BI and Power Pivot for Excel (Alberto Ferrari, Marco Russo)

Chapter 4. Working with date and time

In business models, you typically compute year-to-date (YTD), year-over-year comparisons, and percentage of growth. In scientific models, you might need to compute forecasts based on previous data or check the accuracy of numbers over time. Nearly all these models contain some calculations related to time, which is why we have dedicated a full chapter to these kinds of calculations.

In more technical terms, we say time is a dimension, meaning you typically use a Calendar table to slice your data by year, month, or day. Time is not just a dimension, however. It is a very special dimension that you need to create in the right way and for which there are some special considerations.

This chapter shows several scenarios and provides a data model for each one. Some examples are very simple, whereas others require very complex DAX code to be solved. Our goal is to show you examples of data models and to give you a better idea of how to correctly model date and time.

Creating a date dimension

Time is a dimension. A simple column in your fact table containing the date of the event is not enough. If, for example, you need to use the model shown in Figure 4-1 to create a report, you will quickly discover that the date alone is not enough to produce useful reports.

FIGURE 4-1 The Sales table contains the Order Date column with the date of the order.

By using the date in Sales, you can slice values by individual dates. However, if you need to aggregate them by year or by month, then you need additional columns. You can easily address the issue by creating a set of calculated columns directly in the fact table (although this is not an optimal solution because it prevents you from using time-intelligence functions). For example, you can use the following simple formulas to create three columns, Year, Month Name, and Month Number:


Sales[Year] = YEAR ( Sales[Order Date] )
Sales[Month] = FORMAT ( Sales[Order Date], "mmmm" )
Sales[MonthNumber] = MONTH ( Sales[Order Date] )

Obviously, the month numbers are useful for sorting the month names in the correct way. When you include them, you can use the Sort by Column feature that is available in both Power BI Desktop and the Excel data model. As shown in Figure 4-2, these columns work perfectly fine to create reports that slice values, like the sales amount, by time.

FIGURE 4-2 The report shows sales sliced by date, by using calculated columns in the fact table.

However, there are a couple of issues with this model. For example, if you need to slice purchases by date, you end up repeating the same setup of the calculated columns for the Purchases table. Because the columns belong to the fact tables, you cannot use the years in Sales to slice Purchases. As you might recall from Chapter 3, “Using multiple fact tables,” you need a dimension to correctly slice two fact tables at once. Moreover, you typically have many columns in a date dimension, such as columns for fiscal years and months, holiday information, and working days. Storing all these columns in a single, easily manageable table is a great plus.
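As a rough illustration of such a table, the following sketch builds a date dimension as a calculated table. It assumes Power BI Desktop, where calculated tables are available. The table name Date, the extra column names, the July 1 fiscal-year start, and the Monday-to-Friday working-day rule are all illustrative assumptions, not the model used in the figures of this chapter.

```dax
Date =
-- Hypothetical calculated table; names and business rules are assumptions
ADDCOLUMNS (
    -- One row per day, covering full calendar years around the order dates
    CALENDAR (
        DATE ( YEAR ( MIN ( Sales[Order Date] ) ), 1, 1 ),
        DATE ( YEAR ( MAX ( Sales[Order Date] ) ), 12, 31 )
    ),
    "Year", YEAR ( [Date] ),
    "Month", FORMAT ( [Date], "mmmm" ),
    "MonthNumber", MONTH ( [Date] ),
    -- Assumes the fiscal year starts on July 1 and is named after the year in which it ends
    "Fiscal Year", IF ( MONTH ( [Date] ) >= 7, YEAR ( [Date] ) + 1, YEAR ( [Date] ) ),
    -- Assumes Monday through Friday are working days; holidays are ignored
    "Working Day", WEEKDAY ( [Date], 2 ) <= 5
)
```

For a table like this to slice Sales, it would still need a relationship between its Date column and Sales[Order Date]. A second relationship to the date column of Purchases would let the same years and months slice both fact tables.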

There is another, more important reason to use dimensions. Using columns in the fact table makes the coding of time-intelligence calculations much more complex, whereas using a date dimension makes all these formulas much easier to write.

Let us elaborate on this concept with an example. Suppose you want to compute the YTD value of Sales Amount. If you can rely only on columns in the fact table, the formula becomes quite complicated, as in the following:


Sales YTD :=
VAR CurrentYear = MAX ( Sales[Year] )
VAR CurrentDate = MAX ( Sales[Order Date] )
RETURN
    CALCULATE (
        [Sales Amount],
        Sales[Order Date] <= CurrentDate,
        Sales[Year] = CurrentYear,
        ALL ( Sales[Month] ),
        ALL ( Sales[MonthNumber] )
    )

Specifically, the code needs to do the following:

1. Apply a filter on the date, keeping only the dates on or before the last visible date.

2. Keep a filter on the year, taking care to show only the last visible one in case there are multiple years in the filter context.

3. Remove any filter from the month (in Sales).

4. Remove any filter from the month number (again, in Sales).

Note

If you are not familiar with DAX, gaining a deep understanding of why this formula works is a great mental exercise to become more familiar with the way filter context and variables work together.
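One way to approach that exercise is to write out the boolean filter arguments in their expanded form. In DAX, a boolean condition on a single column used as a CALCULATE filter argument is shorthand for a FILTER over ALL of that column, so the Sales YTD measure should behave like the following sketch. The measure name Sales YTD Expanded is hypothetical and is used here only for illustration.

```dax
Sales YTD Expanded :=
-- Both variables are evaluated in the original filter context of the cell
VAR CurrentYear = MAX ( Sales[Year] )
VAR CurrentDate = MAX ( Sales[Order Date] )
RETURN
    CALCULATE (
        [Sales Amount],
        -- Expanded form of: Sales[Order Date] <= CurrentDate
        FILTER ( ALL ( Sales[Order Date] ), Sales[Order Date] <= CurrentDate ),
        -- Expanded form of: Sales[Year] = CurrentYear
        FILTER ( ALL ( Sales[Year] ), Sales[Year] = CurrentYear ),
        -- Clear the filters on the two month columns
        ALL ( Sales[Month] ),
        ALL ( Sales[MonthNumber] )
    )
```

Because the variables are computed before CALCULATE changes the filter context, CurrentYear and CurrentDate hold the last visible year and date of the current cell. The two FILTER arguments then replace any existing filter on Sales[Order Date] and Sales[Year], while the two ALL arguments remove the filters on the month columns.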

This code works just fine, as shown in Figure 4-3. However, it is unnecessarily complex. The biggest problem with the formula is that you cannot leverage the built-in DAX functions that are designed to help you author time-intelligence calculations. In fact, those functions rely on the presence of a specific table dedicated to dates.

FIGURE 4-3 Sales YTD reports the correct values, but its code is too complicated.

If you update the data model by adding a date dimension, like the one in Figure 4-4, the formula becomes much easier to author.

FIGURE 4-4 Adding a date dimension to the model makes the code much easier to write.

At this point, you can use the predefined time-intelligence functions to author Sales YTD in the following way:
