Analyzing Data with Power BI and Power Pivot for Excel (Alberto Ferrari, Marco Russo)

and used the right way, then handling different calendars is very simple and can be achieved by simply updating your Calendar table.

If you let Power BI Desktop or Excel handle the creation of time-intelligence columns for you, then you cannot adopt this simple technique. You will be on your own in trying to figure out how to write the correct formulas.

Computing with working days

Not all days are working days. Often, you need to perform calculations that take into account this difference. For example, you might want to compute the difference between two dates expressed in working days, or you might want to compute the number of working days in a given period. In this section, we discuss the options for handling working days from the data-modeling point of view.

The first (and most important) consideration is whether a day is always a working day, or if this information might depend on other factors. For example, if you work with different countries, then it is very likely that a given day could be a working day in one country or region and a holiday in another. Thus, a day might be a working day or not, depending on the country or region. As you will see, you need more complex models for holidays depending on the country or region. It is better to start by looking at the simpler model, which is the one with holidays for a single country or region.

Working days in a single country or region

We will start with a simple data model that includes Date, Product, and Sales tables, although we will focus on Date. Our starting Date table looks like the one shown in Figure 4-23.

FIGURE 4-23 The starting point of a working-days analysis is a simple Date table.

The table contains no information about whether a day is a working day or not. For this example, let us presume there are two kinds of non-working days: weekends and holidays. If, in your country, the weekend is on Saturday and Sunday, then you can easily create a calculated column that tells you whether a day is a weekend or not, like in the following code. If the weekend is on different days, then you will need to change the following formula to make it work in your specific scenario:

'Date'[IsWorkingDay] =
INT (
    AND (
        'Date'[Day of Week Number] <> 1,
        'Date'[Day of Week Number] <> 7
    )
)
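As a minimal sketch of the adaptation mentioned above, the following variation treats Friday and Saturday as the weekend. It assumes that Day of Week Number runs from 1 (Sunday) to 7 (Saturday), so that Friday is 6; this numbering is an assumption, so check it against your own Date table before reusing the values.

```dax
-- Sketch: weekend on Friday and Saturday.
-- Assumes 'Date'[Day of Week Number] uses 1 = Sunday ... 7 = Saturday.
'Date'[IsWorkingDay] =
INT (
    NOT ( 'Date'[Day of Week Number] IN { 6, 7 } )
)
```

Listing the weekend days in a single set keeps the formula short when more than two days must be excluded.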

We converted the Boolean condition to an integer to make it easier to sum its value and count the number of working days. In fact, the number of working days in a period is easy to obtain with a measure like the following one:

NumOfWorkingDays := SUM ( 'Date'[IsWorkingDay] )

This measure already returns the number of non-weekend days in any selected period, as shown in Figure 4-24.

FIGURE 4-24 NumOfWorkingDays computes the number of working days for any period selected.
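The same column can also answer the other question raised at the start of this section: the difference between two dates expressed in working days. The following is only a sketch. It assumes the Sales table contains two date columns, here called Order Date and Delivery Date (hypothetical names), and it counts both the start and the end date.

```dax
-- Hypothetical calculated column in Sales.
-- Sales[Order Date] and Sales[Delivery Date] are assumed column names.
Sales[WorkingDaysToDeliver] =
VAR OrderDate = Sales[Order Date]
VAR DeliveryDate = Sales[Delivery Date]
RETURN
    SUMX (
        FILTER (
            ALL ( 'Date' ),
            'Date'[Date] >= OrderDate
                && 'Date'[Date] <= DeliveryDate
        ),
        'Date'[IsWorkingDay]    -- 1 for working days, 0 otherwise
    )
```

Because the result is stored in a calculated column, it is computed once at refresh time and can then be averaged or summed by any measure.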

So far, we have accounted for Saturdays and Sundays. There are also holidays to take into account, however. For this example, we gathered the list of US federal holidays in 2009 from www.timeanddate.com. We then used the query editor in Power BI Desktop to generate the table shown in Figure 4-25.

FIGURE 4-25 The Holidays table shows a list of US federal holidays.

At this point, you have two options, depending on whether the Date column in the Holidays table is a key. If so, you can create a relationship between Date and Holidays to generate a model like the one shown in Figure 4-26.

FIGURE 4-26 Holidays can be related to the model easily if Date is a primary key.
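If you are not sure whether the Date column in Holidays is a key, you can verify it before creating the relationship. The following measure is a sketch (the measure name is hypothetical) that compares the number of rows with the number of distinct dates.

```dax
-- Returns TRUE when every date appears at most once in Holidays,
-- that is, when Holidays[Date] can act as a key.
HolidayDateIsKey :=
COUNTROWS ( Holidays ) = DISTINCTCOUNT ( Holidays[Date] )
```

If the measure returns FALSE, at least one date is repeated.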

After the relationship is set, you can modify the code for the IsWorkingDay calculated column to add a further check. With this check, a given day is a working day if it is neither Saturday nor Sunday and it does not appear in the Holidays table. Observe the following code:

'Date'[IsWorkingDay] =
INT (
    AND (
        AND (
            'Date'[Day of Week Number] <> 1,
            'Date'[Day of Week Number] <> 7
        ),
        ISBLANK ( RELATED ( Holidays[Date] ) )
    )
)

This model is very similar to a star schema. Strictly speaking, it is a snowflake, but because both the Date and the Holidays tables are small, performance is totally fine.

Sometimes, the Date column in the Holidays table is not a key. For example, if multiple holidays fall on the same day, you will have multiple rows in Holidays with the same date. In such cases, you must change the relationship to a one-to-many, with Date as the target and Holidays as the source (remember, Date is definitely a primary key in the Date table), and change the code as follows:

'Date'[IsWorkingDay] =
INT (
    AND (
        AND (
            'Date'[Day of Week Number] <> 1,
            'Date'[Day of Week Number] <> 7
        ),
        ISEMPTY ( RELATEDTABLE ( Holidays ) )
    )
)

The only line changed is the one that checks whether the date appears in the Holidays table. Instead of using the faster RELATED, you use RELATEDTABLE and verify its emptiness. Because we are working with a calculated column, the small degradation in performance is completely acceptable.

Working with multiple countries or regions

As you have learned, modeling holidays when you only need to manage a single country is pretty straightforward. Things become more complex if you need to handle holidays in different countries. This is because you can no longer rely on calculated columns. In fact, depending on the country selection, you might have different values for the IsHoliday column.

If you only have a couple of countries to handle, then the simplest solution is to create two columns for IsHoliday—for example, IsHolidayChina and IsHolidayUnitedStates—and then use the correct column for various measures. If you are dealing with more than two countries, however, then this technique is no longer viable. Let us examine the scenario in its full complexity. Note that the Holidays table has different content from before, as shown in Figure 4-27. Specifically, the Holidays table contains a new column that indicates the country or region where the holiday is defined: CountryRegion. The date is no longer a key because the same date can be a holiday in different countries.

FIGURE 4-27 This Holidays table contains holidays in different countries.
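For the simple case with only a couple of countries, one of the per-country columns could be sketched as follows. The sketch assumes that Holidays is related to Date with Date on the one side, and that the CountryRegion column stores the literal text "United States"; both are assumptions to verify against your own data.

```dax
-- Sketch of a per-country flag in the Date table.
-- The string "United States" is an assumed value of Holidays[CountryRegion].
'Date'[IsHolidayUnitedStates] =
INT (
    NOT ISEMPTY (
        FILTER (
            RELATEDTABLE ( Holidays ),
            Holidays[CountryRegion] = "United States"
        )
    )
)
```

A second column with a different country name would cover the other country, which is exactly why this approach stops being practical as the number of countries grows.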

The data model is a slight variation of the previous model, as shown in Figure 4-28. The main difference is that the relationship between Date and Holidays is now in the opposite direction.

FIGURE 4-28 The data model with different countries looks similar to the model with a single country.

The problem with multiple countries is that you need to better understand the meaning of the numbers to produce. The simple question of “how many working days are in January?” no longer has a clear meaning. In fact, unless you specify a country, the number of working days cannot be computed anymore.

To better understand the issue, consider Figure 4-29. The measure in the report is just a COUNTROWS of the Holidays table, so it computes the number of holidays in each country.

FIGURE 4-29 The figure shows the number of holidays per country and month.

The numbers are correct for each given country, but at the total-per-month level, they are just a sum of the individual cells; the total does not consider that one day might be a holiday in one country and not a holiday in another. In February, for example, there is a single holiday in the United States, but no holidays in either China or Germany. Thus, what is the total number of holidays in February? The question, posed in this way, makes little sense if you are interested in comparing holidays with working days, for example. In fact, the cumulative total number of holidays for all countries isn’t helpful at all. The answer strongly depends on the country you are analyzing.
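To make the issue concrete, here is a sketch of the measure described above, next to a variation that counts distinct dates (both measure names are hypothetical). Neither total answers a working-days question by itself: the first counts the same date once per country, while the second counts each date that is a holiday in at least one of the selected countries.

```dax
-- Rows in Holidays: a date shared by two countries counts twice.
NumOfHolidays := COUNTROWS ( Holidays )

-- Distinct dates that are a holiday in at least one selected country.
NumOfHolidayDates := DISTINCTCOUNT ( Holidays[Date] )
```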

At this point in the definition of the model, you need to better clarify the meaning of whether a day is a working day or not. Before your computation, you can check whether a single country has been selected in the report by using the IF ( HASONEVALUE () ) pattern of DAX.
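In its simplest form, the pattern could look like the following sketch, which returns a number only when exactly one country or region is visible in the current filter context (the measure name is hypothetical).

```dax
NumOfHolidaysSingleCountry :=
IF (
    HASONEVALUE ( Holidays[CountryRegion] ),
    COUNTROWS ( Holidays )
    -- no third argument: the measure returns BLANK otherwise
)
```

Returning BLANK instead of a number makes the cell appear empty in the report, so totals across several countries are not shown.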

There is another point to observe before reaching the final formula. You might want to compute the number of working days by subtracting the number of holidays (retrieved from the Holidays table) from the total number of days. In doing so, however, you are not taking into account Saturdays and Sundays. Moreover, if a holiday happens to be on a weekend, then you do not need to take it into account, either. You can solve this problem by using the bidirectional filtering pattern and counting the dates that are neither Saturday nor Sunday and that do not appear in the Holidays table. Thus, the formula would be as follows:

NumOfWorkingDays :=
IF (
    OR (
        HASONEVALUE ( Holidays[CountryRegion] ),
        ISEMPTY ( Holidays )
    ),
    CALCULATE (
        COUNTROWS ( 'Date' ),
        AND (
            'Date'[Day of Week Number] <> 1,
            'Date'[Day of Week Number] <> 7
        ),
        EXCEPT ( VALUES ( 'Date'[Date] ), VALUES ( Holidays[Date] ) )
    )
)

There are two interesting points in this formula: the condition tested by IF and the EXCEPT filter passed to CALCULATE. Following is an explanation of both:

You need to check that there is only a single value for CountryRegion to protect the measure from showing numbers when multiple countries or regions are selected. At the same time, you need to check if the Holidays table is empty, because for months with no holidays, the CountryRegion column will have zero values and HASONEVALUE will return False.

As a filter for CALCULATE, you can use the EXCEPT function to retrieve the dates that are not holidays. This set will be put in a logical AND with the set of days that are not in the weekend, producing the final correct result.

Still, the model is not yet perfect. In fact, we are assuming that weekends always fall on Saturday and Sunday, but there are several countries and regions where the weekend falls on different days. If you need to take this into account, then you must make the model slightly more complex. You will need another table that contains the weekdays that are to be considered part of the weekend on a country-by-country basis. Because you have two different tables that need to be filtered by country, you will need to transform the country into a dimension by itself. The complete model is shown in Figure 4-30.

FIGURE 4-30 The complete model contains a dedicated table for weekends and a Country/Regions dimension.
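To give an idea of what the Weekends table might contain, here is a sketch written as a calculated table. The rows are illustrative only and are not taken from the book's sample data; they assume that Day of Week Number runs from 1 (Sunday) to 7 (Saturday). Calculated tables are available in Power BI Desktop; in other tools you would load an equivalent table from your data source.

```dax
-- Illustrative content only: one row per country and weekend day.
-- Assumes 1 = Sunday, 6 = Friday, 7 = Saturday.
Weekends =
DATATABLE (
    "CountryRegion", STRING,
    "Day of Week Number", INTEGER,
    {
        { "United States", 1 },
        { "United States", 7 },
        { "Israel", 6 },
        { "Israel", 7 }
    }
)
```

With one row per country and weekend day, adding a country with a different weekend only requires new rows, not a new formula.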

The code is indeed slightly simpler, although it may be a bit harder to read, as shown in the following:

NumOfWorkingDays :=
IF (
    HASONEVALUE ( CountryRegions[CountryRegion] ),
    CALCULATE (
        COUNTROWS ( 'Date' ),
        EXCEPT (
            VALUES ( 'Date'[Day of Week Number] ),
            VALUES ( Weekends[Day of Week Number] )
        ),
        EXCEPT ( VALUES ( 'Date'[Date] ), VALUES ( Holidays[Date] ) )
    )
)