Analyzing Data with Power BI and Power Pivot for Excel (Alberto Ferrari, Marco Russo)

will not show the purchases of products sold. Instead, it will show the purchases of any product made on the dates where any of the selected products were sold, becoming even less intuitive. Bidirectional filtering is a powerful feature, but it is not an option in this case because you want finer control over the way the filtering happens.

The key to solving this scenario is understanding the flow of filtering. Let us start from the Date table and revert to the original model shown in Figure 3-5. When you filter a given year in Date, the filter is automatically propagated to both Sales and Purchases. However, because of the direction of the relationship, it does not reach Product. What you want to achieve is to calculate the products that are present in Sales and use this list of products as a further filter on Purchases. The correct formula for the measure is as follows:

PurchaseOfSoldProducts :=
CALCULATE (
    [PurchaseAmount],
    CROSSFILTER ( Sales[ProductKey], Product[Product
)

In this code, you use the CROSSFILTER function to activate the bidirectional filter between Product and Sales for only the duration of the calculation. In this way, by using standard filtering processes, Sales will filter Product, which then filters Purchases. (For more information on the CROSSFILTER function, see Appendix A, “Data modeling 101.”)
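The measure listing is cut off in this copy. As a hedged sketch only, a complete version might read as follows. It assumes that the truncated second column is Product[ProductKey] and that the direction argument is BOTH, which is consistent with the bidirectional behavior described in the text. Neither detail is visible in the listing itself.

```dax
PurchaseOfSoldProducts :=
CALCULATE (
    [PurchaseAmount],
    -- Assumption: the truncated column is Product[ProductKey] and the
    -- direction is BOTH; neither is visible in the original listing.
    CROSSFILTER ( Sales[ProductKey], Product[ProductKey], BOTH )
)
```

Because CROSSFILTER is passed as a CALCULATE modifier, the relationship stored in the model is left untouched, and the bidirectional behavior applies only while this measure is evaluated.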

To solve this scenario, we only leveraged DAX code. We did not change the data model. Why is this relevant to data modeling? Because in this case, changing the data model was not the right option, and we wanted to highlight this. Updating the data model is generally the right way to go, but sometimes, such as in this example, you must author DAX code to solve a specific scenario. It helps to develop a sense of when each approach is appropriate. Besides, the data model in this case already consists of two star schemas, so it is very hard to build a better one.

Understanding model ambiguity

The previous section showed that setting a bidirectional filter on a relationship will not work because the model becomes ambiguous. In this section, we want to dive deeper into the concept of ambiguous models to better understand them and, more importantly, why they are forbidden in Tabular.

An ambiguous model is a model where there are multiple paths joining any two tables through relationships. The simplest form of ambiguity appears when you try to build multiple relationships between two tables. If you try to build a model where the same two tables are linked through multiple relationships, only one of them (by default, the first one you create) will be kept active. The other ones will be marked as inactive. Figure 3-8 shows an example of such a model. Of the three relationships shown, only one is solid (active), whereas the remaining ones are dotted (inactive).

FIGURE 3-8 You cannot keep multiple active relationships between two tables.

Why is this limitation present? The reason is straightforward: The DAX language offers multiple functionalities that work on relationships. For example, in Sales, you can reference any column of the Date table by using the RELATED function, as in the following code:

Sales[Year] = RELATED ( 'Date'[Calendar Year] )

RELATED works without you having to specify which relationship to follow. The DAX language automatically follows the only active relationship and then returns the expected year. In this case, it would be the year of the sale, because the active relationship is the one based on OrderDateKey. If you could define multiple active relationships, then you would have to specify which one of the many active relationships to use for each use of RELATED. Automatic filter context propagation behaves similarly whenever you define a filter context by using, for example, CALCULATE.

The following example computes the sales in 2009:

Sales2009 :=
CALCULATE (
    [Sales Amount],
    'Date'[Calendar Year] = "CY 2009"
)

Again, you do not specify the relationship to follow. It is implicit in the model that the active relationship is the one using OrderDateKey. (In the next chapter, you will learn how to handle multiple relationships with the Date table in an efficient way. The goal of this section is simply to help you understand why an ambiguous model is forbidden in Tabular.)

You can activate a given relationship for a specific calculation. For example, if you are interested in the sales delivered in 2009, you can compute this value by taking advantage of the USERELATIONSHIP function, as in the following code:

Shipped2009 :=
CALCULATE (
    [Sales Amount],
    'Date'[Calendar Year] = "CY 2009",
    USERELATIONSHIP ( 'Date'[DateKey], Sales[Deliver
)
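The USERELATIONSHIP call in this listing is truncated in this copy. As a hedged sketch, a complete measure might look like the following. The column name Sales[DeliveryDateKey] is an assumption made for illustration, because the original name is cut off after "Deliver". The sketch also assumes that an inactive relationship between that column and 'Date'[DateKey] already exists in the model.

```dax
Shipped2009 :=
CALCULATE (
    [Sales Amount],
    'Date'[Calendar Year] = "CY 2009",
    -- Assumption: Sales[DeliveryDateKey] is a guess at the truncated column name.
    -- USERELATIONSHIP requires that a relationship between the two columns
    -- is already defined in the model (typically an inactive one).
    USERELATIONSHIP ( 'Date'[DateKey], Sales[DeliveryDateKey] )
)
```

USERELATIONSHIP does not create a relationship. It only tells CALCULATE which existing one to follow for the duration of the calculation, in place of the active relationship based on OrderDateKey.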

As a general rule, keeping inactive relationships in your model is useful only when you make very limited use of them or if you need the relationship for some special calculation. A user has no way to activate a specific relationship while navigating the model with the user interface. It is the task of the data modeler, not the user, to worry about technical details like the keys used in a relationship. In advanced models, where billions of rows are present in the fact table or the calculations are very complex, the data modeler might decide to keep inactive relationships in the model to speed up certain calculations. However, such optimization techniques will not be necessary at the introductory level at which we are covering data modeling, and inactive relationships will be nearly useless.

Now, let us go back to ambiguous models. As we said, a model might be ambiguous for multiple reasons, even if all those reasons are connected to the presence of multiple paths between tables. Another example of an ambiguous model is the one depicted in Figure 3-9.

FIGURE 3-9 This model is ambiguous, too, although the reason is less evident.

In this model, there are two different age columns. One is Historical Age, which is stored in the fact table. The other is CurrentAge, which is stored in the Customer dimension. Both of these columns are used as foreign keys to the Age Ranges table, but only one of the relationships is permitted to remain active. The other relationship is deactivated. In this case, ambiguity is a bit less evident, but it is there. Imagine you built a PivotTable and sliced it by age range. Would you expect to slice it by the historical age (how old each customer was at the moment of sale) or the current age (how old each customer is today)? If both relationships were kept active, this would be ambiguous. Again, the engine refuses to let you build such a model. It forces you to resolve the ambiguity by either choosing which relationship to keep active or duplicating the table. That way, when you filter either a Current Age Ranges or a Historical Age Ranges table, you specify a unique path to filter data. The resulting model, once the Age Ranges table has been duplicated, is shown in Figure 3-10.
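One possible way to duplicate the table, sketched here under assumptions, is to define two calculated tables that each copy Age Ranges. This is not necessarily how the authors built the model, and calculated tables may not be available in every tool. Where they are not, loading the source table twice under different names achieves the same result. Each definition below would be entered as its own calculated table.

```dax
-- Assumption: the model supports calculated tables and contains a table
-- named 'Age Ranges'. Each definition is a separate calculated table.
Current Age Ranges = 'Age Ranges'

Historical Age Ranges = 'Age Ranges'
```

After the copies exist, Customer[CurrentAge] would be related to Current Age Ranges and the historical age column in the fact table to Historical Age Ranges, so each relationship can stay active without creating a second path between the same two tables.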