FIGURE 8-7 The grand total of the two interest calculations is different because of many-to-many.
The version with SUMX forced the additivity by moving the sum out of the calculation. In doing so, it computes a wrong number. When handling many-to-many, you need to be aware of its nature and act accordingly.
Cascading many-to-many
As you saw in the previous section, there are different ways to handle many-to-many relationships. Once you learn them, these kinds of relationships can be easily managed. One scenario that requires slightly more attention is where you have chains of many-to-many relationships, which we call cascading many-to-many.
Let us start with an example. Using our previous model about current accounts, suppose we now want to store, for each customer, the list of categories to which the customer belongs. Every customer might belong to multiple categories, and in turn, each category is assigned to multiple customers. In other words, there is a many-to-many relationship between customers and categories.
The data model is a simple variation of the previous one. This time it includes two bridge tables: one between Accounts and Customers, and another between Customers and Categories, as shown in Figure 8-8.
FIGURE 8-8 In the cascading many-to-many pattern, there are two chained bridge tables.
You can easily make this model work by setting the relationships between Accounts and AccountsCustomers and between Customers and CustomersCategories to bidirectional. By doing so, the model becomes fully functional, and you can produce reports like the one in Figure 8-9, which shows the amount available sliced by category and customer.
FIGURE 8-9 Cascading many-to-many with bidirectional filtering is non-additive over rows and columns.
Obviously, you lose additivity over any dimension that is browsed through a many-to-many relationship. Thus, as you can easily spot, additivity is lost on both rows and columns, and numbers become harder to interpret.
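With both relationships set to bidirectional, the measure itself needs no filter modifiers. A minimal sketch, assuming the measure simply sums Transactions[Amount] as the other listings in this section do:

```dax
-- Assumes both bridge relationships are already bidirectional in the model,
-- so a filter on Categories reaches Transactions without extra code
SumOfAmount := SUM ( Transactions[Amount] )
```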
If, instead of using bidirectional filtering, you use the CROSSFILTER pattern, then you need to set cross-filtering on both relationships by using the following code:
SumOfAmount :=
CALCULATE (
    SUM ( Transactions[Amount] ),
    CROSSFILTER ( AccountsCustomers[AccountKey], Acc…
    CROSSFILTER ( CustomersCategories[CustomerKey], …
)
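The listing above is cut off at the right margin in this copy. The following is a sketch of what the complete measure plausibly looks like, not the authors' verified code. It assumes that the truncated second arguments are the key columns on the one side of each relationship, named Accounts[AccountKey] and Customers[CustomerKey] (both names are assumptions), and that the direction is BOTH, as the text describes.

```dax
SumOfAmount :=
CALCULATE (
    SUM ( Transactions[Amount] ),
    -- Assumed column name on the one side: Accounts[AccountKey]
    CROSSFILTER ( AccountsCustomers[AccountKey], Accounts[AccountKey], BOTH ),
    -- Assumed column name on the one side: Customers[CustomerKey]
    CROSSFILTER ( CustomersCategories[CustomerKey], Customers[CustomerKey], BOTH )
)
```

CROSSFILTER changes the filter direction only for the duration of this CALCULATE, so the relationships in the model can remain unidirectional.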
If, on the other hand, you opted for the table expansion pattern, then you need to take additional care when authoring your code. In fact, the evaluation of the table filters needs to be done in the right order: from the farthest table from the fact table to the nearest one. In other words, first you need to move the filter from Categories to Customers, and only later move the filter from Customers to Accounts. Failing to follow the correct order produces wrong results. The correct pattern is as follows:
SumOfAmount :=
CALCULATE (
    SUM ( Transactions[Amount] ),
    CALCULATETABLE ( AccountsCustomers, CustomersCat…
)
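The listing above is also cut off in this copy. Assuming the truncated argument is the CustomersCategories bridge table named earlier in the section, the nested pattern can be sketched as follows, with comments tracing the order of evaluation that the text describes.

```dax
SumOfAmount :=
CALCULATE (
    SUM ( Transactions[Amount] ),
    -- Step 2: AccountsCustomers, already restricted by step 1,
    -- filters Accounts and, through Accounts, Transactions
    CALCULATETABLE (
        AccountsCustomers,
        -- Step 1: CustomersCategories carries the filter from Categories
        -- to Customers, which in turn restricts AccountsCustomers
        CustomersCategories
    )
)
```

The nesting matters because CALCULATE evaluates its filter arguments independently of one another in the original filter context. Only the inner CALCULATETABLE applies the CustomersCategories filter before AccountsCustomers is evaluated.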
If you don’t pay attention to this detail, you might author the code in the following way:
SumOfAmount :=
CALCULATE (
    SUM ( Transactions[Amount] ),
    AccountsCustomers,
    CustomersCategories
)
However, the result is wrong because the filter propagation has not been executed in the right order, as shown in Figure 8-10.
FIGURE 8-10 If you do not follow the right order, table expansion produces the wrong results.
This is one of the reasons we prefer to declare the relationship as bidirectional (if possible), so that the code works without requiring attention to these details. It is very easy to write the wrong code, and such an error, combined with the complexity of non-additivity, can be challenging to debug and check.
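One way to check which version of the table expansion pattern a model is effectively using is to run both side by side in a DAX query. The sketch below is only illustrative: the measure names are hypothetical, Categories[Category] is an assumed column name, and the nested measure assumes that the inner filter is the CustomersCategories table.

```dax
DEFINE
    -- Filters applied from the farthest table to the nearest one
    MEASURE Transactions[Amount Nested] =
        CALCULATE (
            SUM ( Transactions[Amount] ),
            CALCULATETABLE ( AccountsCustomers, CustomersCategories )
        )
    -- Both bridge tables passed as independent filter arguments
    MEASURE Transactions[Amount Flat] =
        CALCULATE (
            SUM ( Transactions[Amount] ),
            AccountsCustomers,
            CustomersCategories
        )
EVALUATE
SUMMARIZECOLUMNS (
    Categories[Category],           -- assumed column name
    "Nested order", [Amount Nested],
    "Flat order", [Amount Flat]
)
ORDER BY Categories[Category]
```

If the two columns differ for a category, the flat version is not propagating the category filter all the way to Accounts.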
Before leaving the topic of cascading many-to-many, it is worth mentioning that a model with cascading many-to-many can, most of the time, be built with a single bridge table. In fact, in the model we have seen so far, we have two bridges: one between Categories and Customers, and one between Customers and Accounts. A good alternative is to simplify the model and build a single bridge that links the three tables, as shown in Figure 8-11.
There is nothing complex in a bridge table that links three dimensions, and the data model looks somewhat easier to analyze, at least once you get used to the shape of data models with many-to-many relationships. Moreover, only a single relationship needs to be set as bidirectional. With CROSSFILTER or table expansion, only a single argument is needed, which again lowers the chance of errors in your code.
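With a single bridge, both patterns reduce to one argument. The following sketch uses hypothetical names: Bridge stands for the single bridge table, and Bridge[AccountKey] and Accounts[AccountKey] are assumed key columns for the relationship between the bridge and Accounts. The measure names are also illustrative.

```dax
-- CROSSFILTER pattern: only the Bridge-Accounts relationship
-- needs to filter in both directions
SumOfAmountCrossFilter :=
CALCULATE (
    SUM ( Transactions[Amount] ),
    CROSSFILTER ( Bridge[AccountKey], Accounts[AccountKey], BOTH )
)

-- Table expansion pattern: the bridge table is the only filter argument,
-- so there is no evaluation order to get wrong
SumOfAmountExpansion :=
CALCULATE (
    SUM ( Transactions[Amount] ),
    Bridge
)
```

Filters on Customers and Categories reach the bridge through the regular one-to-many direction, which is why only the relationship toward Accounts needs special handling.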