- Contents at a glance
- Contents
- Introduction
- Who this book is for
- Assumptions about you
- Organization of this book
- Conventions
- About the companion content
- Acknowledgments
- Errata and book support
- We want to hear from you
- Stay in touch
- Chapter 1. Introduction to data modeling
- Working with a single table
- Introducing the data model
- Introducing star schemas
- Understanding the importance of naming objects
- Conclusions
- Chapter 2. Using header/detail tables
- Introducing header/detail
- Aggregating values from the header
- Flattening header/detail
- Conclusions
- Chapter 3. Using multiple fact tables
- Using denormalized fact tables
- Filtering across dimensions
- Understanding model ambiguity
- Using orders and invoices
- Calculating the total invoiced for the customer
- Calculating the number of invoices that include the given order of the given customer
- Calculating the amount of the order, if invoiced
- Conclusions
- Chapter 4. Working with date and time
- Creating a date dimension
- Understanding automatic time dimensions
- Automatic time grouping in Excel
- Automatic time grouping in Power BI Desktop
- Using multiple date dimensions
- Handling date and time
- Time-intelligence calculations
- Handling fiscal calendars
- Computing with working days
- Working days in a single country or region
- Working with multiple countries or regions
- Handling special periods of the year
- Using non-overlapping periods
- Periods relative to today
- Using overlapping periods
- Working with weekly calendars
- Conclusions
- Chapter 5. Tracking historical attributes
- Introducing slowly changing dimensions
- Using slowly changing dimensions
- Loading slowly changing dimensions
- Fixing granularity in the dimension
- Fixing granularity in the fact table
- Rapidly changing dimensions
- Choosing the right modeling technique
- Conclusions
- Chapter 6. Using snapshots
- Using data that you cannot aggregate over time
- Aggregating snapshots
- Understanding derived snapshots
- Understanding the transition matrix
- Conclusions
- Chapter 7. Analyzing date and time intervals
- Introduction to temporal data
- Aggregating with simple intervals
- Intervals crossing dates
- Modeling working shifts and time shifting
- Analyzing active events
- Mixing different durations
- Conclusions
- Chapter 8. Many-to-many relationships
- Introducing many-to-many relationships
- Understanding the bidirectional pattern
- Understanding non-additivity
- Cascading many-to-many
- Temporal many-to-many
- Reallocating factors and percentages
- Materializing many-to-many
- Using the fact tables as a bridge
- Performance considerations
- Conclusions
- Chapter 9. Working with different granularity
- Introduction to granularity
- Relationships at different granularity
- Analyzing budget data
- Using DAX code to move filters
- Filtering through relationships
- Hiding values at the wrong granularity
- Allocating values at a higher granularity
- Conclusions
- Chapter 10. Segmentation data models
- Computing multiple-column relationships
- Computing static segmentation
- Using dynamic segmentation
- Understanding the power of calculated columns: ABC analysis
- Conclusions
- Chapter 11. Working with multiple currencies
- Understanding different scenarios
- Multiple source currencies, single reporting currency
- Single source currency, multiple reporting currencies
- Multiple source currencies, multiple reporting currencies
- Conclusions
- Appendix A. Data modeling 101
- Tables
- Data types
- Relationships
- Filtering and cross-filtering
- Different types of models
- Star schema
- Snowflake schema
- Models with bridge tables
- Measures and additivity
- Additive measures
- Non-additive measures
- Semi-additive measures
- Index
- Code Snippets
Sales YTD :=
CALCULATE (
    [Sales Amount],
    DATESYTD ( 'Date'[Date] )
)
Note
This is true not just for YTD calculations. All time-intelligence metrics are much easier to write when you use a date dimension.
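To illustrate the point, here is a minimal sketch of two other time-intelligence measures written in the same style. It assumes the same 'Date' table and [Sales Amount] measure used above; the measure names Sales PY and Sales YOY % are hypothetical and chosen only for this example.

```dax
-- Sales in the same period of the previous year.
-- Assumes 'Date'[Date] holds one row per day with no gaps.
Sales PY :=
CALCULATE (
    [Sales Amount],
    SAMEPERIODLASTYEAR ( 'Date'[Date] )
)

-- Year-over-year growth as a ratio.
-- DIVIDE returns BLANK when the previous-year value is zero or blank.
Sales YOY % :=
DIVIDE (
    [Sales Amount] - [Sales PY],
    [Sales PY]
)
```

Both measures rely only on the date column of the date dimension, so neither needs extra columns in the fact table.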
By using a date dimension, you achieve the following:
- You simplify the writing of measures.
- You obtain a central place to define all columns related to the time that you will need to build reports.
- You improve the performance of the queries.
- You create a model that is simpler to navigate.
These are the advantages, but what about the disadvantages? In this case, there are none. Always using a date dimension yields only advantages. Get used to creating a calendar dimension every time you build a data model, and don’t fall into the trap of taking the easy way out by using calculated columns. If you do, you will regret that decision sooner rather than later.
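As a minimal sketch of what such a calendar dimension can look like, the following calculated table builds one row per day and adds a few descriptive columns. The date range and the column names are assumptions made for this example, and calculated tables are available in Power BI Desktop but not in the Excel 2016 data model, where you would load the table from a source instead.

```dax
-- Hypothetical date dimension; adjust the range to cover all dates in your facts.
Date =
ADDCOLUMNS (
    CALENDAR ( DATE ( 2015, 1, 1 ), DATE ( 2017, 12, 31 ) ),
    "Year", YEAR ( [Date] ),
    "Month Number", MONTH ( [Date] ),
    "Month", FORMAT ( [Date], "mmmm" ),
    "Quarter", "Q" & ROUNDUP ( MONTH ( [Date] ) / 3, 0 )
)
```

After creating the table, you would typically relate 'Date'[Date] to the date column of each fact table and sort the Month column by Month Number so that month names appear in calendar order.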
Understanding automatic time dimensions
In Excel 2016 and in Power BI Desktop, Microsoft has built an automated system to work with time intelligence—although the two tools use different mechanisms. We discuss both in this section.
Note
As you will learn in this section, we discourage you from using either of these systems because they do not provide the necessary flexibility and ease of use that you need in your models.
Automatic time grouping in Excel
When you use a PivotTable on an Excel data model, adding a date column to the PivotTable prompts Excel to automatically generate a set of columns in the PivotTable to automate date calculations. For example, you might start with the model shown in Figure 4-5, where the Sales table contains only one date column, the Order Date column.
FIGURE 4-5 The Sales table contains a date column, which is Order Date, and no columns with the year and/or month.
If you create a PivotTable with Sales Amount in the values area and Order Date in the columns, you will notice a small delay. Then, surprisingly, instead of seeing the Order Date, you will see the PivotTable shown in Figure 4-6.
FIGURE 4-6 The PivotTable slices the date by year and quarter, even if you did not have those columns in the model.
To make this PivotTable slice by year, Excel automatically added some columns to the Sales table, which you can see if you reopen the data model. The result is shown in Figure 4-7, which highlights the new columns added by Excel.
FIGURE 4-7 The Sales table contains new columns, which were automatically created by Excel.
Notice that Excel did exactly what we suggested you avoid: it created a set of columns to perform the slicing directly in the table that contains the date column. If you perform the same operation on another fact table, you will obtain a new set of columns, and the two sets cannot be used to cross-filter the tables. Moreover, because the columns are created in the fact table, on large datasets this takes time and space in the Excel file. You can find more information about this feature at https://blogs.office.com/2015/10/13/time-grouping-enhancements-in-excel-2016/. This article also contains a link to a procedure that involves editing the registry to turn off this feature. Unless you work with very simple models, we recommend you follow the procedure to disable automatic time grouping and learn the correct way to handle time grouping by hand, which we explain in this chapter.
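To make the pattern concrete, the following sketch shows hand-written calculated columns of the same kind. The column names and formulas are hypothetical and serve only as an illustration; the columns Excel actually generates may be named and defined differently.

```dax
-- Hypothetical calculated columns stored in the Sales fact table.
-- Each one is computed and saved for every row of Sales.
Sales[Order Year] = YEAR ( Sales[Order Date] )
Sales[Order Quarter] = "Qtr" & ROUNDUP ( MONTH ( Sales[Order Date] ) / 3, 0 )
Sales[Order Month] = FORMAT ( Sales[Order Date], "mmm" )
```

Because these columns belong to Sales, a filter placed on them affects only Sales. A second fact table would need its own copy of the columns, which is why a shared date dimension is the better design.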
Automatic time grouping in Power BI Desktop
Power BI Desktop tries to make time-intelligence calculations easier by automating some of the steps. Unfortunately, even though it automates these steps slightly better than Excel does, the automatic mechanism in Power BI Desktop is still not the best solution for time intelligence.
If you use the same data model as in Figure 4-7 in Power BI Desktop and you build a matrix with the Order Date column, you obtain the visualization shown in Figure 4-8.