In data warehouses across many industries, it is common to have multiple foreign keys on a fact table that relate back to a single dimension, such as Date, State or Person. Let’s use Date as an example and look at the Order fact for a company. It is realistic to have multiple dimension keys in the fact, such as the following fields.
- Order Date Key – The date the order was submitted
- Pick Date Key – The date the order was fulfilled
When modeling the data structure in a Microsoft tool, such as SQL Server Analysis Services or directly in Power BI, tables/views can be quickly added. Relating two tables is as easy as clicking on a field and dragging it to the table to which it is related. In the model above based on the WideWorldImportersDWsample database from Microsoft, seven relationships have been defined from the Order table, one relationship each to City, Customer and Stock Item, and two relationships each to Date and Employee. The next step is to begin creating measures. To calculate a total sales amount, the DAX syntax looks like this.
[Total Sales] := SUM(‘Order’[Total Including Tax])
Wonderful! This measure can be incorporated into a user’s report or dashboard. Below is a screenshot of how the data appears in Excel, grouping Total Sales by Calendar Year.
As an end user of the tabular model, knowing the company tracks both the date the order was placed and the date the order was picked, a couple questions might come to mind.
- When grouping by date, which date is being used?
- How can I see total amounts by order date, as well as by picked date?
To answer the first question, the end user will not know which date is being used. The developer can determine this by the active relationship. In the model’s design, this is noted by the solid line between the Order and Date tables, as shown to the right. Double-clicking on the active relationship, the Edit Relationship dialog shows Order Date Key has this relationship with the Date dimension. Any time fields from the Date dimension are used for grouping or filtering, the dates will correspond to order dates, not picked dates. There is a relationship to the Date dimension on Picked Date Key, but this relationship is inactive and will not be used, unless the calculation explicitly calls the inactive relationship.While Order Date Key has the active relationship with Date, this can be changed by the developer. Doing so, of course, has consequences, as now all grouping or filtering on Date would be based on Picked Date Key and not Order Date Key. If there is a typical delay between those two dates, that can result in a bit of difference in the reported numbers.
To lock in the relationship being used, the Total Sales measure should be re-written to explicitly define the relationship to the Date dimension. The measure, Total Sales, whose formula we defined as
[Total Sales] := sum(‘Order’[Total Including Tax])
could be renamed to Total Sales by Order Date or Total Sales by Picked Date. The key to these measures is the use of the CALCULATE() function along with the USERELATIONSHIP() function to calculate the total sales. Total Sales by Order Date can be defined with the following formula.
[Total Sales by Order Date] := CALCULATE( SUM(‘Order’[Total Including Tax]), USERELATIONSHIP(‘Order’[Order Date Key], ‘Date’[Date]) )
Total Sales by Picked Date is nearly identical, except USERELATIONSHIP() uses Picked Date Key, instead.
[Total Sales by Picked Date] := CALCULATE( SUM(‘Order’[Total Including Tax]), USERELATIONSHIP(‘Order’[Picked Date Key], ‘Date’[Date]) )
Now, when the users group either measure by date, they know which date is being used in the grouping. This solution also answers the second question, how to view order totals by order date and also by picked date. Both measures can be added to the report and grouped by the same Date dimension, as shown below.
As you can see from the results, there are orders booked in one year, but the items weren’t picked from inventory until the next year (late December orders, one might guess). With the unknown year, the blank value after 2016, for Picked sales, one can gather a number of orders hadn’t been picked yet, as of the last time the data warehouse was processed. Either way you look at these two measures, they do come to the same total, providing data consistency.
Being able to show data in the same report by different dimension attributes is one of the major benefits of a data warehouse. Building the calculations into the tabular model takes a little extra work and know how, but the results will be greatly appreciated by the users.