Intro from Rob
Many of you already know of Matt Allington and his background as BI Director for Coca-Cola Asia Pacific. Matt recently flew around the world (literally) to attend my training course with the purpose of becoming accredited by me as the official P3 Adaptive University Trainer for Australia (we even have a picture to prove it – displayed at right).
Now that Matt has my approval to teach my material, I have invited Matt to be a regular blogger on p3adaptive.com. This is Matt’s first post in this capacity, so over to you Matt.
Confessions of a DAX student
Matt here: I want to share with you a simple mistake I made early on in my DAX journey, and also create awareness of how easy it is to fall into a similar trap. This post will explain the mistake and provide the solution to how you can get filters to flow up hill – via formulae that is.
If you learnt DAX the Rob Collie way (like I did) you would be very familiar with Rob’s best practice of placing the lookup tables above the data tables in the PowerPivot Diagram View.
The reason Rob teaches it this way is because it is very easy for the reader to visualise the flow of a filtered table – filters flow down hill.
In Rob’s P3 Adaptive University Online Training and also in his first book , the message is drilled into the student/reader – Filters ONLY flow down hill from the ‘one-side’ of the relationship to the ‘many-side’ of the relationship, they can’t flow up hill. More recently I read the book by Marco Russo and Alberto Ferrari (aka The Italians) – Microsoft Excel 2013 – Building Data Models with PowerPivot. To quote The Italians:
Filter context is automatically propagated from the one side of the relationship to the many side, whereas it is not propagated from the many side to the one side.
Both of these statements are completely true, and I received the message loud and clear during my learning. But the problem was an error in the mental model I created for myself, which was “you can’t filter a lookup table from information stored in a data table”. This mental model (and hence my learning) was NOT correct. Now I am not saying for one minute that Rob (or anyone else) encouraged me to follow my mental model – I did that all myself inside my own world of ignorance. What I am saying is that this is how I interpreted the filter propagation message, no doubt in large part due to my partial understanding of DAX at the time. The real problem is that these mental models we build for ourselves stick around as we grow, so if you learn it wrong early on, it can become a limitation in your ability to advance later.
Russo and Ferrari actually go on to say the following in their book (however I didn’t read this until much later):
Note: We said that the filter context propagation happens automatically and there is no way to force this behaviour [the other direction]. This is not completely true…
Let’s look at the specific example that challenged my understanding
I discovered I had a problem with my understanding when I needed to filter a sales data table (2) based on “brochure” information in a forecast data table (1). [Note: My forecast table only contains information about promotional forecasts for each week a product is on promotion – it does not contain sales forecasts for products that are not on promotion during any given week. Hence if there is a record in the forecast table for a given week, then the product is “on promotion” by definition.]
I knew I couldn’t directly filter from 1 through 3 to 2 (because filter propagation only flows from the one side of a relationship to the many side of a relationship), and I also knew that I had to use a common lookup table above the data tables and then propagate the filter context ‘down hill’ to both tables to get the result I wanted (same one side to many side propagation rule applies). But I had burnt into my brain that you can’t filter a lookup table from information in a data table – which as I said I later learnt was not true. My early mental model had me doing lots of strange things to solve the problem, like importing multiple copies of the forecast table, one for the data table purpose and a second copy for a lookup up table purpose. It was only when I sought help with the problems associated with the multiple tables that I flushed out my Mental Model as the ACTUAL root cause of the problem.
But with CALCULATE and FILTER, You Can Do Nearly Anything
You can indeed filter based on information in the forecast data table. It is not done with filter propagation but instead it is done in a DAX formula using the universal Swiss Army Knife DAX Function – good old CALCULATE( ).
The problem was that my mental rule was preventing me from even looking for this solution using CALCULATE( ). Once I had erased my flawed mental model and replaced it with the new corrected information, the following DAX measure appeared from the Gods.
Sales from Products on Promotion:=
CALCULATE( [Total Sales]
i.e. give me the sales data for all products that have at least 1 promotional forecast record in the data table for the period. In effect I am filtering the sales data table based on information in the forecast data table – exactly what I was after. Once I have this new Measure, I can then use the column Forecast[Brochure] in my pivot tables. The filter context will BEHAVE as if it were flowing up hill from the Forecast table, through the Product table and back to the Sales table, but it only works for DAX formulae that include the correct filtering expression like this one. While it has the effect of seemingly flowing up hill, technically what is happening is that the Products Table is filtered based on the existence of a record in the forecast table. Once this filter is applied to the product table in the filter context for this measure, the filter then flows down hill from the one-side of the relationships (product table) to the many-side of the relationships (both the forecast table and the sales table).
It is probably worth explaining the extra CALCULATE in this Measure
You will notice in the Measure above that I have added an extra CALCULATE( ) function [in bold for illustration purposes] to wrap around the inner COUNTROWS(Forecast) function. I have created some debugging DAX formulae to help illustrate what is happening here.
First consider the following part of Measure without the extra CALCULATE ( )
FILTER(Products,COUNTROWS(Forecast) > 0 )
This FILTER( ) function will return a table. I can’t actually see the table, hence for debugging purposes I am going to wrap the entire Function in another COUNTROWS( ) function just for this test.
This extra outer COUNTROWS( ) function allows me to get a count of how many rows there are in the table returned from the FILTER ( ) function. As you can see below, this new Test Measure returns the same value for all coordinates in the Pivot Table.
EDIT: 10 Aug
The rows shown in the pivot table above (“Back Page”, “Brochure Page: 1…” etc) are on the many side of the relationship with the Products table and as a result their filter context will not propagate to the Product table automatically. The Test measure I have created therefore will only be filtered if there is some other new filter context introduced in the DAX measure itself. [You can see the Test measure is not filtered by the rows, and this is confirmation that there is no new filter context in the DAX].
If we analyse the Test DAX formula, the Filter ( ) function creates a new row context so it can iterate the Products table. The Filter ( ) function goes to the first product in the products table and asks “ how many rows are visible in my Forecast table when I just look at you product 1 ?”. The answer is “the same as it was before” – because nothing has changed to the filter context. Filter ( ) does not create a new corresponding filter context when it creates the row context – which is a bit confusing given the name of the function is FILTER ( ). FILTER ( ) iterates in its own little ‘row context’ world until it has completed the iteration process. Then, and only then does it apply the filter to the Products table – and even then it is due to its role as a filter input in the CALCULATE function.
When I wrap COUNTROWS(Forecast) in a CALCUATE ( ) Function, “context transition” occurs and the row context created by FILTER ( ) is transformed into a new additional equivalent filter context that gets applied during the iteration process.
In this new measure, the FILTER ( ) function iterates through each product in the Product table AND the product in the current row context is ALSO used to filter the Products table. This new filter on the Products table is then propagated to the Forecast table using the relationship in the data model (from the one side to the many side) before COUNTROWS ( ) is evaluated. Under this new condition, COUNTROWS(Forecast) will only be >0 when the product being iterated exists in the Forecast table.
Now the Brochure locations in the rows of the pivot table below (eg “Back Page” etc) will filter the Forecast table directly (because they are in the same table and do not rely on filter propagation), and the new Test DAX formula is filtering the Products table and then checking to see if the >0 test is TRUE or FALSE on each iteration. If the result is true, then we know that the product from that iteration step must exist in the Forecast Table, so that product is kept. FILTER ( ) continues until it has gone through the entire Product table, and then the last step is the outer COUNTROWS finally returns the total number of rows (ie the number of products that survived the iteration test) to the pivot table.
So taking this learning from the debugging formulae, I end up with the following Measure which includes an additional CALCULATE( ) function with the new filter context.
Sales from Products on Promotion:=
Using debugging formulae like this is a really good way to solve problems and learn more about how DAX works.
Sometimes we need help to break free from our paradigms
I wouldn’t have broken out of my flawed mental model if I wasn’t able to ask someone else what I was doing wrong – someone to share my problem with and help me realise that my mental model was flawed. One great way to do this is to participate in a community forum like https://powerpivotforum.com.au By participating you can ask questions, explain your problems and provide help to other DAX users like yourself. In addition to getting help with your own problems, in my experience helping others is a fabulous way to learn yourself. The practise of teaching is one of the best ways of deepening your own understanding, so I encourage you to get involved as both someone that ‘asks’ and also someone that ‘answers’ questions.
I hope my explanation of this problem and how I fell into this mental trap helps others to broaden their understanding of what is possible when filtering tables, help you to be aware of the risks of flawed mental models and the value of sharing your problems with others.