David Hager is At It Again
A short two weeks after his truly creative Measure Catalog Macro post, David Hager is back with a heavyweight DAX post.
This time he is exploring an area of the product that I have very little experience with, namely the EARLIER() function. On the DAX spicy scale, EARLIER() is a 5 in my opinion, but the other reason I haven’t done much with it is because I’m pretty sure it’s only useful in calc columns. At Pivotstream we do all of our calc columns in SQL (for several reasons I won’t go into here).
But not everyone has colleagues ready to write calc columns in SQL for them now do they? In fact, the Great PowerPivot Survey is reinforcing that for me (I’m going to wait until next week before I start summarizing results, but it sure has been interesting so far).
OK David, take it away.
CREATING AN Nth OCCURRENCE DAX FORMULA
By David Hager
CLICK HERE TO DOWNLOAD THE WORKBOOK IN THIS POST
When you have a growing database of customer records (such as purchases at a store), it is useful to have a method for tracking the previous visit for each customer.
A formula of the following type can be used for this.
=CALCULATE(MAX(Table1[DATE]),FILTER(Table1, EARLIER(Table1[Customer])=Table1[Customer] && EARLIER(Table1[DATE])>Table1[DATE]))
(See https://gavinrussell.wordpress.com/2010/09/23/powerpivot-dax-session-notes-2-previous-row/ for the original idea behind this post).
However, note that the previous visit date is returned for every customer visit (except the initial one) for each customer. In some cases, this may be the desired result, but usually only the last customer visit record would add the previous visit as useful information.
While trying to make a formula that would do this, I thought that it would also be nice if I could expand that concept to find an Nth previous visit. Since I did not want to hard-code an Nth value that would have to be updated manually in the PowerPivot cube, I had to invent a method for creating a dynamic constant.
In order to create a dynamic constant for PowerPivot (which would simply be an updatable cell value in Excel), the first step is to create an Excel table. The table consists of a Date field which can be used to create a relationship with the fact table Date field and a constant field. In each row of the constant field, the same formula needs to point to the update cell (=$D$2).
Now, the table can be linked to PowerPivot, using the Create Linked Table feature.
Note from Rob: Here’s another great example of a feature that I never use (Linked Table), because that linkage doesn’t refresh on the server. But it has tremendous utility in desktop PowerPivot, so I’m glad David is showing it off here.
Two more steps remain before the dynamic constant can be used in a calculated column formula. By clicking Create Relationship in the PowerPivot window Design ribbon, a relationship can be established between the Date field in Table1 with the same in Table2. Finally, a calculated column named NTH is added to Table1 with the formula =RELATED(Table2[NTH]).
Now it is time to create the formula. The entire version of this formula is too complex to be viewed in a completely displayed form, so two parts of it are entered as calculated columns and are used as inputs into the formula shown below.
=IF(Table1[IsLastRecord],CALCULATE(MAX(Table1[DATE]),FILTER(Table1,EARLIER(Table1[Customer])=
Table1[Customer] && EARLIER(Table1[DATE])>Table1[DATE] &&Table1[IsNthPreviousRecord])),BLANK())
where
Table1[IsLastRecord]
=COUNTROWS(FILTER(Table1,EARLIER(Table1[Customer])=Table1[Customer]))=
(COUNTROWS(FILTER(Table1, EARLIER(Table1[Customer])=Table1[Customer] && EARLIER(Table1[DATE])>Table1[DATE]))+1)
and
Table1[IsNthPreviousRecord]
=(COUNTROWS(FILTER(Table1,EARLIER(Table1[Customer])=Table1[Customer]))-
(COUNTROWS(FILTER(Table1, EARLIER(Table1[Customer])=Table1[Customer] &&
EARLIER(Table1[DATE])>Table1[DATE]))+1)=Table1[NTH])
IsLastRecord counts the numbers of records for each customer and compares that number to the number of records previous to the last record plus 1. If the numbers are the same, the expression is true and the last record in row context for each customer will have a date depending on whether it had an Nth previous visit (otherwise a blank).
IsNthPreviousRecord is the 3rd filter in the main DAX formula. This Boolean expression is true if the numbers of records for each customer minus number of records previous to the last record plus 1 is equal to the desired Nth previous visit.
The following picture shows the returned records for the table example when Nth is equal to 2.
Note that only customers a & d have 3 or more visits, so those are the only dates returned.
I have not gone into great detail into explaining exactly how this DAX formula works mainly because if you work through this yourself you will gain a greater understanding of how to construct your own formulas. Note from Rob: Heh heh, now I STILL don’t understand EARLIER(). Nor have I built a pivot table that exploits the power of this formula. I will also leave that as an exercise to the reader.
As you might imagine, there are many ways that this formula methodology for Nth occurrence can be used. Also, the technique for using a dynamic constant is pretty neat. I hope this helps in your future PowerPivot model-building projects!
Get in touch with a P3 team member