Post by Rob Collie
Got This Question the Other Day, and it is LONG Overdue That I Answer It
Some of You Are Confused, Some Are Nodding
Generally speaking, I think the people reading this fall into one of a few camps:
- People who are early in their Power Pivot journeys, and who also do NOT know SQL (most Excel Pros fall into this camp at some point, before hopefully moving into group #2 below).
- People who are pretty good at Power Pivot, but do NOT know SQL (I fall into this group).
- People who are good at BOTH Power Pivot and SQL (this is a blessed group).
- People who are good at SQL but still early in their Power Pivot awareness/knowledge.
Group #4 is the “target audience” for today’s post, but it’s still relevant for groups 1-3, because we WILL get asked this same question from time to time, and it’s good for us to be able to answer.
“I Started Out as a DBA…”
For Many People and Organizations, THIS is Business Intelligence
(And to a Certain Extent, This is Effective, So it Persists as a Workflow)
Let’s say you began life a a DBA. Which means you know SQL, of course, but writing SQL is not the only thing you do as a DBA. You’re maintaining indexes, watching for bottlenecks, talking about I/O, number of spindles, TempDB… all that good DBA stuff that I understand at a conceptual level but have never learned to actually DO.
But one day, someone from the Business has a question. They figure all the data required to answer it is “owned” by you, so they come to you with said question.
And hey, it turns out that you CAN write some SQL and answer the question! Which is pretty damn helpful and makes everyone involved feel pretty good. (Hey, we are all still fundamentally wired for cooperation after all). It also makes you more relevant to the front-line business, and no longer “just a cost center” from the perspective of the company’s leaders, which is VERY good for your career.
That begins your transition from “pure DBA” to “BI” professional. Jason, who asked the question, VERY much falls into this group – that is his story for sure.
So, sooner or later, you’re going to wonder the same thing Jason is wondering. “I’ve got a perfectly good tool for answering questions – SQL – so why do I need to add another in the form of Power Pivot?” No more delays, here comes my answer.
Short Version: Eyedroppers and Funnels
In short, SQL has its origins in Store and Retrieve, not Analysis. The fact that it CAN be used for analysis is testament to its flexibility and power, for sure. But it’s very hard for a single technology to be optimized for everything.
Power Pivot, as a technology, is optimized for the things SQL is not, and vice versa. Because the world already HAD relational, SQL databases, Power Pivot does not have to even TRY to be good at those things. Which gives it a LOT of “freedom” to be good at other things.
And the two of them CAN be used together, to devastating effect: In a hybrid environment, use SQL to Store, Prep, and Stage your data, then use Power Pivot to Calculate / Analyze / Explore that data.
Going One Level Geekier…
It turns out that Analysis tasks tend to require four specific things that SQL is not optimized for:
- Aggregations – this is the “funnel versus eyedropper” metaphor. Analysis rarely “cares” about individual rows, and instead is driven by aggregations of rows.
- Calculation-intensive in general – not just the SUM or AVERAGE of a rowset, but things like “percentage change vs. last year” and “what percentage of our customers come back within the first six months?”
- Ad hoc, rapidly changing queries – ask a question, get an answer, realize that your question was flawed, modify your question and ask it again. This needs to happen in seconds, not minutes or hours or days.
- Easy to read and write syntax – ideally, you want a small fraction of business users to be able to learn it – and specifically learn to use in in situations like 1-3 above (as opposed to merely fetching rowsets).
Here’s a quick comparison of the technologies, with regard to those four requirements:
Even Geekier: History Lesson on the Birth of “OLAP”
In the 1980’s, engineers started to imagine a different kind of db engine. One that would NOT have to be good at fetching individual rows, ever. One that would NOT have to allow for continuous writes. One that would essentially be read-only, and re-built from source data only infrequently. Such an engine would be free to optimize for the “aggregate and calculate” world. (I like to describe “aggregate and calculate” queries as multidimensional battleship – for 2013, West region, Product Line X, Female Customers Only, how many did we sell compared to 2012?)
The world of OLAP databases was born – OnLine Analytical Processing. Where “Online,” back then in the 80’s, meant “get your answer within seconds of asking it,” rather than “kick off the query and go home for the night.”
So, most things you can do with Power Pivot (which is an OLAP technology) are also doable with SQL. But I can pretty quickly find examples that run damn-near instantaneously in Power Pivot but take 30 minutes plus to run in SQL.
So everything above – eyedroppers and funnels, the comparison table, everything – can be understood as “Relational db’s versus OLAP db’s.”
Power Pivot Takes OLAP to a Friendlier, More Awesome Place
SQL Server Analysis Services, aka SSAS, has long been Microsoft’s OLAP product. And it’s long held more market share than any other OLAP product.
But still, the world of “write SQL queries to answer questions” persisted. Even though SSAS was/is a great product, there was a LOT more SQL expertise running around in the world than OLAP expertise, which was essentially rocket science.
In fact, the success of SQL Server Reporting Services is testament to that fact. SSRS gave “SQL people” a way to schedule their queries to run regularly, and then their results rendered as web pages so that humans could read the results.
The average Business Intelligence professional does not know OLAP. And we can attribute that to three reasons:
- Most “BI Pros” started life as DBA’s.
- The SQL approach was “working” for everyone, despite its relatively heavy shortcomings for analysis, and
- The alternative (SSAS/OLAP) was so hard to learn that no one really knew what they were missing out on.
Power Pivot is like a friendlier “skin” over SSAS. It’s not an exaggeration to say that it’s 10x as learnable as the old SSAS.
We also now have “Tabular SSAS,” which is the same exact technology as Power Pivot – you learn one and you have learned the other. (We now refer to the “old” SSAS either as “Multidimensional SSAS” or “MD SSAS” or even “MOLAP,” depending on whether you like your technologies to sound like demons from hell.)
But Power Pivot / Tabular SSAS introduces yet another advantages over traditional OLAP, by moving to an “in memory” implementation. No hard disk is touched at all during queries, wow! You just can’t do that with a read/write db like SQL, because allowing for writes makes compression very difficult, and compression is critical if you want all of your data loaded into RAM.
Faster to write queries = better questions, not just faster answers.
When I rearrange a PivotTable, I am authoring a query. When I click a slicer, I am changing my query. In about 2 seconds, I can switch from one flavor of analysis to something radically different, without having to crack open an editor. And when you combine this with execution speed (above), now my toolset responds to me about as fast as I can think.
Which then becomes not just a matter of “faster.” There’s a point at which things are just so MUCH faster, that you now engage in activities that you never would have attempted before.
You ask questions that never would have even entered your head before. You take on analysis problems that beforehand, you would have avoided (because they felt like “projects,” whereas now they feel like “a few minutes.”)
Interactivity for Humans
Lastly, you can now put your pivots/dashboards in front of other human beings, people who do NOT know Power Pivot OR SQL, and give them some ability to answer their own questions. Click that slicer, get a different answer. No expertise or long waits required. The real whiz kids can even use the field list and construct their own pivots from scratch.
Similarly, you are no longer the bottleneck for answering every single little niggling question asked by the business.
One of our largest clients is surfing this transition right now. There were/are a handful of SQL pros writing and running ungodly-complicated SQL queries in response to requests from the biz. Each query takes more than a week to get right, which is “lose/lose” for everyone.
We are changing that workflow for them, giving the business a greater degree of self service, freeing the SQL pros to do the million other things on their plates. When they do engage with the business, it is in a cooperative, highly efficient manner as opposed to the historical “wait and grind” dynamic.