From the Water Cooler

As a ‘water cooler’ of sorts for this community, we meet some amazing people. Doug Burke shares with us how easy Power Query makes it to do powerful cleaning with a click of the mouse.

The Problem

If you’re like me, you get crappy data from your source systems.  Not that the data is inaccurate. No. It’s just that the data does not fit your view of the world. It doesn’t suit your needs right out of the box. That makes it crappy.

You need to clean it to make it usable. You need the ‘power’ of Power Query to turn night into day.

Did you know that Power Query’s most potent data cleanups are just a right-click away? Let me share the secret.

It’s a simple three-step process:

1. Import data into Power Query

2. Right-click a column

3. Select a data cleaning function

Let’s take a look . . .

right click

Try it yourself. Import some data. Right-click a column. Look at all those functions.

I count 42 data cleaning possibilities just from what’s visible. That doesn’t even include the additional transformation powers of ‘Group By,’ ‘Add Column From Examples’ and ‘Add as New Query.’

A whole lot of something is going on here.

Let’s start with a simple example. We’ll use right-click-column to fix inconsistent customer names.

• Customer XYZ sometimes is listed as XY&Z. We need to remove the ‘&’ symbol.

remove charachter

 

Select customer column

 

Replace Values

Your first reaction may be ‘So what. I can do the same thing with Excel’s Search and Replace’. True, but you would miss out on Power Query’s creating a series of ‘Applied Steps’ that are consistent, repeatable and set you up for more data transformations within the Power Query Editor.

Another example, this time using two columns. Find the unique instances of products bought by customers. It’s not as simple as it sounds because different customers purchase different products. You want to quickly see the unique combinations of which customers bought what products and exclude the duplicates.

Distinct List

 

Remove Other Columns

 

Remove Duplicates

A quick review of using right-click column data cleanups:

  • Consistent
  • Repeatable
  • Sets you up for more data transformations within the Power Query Editor

How about one more example to drive the point home? This time, let’s fill in missing customer names. This is a frequent problem if your source data is an existing formatted report which excludes repeating values. Power Query can handle it in a snap.

Before/After Fill down

 

Fill down

 

Filled table

‘Right-click-column’ is a fast, convenient way to get the job done. No need to review the Power Query ribbons for column cleanups functions. There are so many readily-available combinations that it boggles my mind.

Bonus: when done, you have a sequenced workflow (know as ‘Applied Steps’) that can be repeated the next time you get data.

Simple, right? That’s the point – – Power Query makes it simple. Right-click-column makes it simpler! Win-win!!

I am not saying this is the only way to clean up data. Or even the best way. But I am saying it’s a quick way. And sometimes its better to be quick than good.

Full disclosure: I’m using Excel 2013 Professional Plus 64-bit (2.49.4831.381). Your mileage may vary depending on your version of Excel and Power Query.

Microsoft’s platform is the world’s most fluid & powerful data toolset.  Get the most out of it.

No one knows the Power BI ecosystem better than the folks who started the whole thing – us.

Let us guide your organization through the change required to become a data-oriented culture while squeezing every last drop* of value out of your Microsoft platform investment.

* – we reserve the right to substitute Olympic-sized swimming pools of value in place of “drops” at our discretion.