Easily Compare Multiple Tables in Power Query

Philip Treacy

August 5, 2021

Comparing table columns in Excel is a common task. You may need to identify items that are the same, different, or missing from these columns.

In Power Query, table columns are lists and you can compare these lists using table merges. But merging can only be done on two tables at a time. If you need to compare three or more table columns then using List Functions is the way to do it.

It does require some manual M coding and if the thought of that puts you off, it's not hard. And if you never get your hands dirty with some coding, you're going to miss out on the real power of Power Query.

The three examples I'll show you contain only two lines of code each and do what can take multiple lines using table merges or appends.

Watch the Video

Subscribe YouTube

 

Download Sample Excel Workbook

Enter your email address below to download the sample workbook.

By submitting your email address you agree that we can email you our Excel newsletter.

 

Source Data Tables

I've got four tables. The first contains the names of some imaginary staff members and an ID code assigned to them.

source data tables

The other three tables show the names of staff that attended three different days of training that were arranged for them.

more source data

My tasks are to work out what staff

  • Attended every day of training
  • Attended at least one day
  • Attended no days

I've already loaded three of these tables into PQ so let's load the last one.

Click into the table, Data -> From Table/Range.

get data from table

Then under Close and Load click on the black arrow, Close & Load to and choose Only Create Connection.

close and load

connection only

Who Attended Every Day of Training

I have to find out who appears in every table for Training Days 1, 2 and 3.

If you were using table merges then you'd do an Inner Join on the Training_1 table and the Training_2 table to return only matching rows.

inner join

You then have to do another Inner Join on the result of the first join and the Training_3 table.

If you are eagle eyed you may have noticed that in the Training_2 table the name agueda jonson is lower case. So both of these joins have to be Fuzzy Joins and set to ignore case so that you can do case insensitive comparisons.

You can do all of these steps in 1 line using the List.Intersect function.

First I need to create a new blank query. Right click on the Queries area of the editor and then New Query -> Other Sources -> Blank Query

Name the query Attended All Days.

create new blank power query

If you remember your set theory, Intersect returns whatever is common between sets, or in this case, our Name columns.

Remember that table columns are lists which is why you can use list functions to do this.

Click the Attended All Days query and under Applied Steps click on Source.

select source step

Now in the formula bar type this, and then press Enter.

= List.Intersect( { Training_1[Name], Training_2[Name], Training_3[Name] } , Comparer.OrdinalIgnoreCase )

list intersect

The result is this list of names

list intersect result

What List.Intersect is Doing

I'm passing in two arguments to the function. The first are the columns I want to compare and these are enclosed in { }

The columns are of course the Name columns from the three tables of attendees at each days' training.

The second argument Comparer.OrdinalIgnoreCase tells the function to ignore case in text comparisons. So agueda jonson will look the same as Agueda Jonson.

Filtering the Staff Table

With the list of names of those who attended all days of training, I can use that to filter the Staff table and return all rows associated with those names.

Right-click on the Source step and select Insert Step After. With the new step selected press F2 and rename it to Tab, which is my abbreviation for Table. You can call it what you like really.

insert step after source

With the new step still selected, type this into the formula bar

= Table.SelectRows( Staff , each List.ContainsAny( {[Name]} , Source ) )

So what's happening here?

explanation of table select function

This line of code is doing the following

  1. Select rows from the table
  2. called Staff
  3. where each row
  4. in the Name column
  5. appears in the list Source

If you filter rows by using the menus, Power Query uses the same function. I'm just manually calling it here to do what I want.

The result is the Staff table filtered to just the rows showing those people who attended all days of training.

filtered table

Who Attended At Least 1 Day of Training

This involves checking every day and seeing who was there at least once.

You could do this by combining the three tables (appending the queries) and then removing any duplicates.

Or you could just use the List.Union function

= List.Union( { Training_1[Name], Training_2[Name], Training_3[Name] } , Comparer.OrdinalIgnoreCase )

Using Set Theory again, a Union of the Name columns from the Training_1, Training_2 and Training_3 tables, using case insensitive text comparison because of Comparer.OrdinalIgnoreCase, gives a list of those who attended at least one day.

list union function

If you want to, you can then filter the Staff table exactly the same way as before using the Table.SelectRows function.

list union result

Who Attended No Training Days

To work this out you need to start with the full list of staff and then remove the names of those who attended on each of the 3 days.

Using table merges you can do this with Left Anti Joins, but you need to do 3 separate joins to get the final list.

Or you can do it with in one line with List.Difference

= List.Difference( Staff[Name], List.Union( { Training_1[Name], Training_2[Name], Training_3[Name] } , Comparer.OrdinalIgnoreCase ) )

list difference function

Filtering the Staff table using these names gives

list difference result

Tip - Quick Sanity Check

The number of people who attended at least 1 day (16) added to the number of people who attended no days (3) should equal the total number of staff (19).

Conclusion

I've shown you how to use List functions to compare lists and filter tables. In the sample file I've created I've also included queries that do the same job but use table merges and appends. You can have a look for yourself at the different methods.

If you're not sure about learning M functions like List.Intersect I'd encourage you to read through the Microsoft function definitions to become familiar with them and try them out.

Although I do have a programming background, I didn't wake up one day and know these M functions. I had to put time in to learn what functions existed and how they worked.

When you first started using Excel did you know how to use VLOOKUP or SUMIF? No, you had to learn them. So if you're at all hesitant about learning M functions, just dive in and get started.

9 thoughts on “Easily Compare Multiple Tables in Power Query”

  1. In the output table “attended all courses” the Map Code can remove additional members eg if you make Agueda Jonson lower case in Staffing, it is visible in the Source but then removed in the Map Output. Not figured out how or why yet to resolve/understand, assume it needs “Comparer.OrdinalIgnoreCase” adding into Map?.

    Thanks for this really informative and precise.

    Reply
      • Apologies Phil, I meant “Tab”, not map in the Attended All Events Query.

        The Tab code shouldn’t but excludes Agueda Jonson when I make it lower case in the Staff table? “= Table.SelectRows(Staff, each List.ContainsAny ({[Name]}, Source ))”. I don’t understand why and haven’t been able to resolve?

        Reply
        • Hi Rob,

          The Source step creates this list by intersecting the 3 Training tables

          = List.Intersect( { Training_1[Name], Training_2[Name], Training_3[Name] } , Comparer.OrdinalIgnoreCase )

          list intersection

          This is a case insensitive intersection.

          The Tab step then use Table.SelectRows to get the rows in the Staff table where the Name is in this list (created by List.Intersect above)

          agueda jonson isn’t in the Staff table (it’s a case sensitive select) so that row isn’t selected.

          You can select agueda jonson by using Text.Proper around [Name] like this

          = Table.SelectRows( Staff , each List.ContainsAny( {Text.Proper([Name])} , Source ) )

          but you will end up with lower case agueda jonson in your table. That can be fixed by right clicking on the column then Transform -> Capitalize Each Word.

          Regards

          Phil

          Reply
  2. This is EXCELLENT!

    Is there a way to compare two tables and find all differences between each in one table?

    In my example I have a listing of employees for June 2021 and July 2021. Is there a way to extract all differences between the two months? So if an employee changed departments from June to July as well as finding employees who were hired in July?

    Reply
    • I expect you’d have to do this as two separate queries. One to compare changes in existing employee records and one to find new employees.

      Reply
  3. Hi Phil,
    This was really good. Thanks!
    Could you do more of this?
    I use quite some patterns with List.xyz. There are so many List functions and they are so powerful. You have selected really nice examples and it would be good to see more from you about other List functions you regard as especially useful.

    Your third case is great for the introduction of the important List.Difference. I love M because instead you could also copy the second case and just filter by each not.

    Thanks again,
    Matthias

    Reply

Leave a Comment

Current ye@r *