Extracting Data from Nested Lists and Records in Power Query

Philip Treacy

April 7, 2022

You may have data that is contained in lists or records, or even lists inside records, or records inside lists. I've seen nested data like this come from API's or Sharepoint amongst other sources.

In such cases you may want to extract only certain bits of data from the list(s) or record(s) and this blog post shows you how to do just that.

Watch the Video

Subscribe YouTube


Download Example Excel Workbook

Enter your email address below to download the sample workbook.

By submitting your email address you agree that we can email you our Excel newsletter.

This Works In Power BI and in Excel

The M code shown here can be used in Power Query in both Power BI and Excel.

Records in Lists

I'm starting with a table like this in Power Query.

original data table

The first column contains the names of locations where my business has ongoing projects. The second column in intended to show the name of the project manager for each location.

As you can see at the moment some of the rows in the second column contain lists and some rows are empty.

In other situations you might want to filter out the blank rows, but in this case I want to keep all rows. I want to know where a location does not have a project manager.

By previewing the contents of the lists (click beside the word List), I can see that the first list contains a record

preview list

the second list is empty

preview empty list

and the third list (in row 4) also contains a record.

preview third list

The data I want is in the records so I need to check those records to see how they are structured. First, let's duplicate the query so I don't make any unintended changes to my main query. Right click on the query and then click Duplicate

duplicate query

In the duplicated query, click on the word List in row 1 of the Data column, which gives me this

record from list

Clicking beside the word Record gives a preview of the data in the record, which is this

preview record from data

The record contains 3 fields : ID, Name and Email. I want to get the Name.

Back in the main query, I need to add a new Custom Column : from the Ribbon -> Add Column -> Custom Column

What I need to do is to access the record from the list in the data column. I can do this by simply referring to the item(s) in the lists using index numbers (like an array).

Lists are indexed from 0 and these lists only contain 1 item so to access the record in position 0 I use this code

access record from list

This will retrieve the record, but I need to access the data in that record. To do this I use the Record.Field function.

Record.Field takes 2 parameters, the record and the name of the field to get data from. I want to get the project manager's name from the Name field so my custom column code now looks like this

extract data from record using record.field

If I click OK to create the new column I'll get this

data extracted from records

The errors in rows 3 and 5 are because I'm trying to get data from lists and records that don't exist in the corresponding row of the Data column. The error in row 2 is because that list is empty.

The fix is easy. Edit the step that created the custom column (click on the gear/cog beside the step name)

edit query step

Then wrap the code in try .. otherwise.

wrap code in try otherwise

Using try .. otherwise catches any errors generated by the code and allows me to specify a default empty string "" as the result when an error does occur.

Lists in Records

I'm starting with this table where the data column contains records, and in these records are lists. As in the first example, the list inside the record for Brisbane, is empty.

lists in records

The list can be extracted from the record by adding a new Custom Column and using this code

extract list from record

Which gives this new column called List Data. Previewing the list in row 1 I can see that there are 3 items in the list.

preview of list data

To extract the project manager's name I can use the index for that item, which is 1 as lists are indexed from 0. I can create a new custom column with this code

extract data from list

Resulting in this table

name extracted from data

However these two steps can be combined into one. Taking the code used to extract the list from the record, add {1} after the call to Record.Field to access the project manager name.

combine steps into one

column from combined steps

Lists & Records

If you have data in a column that is a combination of lists and records

lists and records in column

and the data inside the lists and records is structured as in my first two examples

data in list

data in record

Then I can use the Value.Is function to check if the row has a list or a record, then extract the data in the appropriate way.

To use Value.Is you pass into it the data structure/value you want to test, and the type you want to check against. So let's add a new custom column and check for lists with this code, and if the value in the Data column is a list, extract the Name from position 1 in that list

check if value is list

To check for records I can add an else if clause and if a record is encountered, use Record.Field to get the project manager's name from the Name field.

check for lists and records

Giving this table with the names extracted from both the lists and the records.

final table

If neither a list nor a record is found, then an empty string is produced. I haven't used try .. otherwise here as I know the structures of the lists and records won't cause an error, but you can use try .. otherwise if you think you might get errors.

20 thoughts on “Extracting Data from Nested Lists and Records in Power Query”

  1. Hi Phil,

    Thank you so much for this article. It has been incredibly helpful. But my data is one layer more complex. I have a column which are values of type List. Each List has one or two values of type Record. But the records may be in any order. So I am not able to pull the value I need using Record.Field( [fieldName]{0} ) as the field I need may be index {1}. How do I select a specific Record from the list based on one of the record’s field values? So my data looks something like below. And I only want to select the Records where fieldA = “Region” and return the value of fieldB (which is actually a List type, but that should be easily done if I can select the right record).

    LIST
    Record(0)
    fieldA = “Region”
    fieldB = “EU”
    Record(1)
    fieldA = “Grade”
    fieldB = “A”
    LIST
    Record(0)
    fieldA = “Grade”
    fieldB = “B”
    Record(1)
    fieldA = “Region”
    fieldB = “US”

    I hope I explained this clearly. Thanks!

    Reply
    • Hi Alex,

      If your column is a bunch of lists that contain records, expand the lists to new rows so you’ll get each record in its own row.

      You can then add a custom column with this code

      = if Record.Field([Data], “fieldA”) = “Region” then Record.Field([Data], “fieldB”) else null

      then remove any rows with null. This should leave you with the data you want extracted.

      Download this example file to see how I did it.

      Regards

      Phil

      Reply
      • Thank you!!!! This was driving me crazy as I didn’t want to expand the List values as it created multiple rows for the existing records. The key was “remove any rows with null”! I tip/trick I hadn’t learned yet. Thanks so much for your help Phil!

        Alex

        Reply
  2. Thank you for this step-by-step guidance. It truly helped me to sort out my issue and understand more the background of MO tools! Powerful tools! Thanks a lot!

    Reply
  3. Hi Mynda,
    Thanks for the presentation on extracting List by Phillip, I was trying to download the ample file but could no did not see the rightfile attache on “Extracting Data from Nested Lists and Records in Power Query”.
    Please send me the right file.
    Kind regards
    Michael

    Reply
    • Hi Michael,

      The queries are in the file. You just need to go to the Data tab > Queries & Connections to see the Queries and Connections pane with the queries listed.

      Mynda

      Reply
  4. Hi,

    Fantastic article.

    I have a similar problem only I need to have a running total against a customer order and by payment date…

    CustomerID PaymentDate Amount RT
    0001 01/01/2022 £10 £10
    0001 02/02/2022 £10 £20
    0001 03/02/2022 £10 £30
    0002 02/02/2022 £20 £20

    Can you help?

    Reply
  5. HI MYNDA,

    THE DOWNLOADED FILE DOESN’T CONTAIN THE EXAMPLE DATA
    FOR THIS CASE: ‘Extracting Data from Nested Lists and Records in Power Query’

    BEST REGARDS FROM MEXICO
    MIGUEL PONCE

    Reply
    • Hi Miguel,

      There are no separate source data files, the data is created in the first 2 steps. Click the cog/gear icon beside the Source step to see the table used to create the first column. You can change this table if you like. This step was created in Power BI using ‘Enter Data’ in the Ribbon, then I copied it over to Excel. The 2nd step creates the 2nd column.

      Regards

      Phil

      Reply
  6. I use the articles of this site a lot and I translate them into Persian for my compatriots. Thank you to the administrators of the site and all the users who participate in answering the questions.

    Reply
  7. I’d love to try the tutorial, but cannot find the json file that serves as a source, is there a link for that

    Reply
    • Hi Jorge,

      The sample source data is created in the first coupe of steps of the queries. I’ve hand coded it so there are no separate source files to worry about.

      Regards

      Phil

      Reply
  8. Your tutorials on PQ List. functions have been very helpful. Thank you. — Tutorial suggestion: expand a (multi-level) Bill of Materials using List.Generate. — I’m struggling mightily to wrap my head around this. My goal is to construct a table of multi-level BOM records from (1) a top table of Assembly part numbers & quantities & (2) a BOM table (of Assemblies, Components, & quantities) in an Access back-end database. (Assemblies call out other Assemblies &/or Components.) The processing logic would be (1) start with the top table, (2) extract BOM records from Access where Assembly part numbers match the top list, (3) read Component part numbers in the extracted BOM records, extract BOM records from Access where BOM Assembly matches Iteration Component, and (4) iterate/loop thru that logic until there are no extracted Components that match BOM Assemblies. My understanding (perhaps incorrect) is that List.Generate is the most performant method to execute loops. I’m also exploring List.Generate because of its ability to return lists/records/tables(?) which could be combined into a single output table when it finishes looping.

    Reply
    • Thank you Jim.

      Could you please start a topic on the forum and attach an example of a BOM you need to work with and an example of the final result you are looking for.

      Regards

      Phil

      Reply

Leave a Comment

Current ye@r *