Forum

Combine files in fo...
 
Notifications
Clear all

Combine files in folder painfully slow

4 Posts
3 Users
0 Reactions
95 Views
(@baxbax)
Posts: 125
Estimable Member
Topic starter
 

Hi,

There have been lots of posts in the forum about this so apologies for going over old ground but I have pretty much tried most of the suggestions and my refresh is interminably slow.

I am trying to combine the data from the Excel files in a folder. The files are .xlsb files. There are 10 files in the folder. Each file has one sheet/data range. Each sheet has approx 100k lines. The layout of the files is exactly the same. There is not a lot of additional processing done. I think some of the field types are changed. The way is is done at the moment uses a sample query.

It is currently taking about an hour to return the data to an Excel sheet.

Any suggestions?

Thanks

 

Bax

 
Posted : 16/03/2022 10:32 am
(@mynda)
Posts: 4761
Member Admin
 

Hi Bax,

The quickest file type to import is a .csv or .txt file. Seeing that you only have one sheet in each file, why not change the file types to text or csv?

Mynda

 
Posted : 16/03/2022 9:57 pm
(@adam-bender)
Posts: 5
Active Member
 

If you're using the "Combine Files" default function in PQ it's going to be slow and inflexible.

I prefer to add a custom column that extracts the binary data into a cell, which can then be expanded.

 

Take a look at the steps in blue below.  "Excel.Workbook([Content]) is the part that takes data from a Folder.Files list and expands that data within a cell.

=============================================

 

let
    Source = Folder.Files(input_local_folder),
    #"Filtered Rows" = Table.SelectRows(Source, each [Extension] = ".xlsx"),
    #"Removed Other Columns" = Table.SelectColumns(#"Filtered Rows",{"Content"}),
    #"Added Binary Data Column" = Table.AddColumn(#"Removed Other Columns", "binaryData", each Excel.Workbook([Content])),
    #"Removed Other Columns1" = Table.SelectColumns(#"Added Binary Data Column",{"binaryData"}),
    #"Expanded binaryData" = Table.ExpandTableColumn(#"Removed Other Columns1", "binaryData", {"Name", "Data", "Item", "Kind", "Hidden"}, {"Name", "Data", "Item", "Kind", "Hidden"}),
    .
    .
    .
    .

in
    #"Filtered Rows2"

 
Posted : 14/06/2022 10:36 am
(@baxbax)
Posts: 125
Estimable Member
Topic starter
 

Hi Adam,

Thanks for the response. I will give this a try and come back with the results.

 

Regards

 

Bax

 
Posted : 18/06/2022 4:49 am
Share: