Power Query: loading multiple PDF files from a folder

 Gary
(@garrus)
Posts: 8
Active Member
Topic starter
 

Has anyone worked out how to combine and transform multiple PDF files in a folder? The issue is that the PDFs have many columns and inconsistent data layouts, so the transformation built from the sample file can't handle the differing layouts. Any ideas out there, please?

 
Posted : 12/09/2022 6:58 pm
(@jstewart)
Posts: 216
Estimable Member
 

What do you mean by differing layouts? Different column header names? A different number of columns? Different information? Could it be fixed with something as simple as a mapping table that corrects the differing information so it all matches?

 
Posted : 13/09/2022 1:00 pm
 Gary
(@garrus)
Posts: 8
Active Member
Topic starter
 

A different number of columns. System-generated PDFs produce inconsistent results, which makes it very difficult to aggregate many PDFs from a folder.

 
Posted : 13/09/2022 7:52 pm
(@jycccwjc)
Posts: 64
Estimable Member
 

I have done this before with bank statements. Without reviewing the files, it is hard to say how to do it, as it involves various steps.

 
Posted : 13/09/2022 8:45 pm
 Gary
(@garrus)
Posts: 8
Active Member
Topic starter
 

Jim, do you know of any books or websites that could help here? PQ works pretty well for individual PDFs but gets too complicated when there are multiple PDFs in a folder. I haven't been able to find relevant guidance on the web.

 
Posted : 14/09/2022 4:46 pm
Philip Treacy
(@philipt)
Posts: 1629
Member Admin
 

Hi Gary,

The issue isn't that there are multiple PDFs, it's that the structure of the PDFs differs. You'd have the same issue with workbooks, tables, CSV files etc. if they contained different numbers of columns. PQ isn't designed to combine data that has an inconsistent structure like this.

You said yourself that the PDFs present data inconsistently. The fix is to get the data in those PDFs structured in the same way: same number of columns, same column names.
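
As a rough illustration only, one way to force every table into the same shape before combining might look like the M sketch below. It is not a tested solution for your files. The folder path, the rename pairs and the list of standard column names are all hypothetical placeholders, and it assumes the first row of each extracted table holds the headers.

```
let
    // Assumption: hypothetical folder path, replace with your own
    Source = Folder.Files("C:\Statements"),
    PdfFiles = Table.SelectRows(Source, each Text.Lower([Extension]) = ".pdf"),

    // Assumption: hypothetical mapping of {variant name, standard name} pairs
    Renames = {{"Amt", "Amount"}, {"Details", "Description"}},
    // Assumption: the standard columns every table should end up with
    Standard = {"Date", "Description", "Amount"},

    // Normalise one extracted table: promote headers, rename known variants,
    // then keep only the standard columns (missing ones are filled with null)
    Normalise = (t as table) as table =>
        let
            Promoted = Table.PromoteHeaders(t, [PromoteAllScalars = true]),
            Renamed = Table.RenameColumns(Promoted, Renames, MissingField.Ignore),
            Shaped = Table.SelectColumns(Renamed, Standard, MissingField.UseNull)
        in
            Shaped,

    // Extract the tables from each PDF and normalise every one of them
    PerFile = Table.AddColumn(
        PdfFiles,
        "Data",
        each
            let
                Tables = Table.SelectRows(Pdf.Tables([Content]), each [Kind] = "Table")
            in
                Table.Combine(List.Transform(Tables[Data], Normalise))
    ),

    // Every per-file table now has the same columns, so they stack cleanly
    Combined = Table.Combine(PerFile[Data])
in
    Combined
```

The point of the sketch is that the reshaping happens per table, before anything is appended. `MissingField.Ignore` skips renames for columns a given table doesn't have, and `MissingField.UseNull` adds any missing standard column as nulls. Real PDFs will usually need extra steps, such as removing header or footer rows or handling tables that split across pages.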

regards

Phil

 
Posted : 15/09/2022 2:13 am