Forum

How to create a sin...
 
Notifications
Clear all

How to create a single table from 100's of pdf's?

3 Posts
2 Users
0 Reactions
111 Views
(@wab)
Posts: 4
Active Member
Topic starter
 

Hi,

I have 100's of pdf's like the one attached. I would like to be able to import them all and reconfigure into a table like the one I have also attached. I have tried it with PowerQuery but I don't know enough to make it work.

I am running Office 365 on Windows 11.

Your assistance would be greatly appreciated.

Wayne

 
Posted : 13/06/2023 10:37 am
(@catalinb)
Posts: 1937
Member Admin
 

Hi Wayne,

You should use Get Data> From File> From Folder, then you can filter for a specific file extension-.pdf in your case.

Example attached, change the folder path from Path sheet to your pdf folder path.

sample queries:

let
FolderPath = Excel.CurrentWorkbook(){[Name="FolderPath"]}[Content]{0}[Folder Path],
Source = Folder.Files(FolderPath),
#"Filtered Rows" = Table.SelectRows(Source, each ([Extension] = ".pdf")),
#"Added Custom" = Table.AddColumn(#"Filtered Rows", "Custom", each GetPDFData([Content])),
#"Removed Other Columns" = Table.SelectColumns(#"Added Custom",{"Custom"}),
ColNames = Table.ColumnNames(Table.Combine(#"Removed Other Columns"[Custom])),
#"Expanded Custom" = Table.ExpandTableColumn(#"Removed Other Columns", "Custom", ColNames, ColNames)
in
#"Expanded Custom"

Code for the function that reads data from a single pdf file GetPDFData:

(bin as binary)=>
let
#"Imported PDF" = Pdf.Tables(bin),
#"Filtered Rows1" = Table.SelectRows(#"Imported PDF", each ([Kind] = "Table")),
#"Removed Other Columns" = Table.SelectColumns(#"Filtered Rows1",{"Data"}),
ColNames = Table.ColumnNames(Table.Combine(#"Removed Other Columns"[Data])),
#"Expanded Data" = Table.ExpandTableColumn(#"Removed Other Columns", "Data", ColNames, ColNames)
in
#"Expanded Data"

 
Posted : 16/06/2023 3:02 am
(@wab)
Posts: 4
Active Member
Topic starter
 

Thank you, Catalin.  I will let you know if it works as soon as I get a chance to try it.

 
Posted : 16/06/2023 10:01 am
Share: