Forum

Get Data from the f...
 
Notifications
Clear all

Get Data from the folder with XMLfiles

9 Posts
2 Users
0 Reactions
100 Views
(@ayemu7)
Posts: 197
Reputable Member
Topic starter
 

Hi Team,

I tried to combine and load the folder of XML files to Excel then clean in power query.

While some files can be expending properly, some other files behave to have a different columns.

All those files are exact same format.

Please check and let me know.

Please see attached for reference.

In the file, agedCareServiceNumber and agedCareServiceNumber.1 should be in the column.

We should not have ".1" columns for Name, claimMonth etc.

Name.1 column should not be there too.

Please let me know what would be the best way to bring those columns in one while uploading XML files.

Thank you.

Regards,

Aye

 
Posted : 19/04/2020 4:36 am
(@catalinb)
Posts: 1937
Member Admin
 

Hi Aye,

Without seeing how are you processing the files, it's impossible to provide an advice.

Upload your query.

 
Posted : 19/04/2020 12:43 pm
(@ayemu7)
Posts: 197
Reputable Member
Topic starter
 

Thanks for reply, Catalin.

The original XML have sensitive information and I am not sure how to delete those information on XML and save them.

Is there any thing I can download which can modify XML file?

Thank you.

Regards,

Aye

 
Posted : 19/04/2020 8:29 pm
(@catalinb)
Posts: 1937
Member Admin
 

the xml is a text file, any text editor will open it.

Catalin Bombea said
Hi Aye,

Without seeing how are you processing the files, it's impossible to provide an advice.

Upload your query.  

I asked to see how you are processing the files, this means you have to upload the query you are using, not the files.

 
Posted : 20/04/2020 3:56 am
(@ayemu7)
Posts: 197
Reputable Member
Topic starter
 

Thanks, Catalin,

I now attached script from advance editor. 

After those steps, XML columns are splited into different format. Highlighted XML files '1001 16 payment statement 022020.XML' and '1003 54 payment statement 022020.XML' are correct format. however others appear in different columns.

Name.1 for some XML files represent columns name from correct format.

Please see attached for reference.

Thank you.

Regards,

Aye

 
Posted : 20/04/2020 9:45 am
(@catalinb)
Posts: 1937
Member Admin
 

From the query, I guess that you are using the automatic combiner, which is not a good idea most of the times. I' sure you have received this advice before, so I wonder why you expect the automatic file combiner to be perfect 🙂

The answer is in your files, you have different structures, so you have to find what's different in those files and process them differently.

"Name.1" is there because it's hard typed in the query by the automatic combiner and also because the "Name" column already exists in the second column. (normally, if you build the transformation steps yourself, you should remove the "Name" column before expanding the "Name" column from table column content)

Don't rely on automatic steps, it will never read minds or fulfill wishes 🙂 . If you want things to work exactly as you want, you have to build the transformations yourself, from the beginning to the end.

Compare the xml files from rows 1 and 2, they will tell you what's the difference. If you still cannot find the solution, please prepare and upload sample xml files with both structures you have in the first 2 rows, of course, after you anonymize the data using a text editor, only a few rows of data will be enough.

 
Posted : 20/04/2020 10:51 am
(@ayemu7)
Posts: 197
Reputable Member
Topic starter
 

Hi Catalin,

Thank you.

Yes, I used automatic file combiner and expected to be perfect :).

What do you mean by I have to build transformations myself. Should I clean the data before upload?

I have about 100 files to upload every month and if the files are not in the same format, what would be my other option?

please see first 3 lines of XML files which combined differently in PQ.

Regards,

Aye

 
Posted : 20/04/2020 10:55 pm
(@catalinb)
Posts: 1937
Member Admin
 

Hi Aye,

The recipe is simple:

1. Process a single xml file (Get Data From xml file), the query Source step should look like:

Source = Xml.Tables(File.Contents("E:OneDrive...filename.xml"))

Perform all needed transformations: expand columns, other transformations

2. Convert this query to a function

To do this, you have to open the Advanced Editor.

Above the very first "let" statement, add a new row and put this:

(FilePath)=>

Go to the first step, Source, replace the path with the variable parameter:

Source = Xml.Tables(File.Contents(FilePath))

close Advanced Editor and save, make sure that this query name has a relevant name: MyNewFunction for example.

3. Start a new query, with "Get Data From Folder" option.

Simply add a new column, with this formula: =MyNewFunction([Folder Path]&[Name])

(This way, you will pass to the new function the path to each file.)

Select this new column and remove all other columns. This new column should have a table in each row. If there is an error in a specific file transformation, you will see "Error" instead of "Table" in that new column.

Expand this column, you should be able to see all your files data.

 
Posted : 21/04/2020 12:50 am
(@ayemu7)
Posts: 197
Reputable Member
Topic starter
 

Many Thanks, Catalin,

I will try and keep you posted.

Regards,

Aye

 
Posted : 21/04/2020 7:20 pm
Share: