Hello all
The data I download dumps it all in an Excel worksheet (no tables or named ranges) very much like the data Mynda receives for the web traffic to her website (see video 4.03). As I added data to folders, Power Query started to throw errors which I couldn't understand then I realised that the data starts on different rows depending on how much data has been downloaded e.g. data set 1 had 10000 rows of data and would begin with the column headers on row 9 but on smaller data sets the column headers would begin on row 8. I was using the 'remove rows function' on the sample file to remove the first 8 rows and promoting the header but then this wasn't working when the data layout was different as outlined above.
Is there anyway to adapt the transformation/combine process to get around this problem? The common theme with the data I am downloading is that they have the following row headings (I've inserted the excel cell references for ease of reference): -
A1 Screen Data Export
A2 Option
A3 Username
A4 Date
A5 Time
The report details are then listed in column B. Finally, the title of the report begins at A8 on most reports but this is the variable aspect of the report. It might be at A7 (there doesn't seem to be any pattern to why it varies apart from number of transactions in the data). The actual column names and the data I want are always below this.
I'm wondering if instead of using the 'remove rows function', I could filter the data in the sample file to get rid of the stuff I don't need but I'm not sure.
Thank you
Tim
Hi Tim,
It's tricky to say without seeing your file, but as I read your message I thought of trying remove blanks and then promote headers instead of removing n number of rows.
If that doesn't work, or creates knock on problem, please share some sample files that illustrate your data.
Mynda
Hi Mynda
Thanks for your response. Please find attached an upload that should make it a bit clearer. The excel workbook has two tabs, one is a download by transaction and the other is by GL code. In fact, I think I've discovered the issue: it seems that the transaction data columns begin on row 8 and the GL data columns begin on row 9. I copied a query I had created for GL code data and used it for transaction data and that's why I was getting problems. That being said, I'm still a little hesitant to use the remove rows function now in case this pattern for the transaction data and GL code data is not uniform. Would you suggest another way that future proofs my queries just in case the point at which the columns begin are not the same?
Thanks again
Tim
Hi Tim,
Trying to allow for a variable number of rows at the top is not a trivial exercise. However, I think it's highly likely that the reports will consistently start on the same rows because the header data (option, username, date, time) appear to be fixed fields, as opposed to variable transaction type data that might change from time to time.
I wouldn't worry about allowing for a variable number of rows. Of course as with any report, you'll have your checks and balances in place to ensure if for some reason the data isn't imported correctly you'll know about it.
Mynda
Hi Mynda
Thanks for your response. As I said in my previous message, I think the issue has arisen due to different reports dumping the data on a slightly different row. I'd copied a query for GL code data for the transaction data which was causing the problem. Now that I'm aware of this, I think I should be ok.
Thanks again
Tim