Forum

Power Query - Add n...
 
Notifications
Clear all

Power Query - Add new data and modify existing

10 Posts
2 Users
0 Reactions
175 Views
(@teekayjunior)
Posts: 16
Eminent Member
Topic starter
 

I have a workbook to track collection activity for our delinquent accounts.  It will be updated weekly with new data. Here is the general flow:rnrn Start with a worksheet that contains all severely delinquent invoices and several data fields from our billing platformrnAdd or modify various fields to reflect collection activities, status, comments, and the likernAdd new data on a weekly basis while maintaining the modifications from the previous steprnRepeat step 2rnRepeat step 3rnWash, rinse, and repeat each weekrnrnIt's the handling of the changes after the first append that has me stuck.  From week 2 forward, I don't want to append the new data to the original worksheet. Each week  I want to append the new data to the modified version of the previous week's appended query.  The accumulated history to which I want to append new data each week is not static. It's a bit of a moving target. rnI hope that makes sense.  Any help is greatly appreciated.  

 
Posted : 14/09/2023 2:05 pm
Riny van Eekelen
(@riny)
Posts: 1200
Member Moderator
 

I believe you need to use a self-refencing query. The very first time, you connect to the data, do the basic transformations (if needed) and load the table back to Excel. Then you add comments to that PQ generated table.

Next time you connect to the new data AND the existing PQ table with the added comments and merge the two based on unique identifier (probably the invoice number). Both steps should be built into one query. Hence, a self-referencing query. 

If you can share a file with anonymised data for week 1, comments to be added and data for week 2 with etc. Then it's probably not all that difficult to help you on your way.

 
Posted : 15/09/2023 1:26 am
(@teekayjunior)
Posts: 16
Eminent Member
Topic starter
 

Thank you, Riny!  A sample file is attached. You can ignore the TK Test and Current worksheets.  The relevant worksheets are Current Data, New Data, and Updated Data.

Current Data is ground zero.  The file is updated weekly.  The source for this file provides columns D-G.  The remaining columns are the ones that will capture comments, status, relevant dates etc.  

Each week, New Data is brought in.  Some of that info is duplicative (except for the updates).  The files are appended in PQ and the duplicate records are eliminated, leaving the previous data, including updates) and the new records.  The new records are then updated and the process repeats weekly. 

I would consider my proficiency with PQ to be slightly better than average.  In other words, I know enough to be dangerous. You won't hurt my feelings with any critique of what I've done to this point!

Many thanks for your help.

 
Posted : 15/09/2023 9:50 am
Riny van Eekelen
(@riny)
Posts: 1200
Member Moderator
 

Thanks for the file. I created a self referencing query, but before I could do that I simulated a 'first data set' based on the first four columns of your Current Data. Note that I cleaned it up the as the table contained two total rows. Furthermore, I noticed that Short_ID's ABC123 and EWL100 dealt with the same invoices. I removed ABC123 (that had no comments at all in the original data) and re-labeled EWL100 to ABC123.

 

In the attached workbook you find two new tabs (green shaded). tData is where you will dump new data in the blue table every week. Columns J:M contain the initial data I worked with.

Then I did what I described earlier. Connect to tData and load it into tData_2. That created only 4 columns. Then I added the extra columns as indicated. Next step was to dump the new data in tData and press refresh. Now tData_2 contains only the items from the latest data dump plus all the extra comments these had after the previous run.

finally, there are some invoices at the bottom of tData_2 marked red. These are duplicate invoices but with different Short_ID. I suspect these are similar to ABC123 and EWL100 and slipped through when you created this example data set.

See if you can get this to work with real data. If not come back here.

 
Posted : 16/09/2023 5:09 am
(@teekayjunior)
Posts: 16
Eminent Member
Topic starter
 

Thanks so much Riny!  I'll take a look and let you know.  Very much appreciated.

 
Posted : 19/09/2023 10:48 am
(@teekayjunior)
Posts: 16
Eminent Member
Topic starter
 

Riny,

Thanks again for helping me with this and for being patient with my modest PQ skills!  

The Updated Data worksheet reflects the following:

  1. Original data (first 5 columns)
  2.  Original data updated (remaining 11 columns)
  3. New Data added

My next step, and where I'm getting hung up, is what to do after updating the latest Updated Data worksheet ahead of the next week's addition of New Data.  I know it has something to do with your comments about a "self-referencing query."  I think I understand that phrase in theory but I definitely don't know what it means in practice.  I know, in effect, my new Updated Data needs to become the new Original Data (to maintain the updates before adding new data).  If there's an elegant way, I'm not seeing or understanding it.  Worst case (but not too bad, if it comes to this) is I could delete old Original Data and paste in the Updated Data.  

I hope my questions are making sense. Again, thanks for helping and bearing with me!

Tom

 
Posted : 19/09/2023 3:17 pm
Riny van Eekelen
(@riny)
Posts: 1200
Member Moderator
 

Difficult to explain, but perhaps it helps if you just think about Data (tData) and Annotated Data (tData_2). Forget about the other tables in your file.

The very first time you connect to the data (tData) and load it back to Excel (tData_2)

Then you annotate tData_2 by adding notes extra columns.

Next week you dump the updated data from your accounting system (the first five columns only) into tData and press Refresh All.

Now, PQ will take tData and merge it with tData_2 and load that back to tData_2. Leaving you with the items for the current week's list (=tData) and their notes from the old tData_2.

 
Posted : 20/09/2023 4:30 am
(@teekayjunior)
Posts: 16
Eminent Member
Topic starter
 

Riny, you're a gem for bearing with me! I may have messed up your queries, as my results are not retaining the annotations when I add new data to tData.  

I've added worksheets Data 1 and Data 2.  Data 1 is the original data before annotation.  Data 2 is the next week's data to be dropped in tData.

I cleared tData and tData_2 so I could start from scratch.  I added the Data 1 data to tData and refreshed all.  tData_2 correctly reflected the data.  I annotated tData_2 (just putting the text "Test" in several cells). I cleared tData of the original data and dropped in the new data from Data 2.  When I refreshed all, tData_2 properly reflected the new data, but without retaining the annotations.

Thanks again!!

 
Posted : 20/09/2023 12:01 pm
Riny van Eekelen
(@riny)
Posts: 1200
Member Moderator
 

Let's start from scratch on a clean file and with a slightly different approach. I've attached 4 files demonstration each step.

Step 1: Connect to tData and load to Excel. This creates table tData_2

Step 2: Add headers for the extra columns to tData_2 and add some comments.

Step 3: paste Data 2 into tData. Now, add two steps to the tData query. These steps will merge with and expand data from tData_2 as created in step 2. Refresh and note that the extra columns get duplicated. This happens only the first time you do this.

Step4: Delete these duplicated columns (at the far right). Update tData with Data 3 (added it just as an example). Refresh

Now you are all set for the next week!

 
Posted : 21/09/2023 3:21 am
(@teekayjunior)
Posts: 16
Eminent Member
Topic starter
 

Riny,

Your solution works superbly!!  I can't tell you how much I appreciate the effort and your patience with me.  

Kind regards,

TomSmile

 
Posted : 21/09/2023 4:11 pm
Share: