Notifications

Clear all

Duplicate data – how to keep the second entry?

Power Query

Last Post by Ironcladrooster 7 years ago

5 Posts

2 Users

0 Reactions

294 Views

RSS

Ironcladrooster

(@nondricek)

Posts: 4

Active Member

Topic starter

I’ve got a big report that comes out of my database which I’m try cull down and extract just a few values from. The report is set up to provide one year’s worth of data on each tab. The report includes things like beginning value, new capital inflows, % ownerships, and ending value for each month of the year. I just want to extract the final ending value for each month. The tricky thing is that the report oftentimes (but not always) lists two ending values for each month – a preliminary ending value which doesn’t account for some one-off cash flows, and a final, correct, ending value which does. I went ahead and loaded it in power query and shaped it so all I have left are the ending values for each month, but I’ve got dupes because I can’t figure out how to differentiate between preliminary and final ending values for each month. The final ending value always appears later in the list (so farther to the right in the raw data, or further down the list in my transposed power query data set), so I can use that to help, but the data set skips around so I can’t map it out based on “Is Even” or “Is Odd”. I think I’m looking for a way to remove duplicates but keep the last duplicate value in the set instead of the default of keeping the first duplicate value in the set. That would make sure I keep the final ending value – but then again there’s probably an even better way which I’m not even thinking of.

Attached is my file:

Tab #1 – raw data from my system for year #1 (yellow highlights indicate which data I need to keep)

Tab #2 – raw data from my system for year #2 (yellow highlights indicate which data I need to keep)

Note – there are a few minor differences between Tab #1 & Tab #2 in terms of the header order

Tab #3 – my desired final output

Tab #4 – my attempt at shaping the data in power query (it doesn’t quite get me all the way there, but it’s very close!)

Note – All the numbers are dummy data, so if the individual line items don’t tie to the totals out then that’s why.

Any thoughts or suggestions would be greatly appreciated. Thanks in advance!

Posted : 19/10/2018 7:18 pm

Mynda Treacy

(@mynda)

Posts: 4766

Member Admin

Hi,

Please see the attached file, query 'Mynda' with table on 'Desired Final Output' sheet. Note: it relies on there always being a preliminary and final ending value.

Mynda

Posted : 20/10/2018 2:59 am

Ironcladrooster

(@nondricek)

Posts: 4

Active Member

Topic starter

Hey Mynda - thanks for your help! This gets me 99% of the way there. Unfortunately that methodology cuts off the last two months of the "What I need #2 tab". November and December of those months don't have preliminary month end data, they only have final. Maybe it would be possible to insert an intermediate step in the query that counts the number of unique "Greek Letter-Date" pairs? If the pair has more than one hit then there's a dupe and we need to keep the second instance (with final data), but if there's only a single entry then we keep it instead? I'm just spitballing.

Thanks for your help.

Posted : 20/10/2018 9:58 am

Mynda Treacy

(@mynda)

Posts: 4766

Member Admin

In the 'Expanded Index' step it counts the instances of the 'Greek Letter-Date" pairs. I simply filtered out the 1st instances.

In the attached file, query Mynda (2), I changed the process slightly so it checked it there wasn't a value for the second record then it keeps the first record.

Mynda

Posted : 21/10/2018 7:21 am

Ironcladrooster

(@nondricek)

Posts: 4

Active Member

Topic starter

Perfect! Thanks so much!

Posted : 22/10/2018 5:53 pm

26 Forums
7,297 Topics
32 K Posts
16 Online
35.5 K Members

Forum Icons: Forum contains no unread posts Forum contains unread posts

Topic Icons: Not Replied Replied Active Hot Sticky Unapproved Solved Private Closed

Duplicate data – how to keep the second entry?

Super Globals

Options and Features