Conditional Formatting & Deleting Duplicates

7 Posts
7 Users
0 Reactions
263 Views
(@Anonymous)
Posts: 0
New Member Guest
 

I am responsible for running a number of reports out of an AS-400 type system that houses our customer data.  It is NOT user friendly: I have to run 3 to 4 different reports, export them into Excel, and then combine them so that the data that is needed is on a single spreadsheet.

I believe there is a much quicker solution than what I am currently doing, but I'm not sure how to do it.  I first need to identify duplicate account numbers (a duplicate indicates multiple services).  I use the conditional formatting button to highlight duplicates.  I then go back over the highlighted cells and "fill them" with a color so that when I ask for duplicates to be deleted I still know which accounts have multiple services.

After I fill the cells (there can be as many as 50k rows), but before I click on remove duplicates, I create a copy of the spreadsheet and create a pivot table to get the actual account total.  I then go back to the original spreadsheet, where I filled the cells manually, and click the icon to delete all duplicates.

Before I started filling the cells manually, clicking the icon to delete all duplicates made my conditional formatting disappear, as there were no more duplicates.

In one spreadsheet I need to show which accounts have duplicates (without actually keeping the duplicates on the final spreadsheet), along with the actual account total (which I put on the final spreadsheet using VLOOKUP and the pivot table).

Any and all suggestions are welcome...

 
Posted : 25/06/2016 10:26 am
(@pray4wisdom)
Posts: 1
New Member
 

This sounds like a job for a VBA macro.  (Unfortunately, I am just beginning to learn, so I don't have specific suggestions.)

 
Posted : 25/06/2016 11:10 am
(@jstephens)
Posts: 5
Active Member
 

It looks like you are completing multiple tasks to get a final result, correct?  And to understand completely, you are keeping two sheets, one with the complete dataset (duplicates and all) and the other with the duplicates already eliminated.  Is this correct?  Not sure how you need to present your results, but maybe you can eliminate some steps by doing the following:

In the sheet where you need to eliminate the duplicates, I don't think you need any conditional formatting.  You can simply press the "Remove Duplicates" icon (assuming the picture below shows what you are referring to as the "Remove Duplicates" icon).

Remove-Duplicates.jpg

You can then compare the results with your complete dataset.  If you need to show how many duplicates there are (or which account numbers came up as duplicates), you can simply run a pivot table on the complete dataset and add a count of values field.  Sort it from largest to smallest and anything with a count greater than 1 will be at the top of your results.  You have your list.
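For anyone who would rather script this, the count-then-deduplicate idea above can be sketched outside Excel as well. This is only an illustration in plain Python, not something from the original workflow: the function name `summarize_accounts`, the column names `account` and `service`, and the sample rows are all made up, and it assumes the exported report has already been read into a list of dictionaries.

```python
from collections import Counter


def summarize_accounts(rows, key="account"):
    """Return one row per account, with a service count and a duplicate flag.

    `rows` is a list of dicts; `key` is the (hypothetical) account column name.
    """
    # Count how many times each account number appears (the pivot-table step).
    counts = Counter(row[key] for row in rows)

    seen = set()
    summary = []
    for row in rows:
        acct = row[key]
        if acct in seen:
            continue  # keep only the first occurrence of each account
        seen.add(acct)
        summary.append(
            {
                **row,
                "service_count": counts[acct],
                "has_multiple": counts[acct] > 1,
            }
        )

    # Largest counts first, mirroring the "sort largest to smallest" step.
    summary.sort(key=lambda r: r["service_count"], reverse=True)
    return summary


# Made-up sample data standing in for the combined export.
rows = [
    {"account": "1001", "service": "water"},
    {"account": "1002", "service": "gas"},
    {"account": "1001", "service": "sewer"},
    {"account": "1003", "service": "water"},
]
result = summarize_accounts(rows)
for r in result:
    print(r["account"], r["service_count"], r["has_multiple"])
```

The point of the sketch is that the duplicate flag is stored as data next to each remaining row, so it survives the removal of the duplicates, which is what the manual cell fill was doing.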

Does that help?

 
Posted : 25/06/2016 11:15 am
(@bigroo)
Posts: 16
Eminent Member
 

Hi Jennifer Silva,

If you wish to highlight the account numbers that are duplicate, I have used this formula successfully.

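Add this formula to a helper column: =MATCH(C2,$C$1:$C$5,0)=ROW() (extend the range to cover your data, and keep it starting in row 1 so that the match position lines up with the row number). It evaluates to TRUE for the first occurrence of an item and FALSE for subsequent occurrences.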
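To make the logic of that helper column concrete, here is a rough equivalent in plain Python. It is a sketch only: the function name `first_occurrence_flags` and the sample account numbers are invented for illustration, and it tracks values already seen instead of calling anything like MATCH.

```python
def first_occurrence_flags(values):
    """Return True for the first time each value appears, False afterwards."""
    seen = set()
    flags = []
    for value in values:
        flags.append(value not in seen)  # True only on the first sighting
        seen.add(value)
    return flags


# Made-up account numbers standing in for column C.
accounts = ["1001", "1002", "1001", "1003", "1002"]
flags = first_occurrence_flags(accounts)
print(list(zip(accounts, flags)))
```

Rows flagged False are the ones a duplicate removal would drop, so filtering on the flag gives the deduplicated list without deleting anything from the source data.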

Works nicely!

 
Posted : 25/06/2016 4:25 pm
(@fra68ve)
Posts: 1
New Member
 

Hi Jennifer,

What you are doing with removing duplicates and keeping the total number of accounts sounds like something that could easily be done with Power Query...

It is worth having a look at that...

 

Regards

Franz

 
Posted : 27/06/2016 7:40 am
(@rufus46)
Posts: 2
New Member
 

I had a similar situation with a huge number of lines and columns of irrelevant data. I would spend about 40 minutes each time I had to knock off a report because (of course) the data kept changing.

I used to sort all the data and then extract it by cutting and pasting into another report. It was a huge time consumer.

I decided one day that there had to be a way I could do it using macros. It took me quite a while to get them to do exactly what I wanted but after a while I had something that reduced the process to 10 minutes. Because there were so many lines of macros I knew I had steps that were unnecessary in the code.

After a few months of using it I decided to rewrite my code. In the time I'd been watching it process stuff I also realized I could do things faster by using different approaches.

My rewrite could process the same information in (believe it or not!) 10 seconds.

I would urge you to consider using macros to do what you want. Mine is probably a programmer's 'dog's breakfast' but it works. I no longer have to worry about whether I have enough time to knock off the report I'm after.

Another thing I would suggest is that elimination of data is always a bit dangerous - try extracting what you want into a separate sheet and processing it there.

 
Posted : 27/06/2016 12:57 pm
(@mynda)
Posts: 4761
Member Admin
 

Power Query is the new tool for cleaning data. Much better than a VBA approach IMO.

 
Posted : 28/06/2016 2:03 am