Forum

Notifications
Clear all

Remove Duplication

10 Posts
4 Users
0 Reactions
63 Views
(@bill-jone)
Posts: 28
Eminent Member
Topic starter
 

Hi,

I have Codes exceed 4Million and I want to remove the duplication so I put it in 4 columns due the excel have count limitation to be added in one column.
So can help me to remove the duplication from all columns at one time.

Thanks;

Bill

 
Posted : 01/10/2020 3:22 pm
(@mynda)
Posts: 4761
Member Admin
 

Hi Bill,

Why don't you use Power Query to do this for you? Presumably the data is in a CSV or Text file in one column. You can use Power Query to get the data from the CSV file:

1. Excel 2010 & 2013 go to the Power Query tab/Excel 2016 onward go to the Data tab 

2. Get Data from Text/CSV

3. Browse to the file location and import it.

4. In the Power Query editor window's the Home tab > Remove Rows > Remove Duplicates > Close & Load.

Mynda

 
Posted : 01/10/2020 7:03 pm
(@bill-jone)
Posts: 28
Eminent Member
Topic starter
 

Power-Query-1.pngHi Mynda,

Thank you very much for your fast reply.

Could you please help me about this, I'm using Excel 2010 and I can't find Power Query tab, please check attached image and let me know how I can get it.

Thanks;

Bill

 
Posted : 02/10/2020 5:14 am
(@mynda)
Posts: 4761
Member Admin
 

Hi Bill,

In Excel 2010 Power Query is a free add-in that you can download here: https://www.microsoft.com/en-us/download/details.aspx?id=39379

When installed you'll have a dedicated Power Query tab on the ribbon.

Mynda

 
Posted : 02/10/2020 6:07 am
(@bill-jone)
Posts: 28
Eminent Member
Topic starter
 

Hi Mynda,

Please accept my gratitude. Your support means a lot to me!.

I got Power Query and I got how to remove the duplication as you see in attached image but I still need your help to know how I can download or save in text file  after the duplication data removed.

Excel.png

 

Thanks;

Bill 

 
Posted : 10/10/2020 12:05 pm
Philip Treacy
(@philipt)
Posts: 1629
Member Admin
 

Hi Bill,

When your data is ready click on Close & Load in the top left of the query editor. NOTE click on the text Close & Load, not the icon.

From the sub-menu click on Close & Load To.  Then choose to load to a table on an existing or new worksheet.

If you want to save the data out as text from there you can do so.

Regards

Phil

 
Posted : 27/10/2020 12:03 am
(@bill-jone)
Posts: 28
Eminent Member
Topic starter
 

Deep thanks Philip.

I did this but when I save data as text, I see after I save it as text, the count not correct.

It is supposed the count is 3,407,267 but the file text contain1,048,576. please check attached images.

 

Counts.png2020-10-28-13_31_57-Window.png2020-10-28-13_46_40-Window-1.png

 

Thanks;

Bill

 
Posted : 29/10/2020 7:44 am
Philip Treacy
(@philipt)
Posts: 1629
Member Admin
 

Hi Bill,

How do you know the count is wrong?

Can you please share the Excel workbook and the source data file(s) so I can check.

Thanks

Phil

 
Posted : 29/10/2020 10:38 pm
(@catalinb)
Posts: 1937
Member Admin
 

Hi Bill,

1048576 is the maximum number of rows in excel.

Obviously, we cannot add in a sheet more that this number of records, so we have to find an alternative.

Here i what you can do:
1. Add a line in the query to split the table in 1.000.000 records buckets.

split=Table.Split(#"PreviousStepName",1000000),

2. Convert this list to table:

#"Converted to Table" = Table.FromList(split, Splitter.SplitByNothing(), null, null, ExtraValues.Error),

3. Add an index column, starting from 1, step 1.

#"Added Index" = Table.AddIndexColumn(#"Converted to Table", "Index", 1, 1, Int64.Type),

4. Expand Column1. This action will regenerate the initial data, but will have now a paging column. If the data has over 3 million rows, you will have in that column numbers from 1 to 4, each page will have 1.000.000 records, as we set in step 1.

5. Instead of Load To a table in a worksheet, choose to load to a Pivot Table Report.

6. In this pivot table, add a slicer for the column we added for paging. This will allow you to see data in pages, 1 million at a time.
You can also show report pages, that will create 1 sheet for each page.

 
Posted : 29/10/2020 11:55 pm
(@catalinb)
Posts: 1937
Member Admin
 

Example attached.

You can set the csv file path and group size to split into as many pages you need.

https://1drv.ms/x/s!AjfS33R8yoG9ksIBel-Zcg96s9GZCA?e=IQF9dC

File size is too large, choose the Open In Desktop App option.

 
Posted : 30/10/2020 12:27 am
Share: