In this article we will learn about how we can remove the duplicates items in Excel.
- Reverse Remove Duplicates In Excel Spreadsheets
- How Do I Remove Duplicates In Excel
- Reverse Remove Duplicates In Excel
- How To Reverse Remove Duplicates In Excel
The Remove Duplicates feature lives on Excel's ribbon on the Data tab. Specifically, you'll find the Remove Duplicates feature in the Data Tools section of the ribbon. Once you find it, simply click on it to launch the wizard. The Remove Duplicates feature is on the Data tab of the Excel ribbon, in the Data Tools section.
If our data is having large no. of data with duplicate entries, manual deletion would be a very big task and annoyance.
Excel is having in-built option to remove the duplicate entries from data. Remove Duplicate option is under the Data tab in the Data Tools group in Microsoft Excel 2010 and 2013.
- Hi Everyone, i have data like below. I want to remove duplicate row 'Data level & District & Cust ID'. Data Level District Sales Value Customer id District level Bagpat 1037764.267 0 District level Bagpat 1037764.267 0 District level Bagpat 1037764.267 0 District level Bijnor 1037764.26.
- Remove Duplicates from Column C by comparing from Column B Which has date. As, if the duplication is occurred on next date it should not be counted as not a duplicate because the same data is entered in the next date. Example: Customer has Purchased Item A on 1/1/2010 Customer has purchase Item A on 1/1/2010.
- This example teaches you how to remove duplicates in Excel. Click any single cell inside the data set. On the Data tab, in the Data Tools group, click Remove Duplicates. The following dialog box appears. Leave all check boxes checked and click OK. Excel removes all identical rows (blue) except for the first identical row found.
- This page describes how to remove duplicate rows in Excel, using three different methods. If you want to remove duplicate cells (rather than entire rows of data), you may find the Remove Duplicate Cells page more straightforward. In order to illustrate how to remove duplicate rows in an Excel spreadsheet, we will use the example spreadsheet below, which has data spanning three columns.
Example
Let's take an example and understand how we can use the Remove Duplicate option.
We have data in range A2:E13. In which column A contains month, column B contains Agent name, column C contains Country name, column D contains sales and column E contains Quantity.
Data is having duplicate entries. To remove the duplicate entries from data follow below mentioned steps:-
- Go to Data Tab, click on Remove Duplicates in Data Tools group.
- Remove Duplicates dialog box will appear.
- In the column box select the columns header on which basis you want to remove duplicate entries.
- In this data we want to remove duplicate entries on the agent name, we just select the Agent.
- After select the criteria as per the requirement of data click on ok.
- You will get an information popup to inform you that how many duplicates entries were found and how many unique values are left now.
This is the way you can remove duplicate entries from the large no. of data and can get the data with unique records.
Alternate idea
When a list contains a lot of values and we needed to confirm the list contains duplicate values or not. For this we will make a combination of excel functions and use the formula to get the desired result. Use the SUMPRODUCT function & COUNTIF function.
SUMPRODUCT function is a mathematical function in Excel. It operates on multiple ranges. It multiplies the corresponding arrays and then adds them.
Syntax:
= SUMPRODUCT ( list1, [list2], … ) |
COUNTIF function of excel just counts the number of cells with a specific condition in a given range.
Syntax:
= COUNTIF ( list, condition ) |
Formation of the formula:
First we need to check every element of the list with every other element of the same list. Then we will set a limit on the count by subtracting 1 from each occurrence and returns the required result.
= SUMPRODUCT ( COUNTIF ( list, list ) -1 ) > 0 |
Example:
Let's understand this formula using in the example shown below.
For this example, We took some names as a list in excel as shown below.
And we need to find if there's a duplicate value or not.
Use the Formula
= SUMPRODUCT ( COUNTIF( A2:A14 , A2:A14 ) -1 ) > 0 |
Explanation :
COUNTIF function counts the occurrence of each value in the list and returns an array of numbers, where number represents their counts
{ 1 ; 1 ; 1 ; 2 ; 1 ; 1 ; 1 ; 1 ; 1 ; 1 ; 1 ; 1 ; 2 }
Subtract 1 from each value in the list.
{ 0 ; 0 ; 0 ; 1 ; 0 ; 0 ; 0 ; 0 ; 0 ; 0 ; 0 ; 0 ; 1 }
Now the SUMPRODUCT function adds up all the value and returns TRUE if the result is greater than 1 or else it returns FALSE.
In the above snapshot the formula used to get the duplicates in a list in excel.
Blank cells in the list, may cause the formula to generate incorrect results. To ignore blank cells in the list use the below formula.
Use the formula if the list contains blank cells
= SUMPRODUCT ( ( COUNTIF ( list , list ) - 1 ) * ( list < > ' ' ) ) > 0 |
Here are some observational notes shown below.
Notes:
- The formula only works with numbers and text both.
- You can apply the formula over any column of the table where condition can be applied.
- Careful working with fixed reference and absolute array reference.
- Operators like equals to ( = ), less than equal to ( <= ), greater than ( > ) or not equals to ( <> ) can be performed within functions applied with numbers only.
Hope this article about How to Remove Duplicates in Excel is explanatory. Find more articles on Count formulas and related Excel tools here. If you liked our blogs, share it with your friends on Facebook. And also you can follow us on Twitter and Facebook. We would love to hear from you, do let us know how we can improve, complement or innovate our work and make it better for you. Write to us at info@exceltip.com.
Related Articles
How to use the SUMPRODUCT function in Excel: Returns the SUM after multiplication of values in multiple arrays in excel. This function can be used to do multiple tasks. This is one of the most versatile functions.
COUNTIFS with Dynamic Criteria Range : To count with dynamic criteria range we simply use the INDIRECT function. This function can
COUNTIFS With OR For Multiple Criteria : Count cells having multiple criteria match using the OR function. To put an OR logic in COUNTIFS function you will not need to use the OR function.
Using the IF with AND / OR Functions in Microsoft Excel : These logical functions are used to carry out multiple criteria calculations. With IF the OR and AND functions are used to include or exclude matches.
How to use OR function in Microsoft Excel : The function is used to include all the TRUE values in multiple criterias.
How to Count Cells That Contain This Or That in Excel in Excel :To cells that contain this or that, we can use the SUMPRODUCT function. Here's how you do those calculations.
Popular Articles :
How to use the IF Function in Excel : The IF statement in Excel checks the condition and returns a specific value if the condition is TRUE or returns another specific value if FALSE.
How to use the VLOOKUP Function in Excel : This is one of the most used and popular functions of excel that is used to lookup value from different ranges and sheets.
How to Use SUMIF Function in Excel : This is another dashboard essential function. This helps you sum up values on specific conditions.
How to use the COUNTIF Function in Excel : Count values with conditions using this amazing function. You don't need to filter your data to count specific values. Countif function is essential to prepare your dashboard.
Duplicate values in your data can be a big problem! It can lead to substantial errors and over estimate your results.
But finding and removing them from your data is actually quite easy in Excel.
In this tutorial, we are going to look at 7 different methods to locate and remove duplicate values from your data.
Video Tutorial
What Is A Duplicate Value?
Duplicate values happen when the same value or set of values appear in your data.
For a given set of data you can define duplicates in many different ways.
In the above example, there is a simple set of data with 3 columns for the Make, Model and Year for a list of cars.
- The first image highlights all the duplicates based only on the Make of the car.
- The second image highlights all the duplicates based on the Make and Model of the car. This results in one less duplicate.
- The second image highlights all the duplicates based on all columns in the table. This results in even less values being considered duplicates.
The results from duplicates based on a single column vs the entire table can be very different. You should always be aware which version you want and what Excel is doing.
Find And Remove Duplicate Values With The Remove Duplicates Command
Removing duplicate values in data is a very common task. It's so common, there's a dedicated command to do it in the ribbon.
Select a cell inside the data which you want to remove duplicates from and go to the Data tab and click on the Remove Duplicates command.
Excel will then select the entire set of data and open up the Remove Duplicates window.
- You then need to tell Excel if the data contains column headers in the first row. If this is checked, then the first row of data will be excluded when finding and removing duplicate values.
- You can then select which columns to use to determine duplicates. There are also handy Select All and Unselect All buttons above you can use if you've got a long list of columns in your data.
When you press OK, Excel will then remove all the duplicate values it finds and give you a summary count of how many values were removed and how many values remain.
This command will alter your data so it's best to perform the command on a copy of your data to retain the original data intact.
Find And Remove Duplicate Values With Advanced Filters
There is also another way to get rid of any duplicate values in your data from the ribbon. This is possible from the advanced filters.
Select a cell inside the data and go to the Data tab and click on the Advanced filter command.
This will open up the Advanced Filter window.
- You can choose to either to Filter the list in place or Copy to another location. Filtering the list in place will hide rows containing any duplicates while copying to another location will create a copy of the data.
- Excel will guess the range of data, but you can adjust it in the List range. The Criteria range can be left blank and the Copy to field will need to be filled if the Copy to another location option was chosen.
- Check the box for Unique records only.
Press OK and you will eliminate the duplicate values.
Advanced filters can be a handy option for getting rid of your duplicate values and creating a copy of your data at the same time. But advanced filters will only be able to perform this on the entire table.
Find And Remove Duplicate Values With A Pivot Table
Pivot tables are just for analyzing your data, right?
You can actually use them to remove duplicate data as well!
You won't actually be removing duplicate values from your data with this method, you will be using a pivot table to display only the unique values from the data set.
First, create a pivot table based on your data. Select a cell inside your data or the entire range of data ➜ go to the Insert tab ➜ select PivotTable ➜ press OK in the Create PivotTable dialog box.
With the new blank pivot table add all fields into the Rows area of the pivot table.
You will then need to change the layout of the resulting pivot table so it's in a tabular format. With the pivot table selected, go to the Design tab and select Report Layout. There are two options you will need to change here.
- Select the Show in Tabular Form option.
- Select the Repeat All Item Labels option.
You will also need to remove any subtotals from the pivot table. Go to the Design tab ➜ select Subtotals ➜ select Do Not Show Subtotals.
You now have a pivot table that mimics a tabular set of data!
Pivot tables only list unique values for items in the Rows area, so this pivot table will automatically remove any duplicates in your data.
Find And Remove Duplicate Values With Power Query
Power Query is all about data transformation, so you can be sure it has the ability to find and remove duplicate values.
Select the table of values which you want to remove duplicates from ➜ go to the Data tab ➜ choose a From Table/Range query.
Remove Duplicates Based On One Or More Columns
With Power Query, you can remove duplicates based on one or more columns in the table.
You need to select which columns to remove duplicates based on. You can hold Ctrl to select multiple columns.
Right click on the selected column heading and choose Remove Duplicates from the menu.
You can also access this command from the Home tab ➜ Remove Rows ➜ Remove Duplicates.
If you look at the formula that's created, it is using the Table.Distinct function with the second parameter referencing which columns to use.
Remove Duplicates Based On The Entire Table
Qm for mac. To remove duplicates based on the entire table, you could select all the columns in the table then remove duplicates. But there is a faster method that doesn't require selecting all the columns.
There is a button in the top left corner of the data preview with a selection of commands that can be applied to the entire table.
Click on the table button in the top left corner ➜ then choose Remove Duplicates.
If you look at the formula that's created, it uses the same Table.Distinct function with no second parameter. Without the second parameter, the function will act on the whole table.
Keep Duplicates Based On A Single Column Or On The Entire Table
In Power Query, there are also commands for keeping duplicates for selected columns or for the entire table.
Follow the same steps as removing duplicates, but use the Keep Rows ➜ Keep Duplicates command instead. This will show you all the data that has a duplicate value.
Find And Remove Duplicate Values Using A Formula
You can use a formula to help you find duplicate values in your data.
First you will need to add a helper column that combines the data from any columns which you want to base your duplicate definition on.
The above formula will concatenate all three columns into a single column. It uses the ampersand operator to join each column.
Reverse Remove Duplicates In Excel Spreadsheets
If you have a long list of columns to combine, you can use the above formula instead. This way you can simply reference all the columns as a single range.
You will then need to add another column to count the duplicate values. This will be used later to filter out rows of data that appear more than once.
Copy the above formula down the column and it will count the number of times the current value appears in the list of values above.
If the count is 1 then it's the first time the value is appearing in the data and you will keep this in your set of unique values. If the count is 2 or more then the value has already appeared in the data and it is a duplicate value which can be removed.
Add filters to your data list.
- Go to the Data tab and select the Filter command.
- Use the keyboard shortcut Ctrl + Shift + L.
Data is having duplicate entries. To remove the duplicate entries from data follow below mentioned steps:-
- Go to Data Tab, click on Remove Duplicates in Data Tools group.
- Remove Duplicates dialog box will appear.
- In the column box select the columns header on which basis you want to remove duplicate entries.
- In this data we want to remove duplicate entries on the agent name, we just select the Agent.
- After select the criteria as per the requirement of data click on ok.
- You will get an information popup to inform you that how many duplicates entries were found and how many unique values are left now.
This is the way you can remove duplicate entries from the large no. of data and can get the data with unique records.
Alternate idea
When a list contains a lot of values and we needed to confirm the list contains duplicate values or not. For this we will make a combination of excel functions and use the formula to get the desired result. Use the SUMPRODUCT function & COUNTIF function.
SUMPRODUCT function is a mathematical function in Excel. It operates on multiple ranges. It multiplies the corresponding arrays and then adds them.
Syntax:
= SUMPRODUCT ( list1, [list2], … ) |
COUNTIF function of excel just counts the number of cells with a specific condition in a given range.
Syntax:
= COUNTIF ( list, condition ) |
Formation of the formula:
First we need to check every element of the list with every other element of the same list. Then we will set a limit on the count by subtracting 1 from each occurrence and returns the required result.
= SUMPRODUCT ( COUNTIF ( list, list ) -1 ) > 0 |
Example:
Let's understand this formula using in the example shown below.
For this example, We took some names as a list in excel as shown below.
And we need to find if there's a duplicate value or not.
Use the Formula
= SUMPRODUCT ( COUNTIF( A2:A14 , A2:A14 ) -1 ) > 0 |
Explanation :
COUNTIF function counts the occurrence of each value in the list and returns an array of numbers, where number represents their counts
{ 1 ; 1 ; 1 ; 2 ; 1 ; 1 ; 1 ; 1 ; 1 ; 1 ; 1 ; 1 ; 2 }
Subtract 1 from each value in the list.
{ 0 ; 0 ; 0 ; 1 ; 0 ; 0 ; 0 ; 0 ; 0 ; 0 ; 0 ; 0 ; 1 }
Now the SUMPRODUCT function adds up all the value and returns TRUE if the result is greater than 1 or else it returns FALSE.
In the above snapshot the formula used to get the duplicates in a list in excel.
Blank cells in the list, may cause the formula to generate incorrect results. To ignore blank cells in the list use the below formula.
Use the formula if the list contains blank cells
= SUMPRODUCT ( ( COUNTIF ( list , list ) - 1 ) * ( list < > ' ' ) ) > 0 |
Here are some observational notes shown below.
Notes:
- The formula only works with numbers and text both.
- You can apply the formula over any column of the table where condition can be applied.
- Careful working with fixed reference and absolute array reference.
- Operators like equals to ( = ), less than equal to ( <= ), greater than ( > ) or not equals to ( <> ) can be performed within functions applied with numbers only.
Hope this article about How to Remove Duplicates in Excel is explanatory. Find more articles on Count formulas and related Excel tools here. If you liked our blogs, share it with your friends on Facebook. And also you can follow us on Twitter and Facebook. We would love to hear from you, do let us know how we can improve, complement or innovate our work and make it better for you. Write to us at info@exceltip.com.
Related Articles
How to use the SUMPRODUCT function in Excel: Returns the SUM after multiplication of values in multiple arrays in excel. This function can be used to do multiple tasks. This is one of the most versatile functions.
COUNTIFS with Dynamic Criteria Range : To count with dynamic criteria range we simply use the INDIRECT function. This function can
COUNTIFS With OR For Multiple Criteria : Count cells having multiple criteria match using the OR function. To put an OR logic in COUNTIFS function you will not need to use the OR function.
Using the IF with AND / OR Functions in Microsoft Excel : These logical functions are used to carry out multiple criteria calculations. With IF the OR and AND functions are used to include or exclude matches.
How to use OR function in Microsoft Excel : The function is used to include all the TRUE values in multiple criterias.
How to Count Cells That Contain This Or That in Excel in Excel :To cells that contain this or that, we can use the SUMPRODUCT function. Here's how you do those calculations.
Popular Articles :
How to use the IF Function in Excel : The IF statement in Excel checks the condition and returns a specific value if the condition is TRUE or returns another specific value if FALSE.
How to use the VLOOKUP Function in Excel : This is one of the most used and popular functions of excel that is used to lookup value from different ranges and sheets.
How to Use SUMIF Function in Excel : This is another dashboard essential function. This helps you sum up values on specific conditions.
How to use the COUNTIF Function in Excel : Count values with conditions using this amazing function. You don't need to filter your data to count specific values. Countif function is essential to prepare your dashboard.
Duplicate values in your data can be a big problem! It can lead to substantial errors and over estimate your results.
But finding and removing them from your data is actually quite easy in Excel.
In this tutorial, we are going to look at 7 different methods to locate and remove duplicate values from your data.
Video Tutorial
What Is A Duplicate Value?
Duplicate values happen when the same value or set of values appear in your data.
For a given set of data you can define duplicates in many different ways.
In the above example, there is a simple set of data with 3 columns for the Make, Model and Year for a list of cars.
- The first image highlights all the duplicates based only on the Make of the car.
- The second image highlights all the duplicates based on the Make and Model of the car. This results in one less duplicate.
- The second image highlights all the duplicates based on all columns in the table. This results in even less values being considered duplicates.
The results from duplicates based on a single column vs the entire table can be very different. You should always be aware which version you want and what Excel is doing.
Find And Remove Duplicate Values With The Remove Duplicates Command
Removing duplicate values in data is a very common task. It's so common, there's a dedicated command to do it in the ribbon.
Select a cell inside the data which you want to remove duplicates from and go to the Data tab and click on the Remove Duplicates command.
Excel will then select the entire set of data and open up the Remove Duplicates window.
- You then need to tell Excel if the data contains column headers in the first row. If this is checked, then the first row of data will be excluded when finding and removing duplicate values.
- You can then select which columns to use to determine duplicates. There are also handy Select All and Unselect All buttons above you can use if you've got a long list of columns in your data.
When you press OK, Excel will then remove all the duplicate values it finds and give you a summary count of how many values were removed and how many values remain.
This command will alter your data so it's best to perform the command on a copy of your data to retain the original data intact.
Find And Remove Duplicate Values With Advanced Filters
There is also another way to get rid of any duplicate values in your data from the ribbon. This is possible from the advanced filters.
Select a cell inside the data and go to the Data tab and click on the Advanced filter command.
This will open up the Advanced Filter window.
- You can choose to either to Filter the list in place or Copy to another location. Filtering the list in place will hide rows containing any duplicates while copying to another location will create a copy of the data.
- Excel will guess the range of data, but you can adjust it in the List range. The Criteria range can be left blank and the Copy to field will need to be filled if the Copy to another location option was chosen.
- Check the box for Unique records only.
Press OK and you will eliminate the duplicate values.
Advanced filters can be a handy option for getting rid of your duplicate values and creating a copy of your data at the same time. But advanced filters will only be able to perform this on the entire table.
Find And Remove Duplicate Values With A Pivot Table
Pivot tables are just for analyzing your data, right?
You can actually use them to remove duplicate data as well!
You won't actually be removing duplicate values from your data with this method, you will be using a pivot table to display only the unique values from the data set.
First, create a pivot table based on your data. Select a cell inside your data or the entire range of data ➜ go to the Insert tab ➜ select PivotTable ➜ press OK in the Create PivotTable dialog box.
With the new blank pivot table add all fields into the Rows area of the pivot table.
You will then need to change the layout of the resulting pivot table so it's in a tabular format. With the pivot table selected, go to the Design tab and select Report Layout. There are two options you will need to change here.
- Select the Show in Tabular Form option.
- Select the Repeat All Item Labels option.
You will also need to remove any subtotals from the pivot table. Go to the Design tab ➜ select Subtotals ➜ select Do Not Show Subtotals.
You now have a pivot table that mimics a tabular set of data!
Pivot tables only list unique values for items in the Rows area, so this pivot table will automatically remove any duplicates in your data.
Find And Remove Duplicate Values With Power Query
Power Query is all about data transformation, so you can be sure it has the ability to find and remove duplicate values.
Select the table of values which you want to remove duplicates from ➜ go to the Data tab ➜ choose a From Table/Range query.
Remove Duplicates Based On One Or More Columns
With Power Query, you can remove duplicates based on one or more columns in the table.
You need to select which columns to remove duplicates based on. You can hold Ctrl to select multiple columns.
Right click on the selected column heading and choose Remove Duplicates from the menu.
You can also access this command from the Home tab ➜ Remove Rows ➜ Remove Duplicates.
If you look at the formula that's created, it is using the Table.Distinct function with the second parameter referencing which columns to use.
Remove Duplicates Based On The Entire Table
Qm for mac. To remove duplicates based on the entire table, you could select all the columns in the table then remove duplicates. But there is a faster method that doesn't require selecting all the columns.
There is a button in the top left corner of the data preview with a selection of commands that can be applied to the entire table.
Click on the table button in the top left corner ➜ then choose Remove Duplicates.
If you look at the formula that's created, it uses the same Table.Distinct function with no second parameter. Without the second parameter, the function will act on the whole table.
Keep Duplicates Based On A Single Column Or On The Entire Table
In Power Query, there are also commands for keeping duplicates for selected columns or for the entire table.
Follow the same steps as removing duplicates, but use the Keep Rows ➜ Keep Duplicates command instead. This will show you all the data that has a duplicate value.
Find And Remove Duplicate Values Using A Formula
You can use a formula to help you find duplicate values in your data.
First you will need to add a helper column that combines the data from any columns which you want to base your duplicate definition on.
The above formula will concatenate all three columns into a single column. It uses the ampersand operator to join each column.
Reverse Remove Duplicates In Excel Spreadsheets
If you have a long list of columns to combine, you can use the above formula instead. This way you can simply reference all the columns as a single range.
You will then need to add another column to count the duplicate values. This will be used later to filter out rows of data that appear more than once.
Copy the above formula down the column and it will count the number of times the current value appears in the list of values above.
If the count is 1 then it's the first time the value is appearing in the data and you will keep this in your set of unique values. If the count is 2 or more then the value has already appeared in the data and it is a duplicate value which can be removed.
Add filters to your data list.
- Go to the Data tab and select the Filter command.
- Use the keyboard shortcut Ctrl + Shift + L.
Now you can filter on the Count column. Filtering on 1 will produce all the unique values and remove any duplicates.
You can then select the visible cells from the resulting filter to copy and paste elsewhere. Use the keyboard shortcut Alt + ; to select only the visible cells.
Find And Remove Duplicate Values With Conditional Formatting
With conditional formatting, there's a way to highlight duplicate values in your data.
Just like the formula method, you need to add a helper column that combines the data from columns. The conditional formatting doesn't work with data across rows, so you'll need this combined column if you want to detect duplicates based on more than one column.
Then you need to select the column of combined data.
To create the conditional formatting, go to the Home tab ➜ select Conditional Formatting ➜ Highlight Cells Rules ➜ Duplicate Values.
How Do I Remove Duplicates In Excel
This will open up the conditional formatting Duplicate Values window.
- You can select to either highlight Duplicate or Unique values.
- You can also choose from a selection of predefined cell formats to highlight the values or create your own custom format.
Warning: The previous methods to find and remove duplicates considers the first occurrence of a value as a duplicate and will leave it intact. However, this method will highlight the first occurrence and will not make any distinction.
With the values highlighted, you can now filter on either the duplicate or unique values with the filter by color option. Make sure to add filters to your data. Go to the Data tab and select the Filter command or use the keyboard shortcut Ctrl + Shift + L.
- Click on the filter toggle.
- Select Filter by Color in the menu.
- Filter on the color used in the conditional formatting to select duplicate values or filter on No Fill to select unique values.
You can then select just the visible cells with the keyboard shortcut Alt + ;.
Find And Remove Duplicate Values Using VBA
There is a built in command in VBA for removing duplicates within list objects.
The above procedure will remove duplicates from an Excel table named CarList.
Reverse Remove Duplicates In Excel
The above part of the procedure will set which columns to base duplicate detection on. In this case it will be on the entire table since all three columns are listed.
The above part of the procedure tells Excel the first row in our list contains column headings.
You will want to create a copy of your data before running this VBA code, as it can't be undone after the code runs.
Conclusions
Duplicate values in your data can be a big obstacle to a clean data set.
Thankfully, there are many options in Excel to easily remove those pesky duplicate values.
How To Reverse Remove Duplicates In Excel
So, what's your go to method to remove duplicates?