- How to find and remove duplicate rows in Excel?
- How to Check for Duplicates in Excel?
- How to delete duplicates in Excel but keep one?
Excel spreadsheets continue to represent a key tool for data storage and visualization. Functionalities such as Find & Replace or Sort help users speed up repetitive tasks that would otherwise be time-consuming and inefficient. Just as blank rows or cells can interfere with the correct application of rules and formulae, duplicate data can cause similar issues.
In this post, you will learn different ways to find duplicate values to either highlight this information or delete as many duplicates as needed. From more basic highlighting features to more advanced filtering options, you’ll learn how to work with the full potential of the desktop version of Excel.
How to find and remove duplicate rows in Excel?
The methods shown in this article first find the duplicate values and then show how to delete them. This two-step process is crucial because you may not always want Excel to delete duplicates automatically; sometimes you only need to highlight or hide them. Let’s start with the built-in Remove Duplicates command.
How to Check for Duplicates in Excel?
How to remove duplicates using the Remove Duplicates feature?
What is the shortcut to removing duplicates in Excel? The shortcut is actually a built-in command available in the ribbon, which you can use in the following way.
- 1. Open your Excel spreadsheet and select any range in your spreadsheet which you want to delete duplicate rows from.
- 2. Go to Data > Remove duplicates.
If you haven’t selected all data in your spreadsheet, Excel will give you the option of expanding the search to the entire document, which is recommended. Click “OK”.
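- 3. If your data selection has headers, tick the header checkbox (labelled “My data has headers” in Windows versions) so that the header row is not counted in the duplicate search. Then tick the columns Excel should compare: a row counts as a duplicate only when its values match in every ticked column. I want to compare all columns in my example, so I’ll leave all boxes ticked. Click “OK”.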
- 4. Excel prompts you with a dialog box informing you about the exact number of duplicate values it found and removed, as well as the number of unique values remaining in your spreadsheet.
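To make the result of these steps concrete, here is a minimal Python sketch of the idea behind the command: keep the first occurrence of each row and count what was dropped. It is only an illustration, not Excel’s implementation; the function name `remove_duplicates` and the sample rows are hypothetical, and the sketch compares values exactly, whereas Excel’s own comparison rules (for example around letter case) may differ.

```python
def remove_duplicates(rows):
    """Keep the first occurrence of each row; return (kept_rows, removed_count)."""
    seen = set()
    kept = []
    for row in rows:
        key = tuple(row)  # the whole row is the comparison key
        if key not in seen:
            seen.add(key)
            kept.append(row)
    return kept, len(rows) - len(kept)


# Hypothetical sample data, not taken from the article's screenshots.
rows = [
    ("Anna", "Berlin", 120),
    ("Ben", "Paris", 80),
    ("Anna", "Berlin", 120),
    ("Cara", "Rome", 95),
    ("Ben", "Paris", 80),
]

kept, removed = remove_duplicates(rows)
print(f"{removed} duplicate values found and removed; {len(kept)} unique values remain.")
```

The two numbers printed at the end correspond to the counts Excel reports in its dialog box.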
How to delete duplicates in Excel but keep one?
The Remove Duplicates command keeps the first occurrence of each row, but it permanently deletes every repeat from your dataset. If you would rather leave the original data intact and work with a de-duplicated view or copy, you may want to explore the following methods.
Here’s how to delete duplicates in Excel but keep one; we strongly recommend that you always keep a copy of your spreadsheet in case you want to go back to the original dataset.
How to remove duplicates using the Advanced Filter option?
This is a straightforward way to get rid of duplicates without deleting them: instead, the Advanced filter option hides the duplicate rows in your dataset.
- 1. Select a cell in your dataset and go to Data > Advanced filter to the far right.
- 2. Choose to “Filter the list, in-place” or “Copy to another location”. The first option will hide any row containing duplicates, while the second will make a copy of the data.
Leave the “List range” field empty, if you want Excel to list it automatically. You can also leave the “Criteria range” empty. The only mandatory field to fill out is the “Copy to” if you selected the “Copy to another location” option.
- 3. Tick the “Unique records only” box to keep one copy of each record, and then click “OK” to filter out the duplicates.
Advanced filters are an excellent way to remove duplicate values while keeping a copy of the original data. Don’t forget that the Advanced filter option only applies to the entire table.
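The difference between the two Advanced filter options can be sketched in a few lines of Python. This is an illustration under assumptions, not Excel’s implementation: the helper `unique_records_mask` and the sample rows are hypothetical. Filtering in place corresponds to computing a visibility flag per row while the data stays untouched; copying to another location corresponds to building a new list from the visible rows.

```python
def unique_records_mask(rows):
    """Return one boolean per row: True = stays visible, False = hidden."""
    seen = set()
    mask = []
    for row in rows:
        key = tuple(row)
        mask.append(key not in seen)  # only the first occurrence stays visible
        seen.add(key)
    return mask


# Hypothetical sample data.
rows = [
    ("Anna", "Berlin"),
    ("Ben", "Paris"),
    ("Anna", "Berlin"),
    ("Cara", "Rome"),
]

# "Filter the list, in-place": the rows still exist, some are merely hidden.
visible = unique_records_mask(rows)

# "Copy to another location": a separate list holding only the visible rows.
copied = [row for row, keep in zip(rows, visible) if keep]

print(visible)
print(copied)
```

In both cases `rows` itself is unchanged, which is why this method is safe to try on your original data.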
How to remove duplicates using Excel formulae?
Although you can combine various formulae to remove duplicates in Excel, in 2018, Microsoft integrated the UNIQUE formula to make this process much easier. First, let’s explore the syntax of the UNIQUE formula:
=UNIQUE(array, [by_col], [exactly_once])
- array refers to the range of cells we will extract unique values from and represents the only required argument.
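- [by_col] is an optional parameter that determines whether unique values are searched by rows or by columns: “TRUE” compares columns against each other, while “FALSE” (the default) compares rows.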
- [exactly_once] is the other optional parameter and sets the behavior for values that appear more than once. If you want the formula to return items that appear exactly once, then write “TRUE”; however, if you want it to return every distinct item, then write “FALSE”.
Let’s now apply the =UNIQUE formula to our dataset.
- 1. Enter the formula next to the set of data. You can either leave one column in between or place it directly next to the last data column. Like in most Excel formulae, as soon as you start typing the formula, Excel suggests the rest automatically. Select the range you want to apply the formula to.
- 2. You can leave the second parameter [by_col] empty by simply keeping the commas before and after its place. Let’s first see what happens when we include “TRUE” for the [exactly_once] parameter.
- 3. As soon as you press the Return key, Excel returns only the rows that appear exactly once. In this example, it has left out rows 5 and 6, which are duplicates of each other. The original data is not changed.
Let’s see how, by including “FALSE” as the last parameter, Excel keeps one copy of each value.
- 1. Follow the previous steps, and now write “FALSE” to return every distinct value.
- 2. Now, the UNIQUE formula has returned row 5 and only left out the duplicate value in row 6.
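The difference between the two [exactly_once] settings is easy to mix up, so here is a small Python sketch that mimics it for rows. It is an illustration of the behaviour described above, not Excel’s implementation; the function `unique` and the sample rows are hypothetical.

```python
from collections import Counter


def unique(rows, exactly_once=False):
    """Mimic the two modes of the [exactly_once] parameter for rows."""
    if exactly_once:
        # Keep only rows that occur a single time in the whole range.
        counts = Counter(rows)
        return [row for row in rows if counts[row] == 1]
    # Otherwise keep the first occurrence of every distinct row.
    seen = set()
    distinct = []
    for row in rows:
        if row not in seen:
            seen.add(row)
            distinct.append(row)
    return distinct


# Hypothetical sample data: the last two rows are duplicates of each other.
rows = [
    ("Anna", "Berlin"),
    ("Ben", "Paris"),
    ("Cara", "Rome"),
    ("Dan", "Oslo"),
    ("Dan", "Oslo"),
]

only_once = unique(rows, exactly_once=True)   # both "Dan" rows are left out
distinct = unique(rows, exactly_once=False)   # one "Dan" row is kept

print(only_once)
print(distinct)
```

With “TRUE”, a duplicated row disappears from the result entirely; with “FALSE”, one copy of it survives.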
How to remove duplicates using conditional formatting?
Conditional formatting is an Excel feature that changes the appearance of cells according to built-in rules or custom ones created by the user. The most common feature is the “Highlight Cell Rules”, which allows you to format cell values according to color, font, and various other format styles. Although this method won’t directly remove duplicates, it will make them extremely easy to identify.
- 1. Select the range of cells you want to apply the conditional formatting rule to. Then go to Home > Conditional Formatting > Highlight Cell Rules > Duplicate Values.
- 2. Set the “Style” to “Classic” and then “Format only unique or duplicate values”. Don’t forget to leave the drop-down menu set to “duplicate”. Finally, choose the formatting style using the “Format with” drop-down menu. Click “OK”.
- 3. You can see how Excel highlights every cell whose value appears more than once, not just entire duplicate rows. This means that before deleting anything you will need to check that the whole row is duplicated, unless you are actually interested in removing every repeated value.
If you want to highlight duplicate rows instead, you can combine all row values in one cell using the =CONCAT formula and apply the rule to that combined column; if you would like to learn more about this function, read this article on the Microsoft support page.
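The helper-column idea can be sketched in Python as follows: join each row into one key, count the keys, and flag the rows whose key occurs more than once. This is a minimal illustration, not what Excel does internally; the function `duplicate_row_flags`, the `sep` parameter and the sample rows are hypothetical. The separator is an assumption added here because plain concatenation can make different rows look identical (for example “ab” + “c” and “a” + “bc”).

```python
from collections import Counter


def duplicate_row_flags(rows, sep="|"):
    """Return True for every row whose combined key occurs more than once."""
    # One combined key per row, similar to a helper column of joined values.
    keys = [sep.join(str(value) for value in row) for row in rows]
    counts = Counter(keys)
    return [counts[key] > 1 for key in keys]


# Hypothetical sample data. Every value in the last row also appears
# elsewhere, but the row as a whole is unique.
rows = [
    ("Anna", "Berlin"),
    ("Ben", "Paris"),
    ("Anna", "Berlin"),
    ("Anna", "Paris"),
]

flags = duplicate_row_flags(rows)
print(flags)
```

A cell-level rule would highlight “Anna” and “Paris” in the last row, while the row-level check flags only the first and third rows.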
How to remove duplicates based on one or more columns in Excel?
As a more advanced use of Excel, you can remove duplicates based on one or more columns using Power Query. This feature allows you to select the columns you would like to remove the duplicates from. Let’s explore how to use Power Query to remove duplicates based on one or more columns.
- 1. Go to Data > Get Data (Power Query).
- 2. Choose “Excel workbook” as your data source.
- 3. Browse through your files and select the spreadsheet you want to apply the Power Query function to. Click “Next”.
- 4. Tick the checkbox next to the worksheet containing your data (located in the left-side menu). Then, click “Load” in the bottom right-hand corner.
- 5. As you can see, the dataset has been transformed into a table.
- 6. Select the columns to apply the Power Query to by pressing Ctrl/Cmd + click on the columns.
- 7. To delete duplicates, simply click on “Remove Duplicates” in the “Data” tab. Then click “OK” in the pop-up dialog box.
- 8. Excel will inform you about the number of duplicates removed and how many unique values remain.
Don’t worry about removing all duplicates, since the dataset you worked on is a copy created by the Power Query function. However, if you want to keep unique values, follow the steps outlined in the sections on the Advanced Filter option or =UNIQUE formula in Excel.
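What “based on one or more columns” means can be shown with a short Python sketch: only the selected columns form the comparison key, so rows that differ in other columns still count as duplicates. This is an illustration under assumptions rather than Power Query’s implementation; the function `remove_duplicates_by_columns`, the column names and the sample records are hypothetical.

```python
def remove_duplicates_by_columns(records, columns):
    """Keep the first record for each combination of values in `columns`."""
    seen = set()
    kept = []
    for record in records:
        key = tuple(record[column] for column in columns)  # selected columns only
        if key not in seen:
            seen.add(key)
            kept.append(record)
    return kept


# Hypothetical sample data.
records = [
    {"name": "Anna", "city": "Berlin", "amount": 120},
    {"name": "Ben", "city": "Paris", "amount": 80},
    {"name": "Anna", "city": "Berlin", "amount": 150},
    {"name": "Anna", "city": "Rome", "amount": 120},
]

by_name_and_city = remove_duplicates_by_columns(records, ["name", "city"])
by_name = remove_duplicates_by_columns(records, ["name"])

print(len(by_name_and_city))  # third record dropped, amount is ignored
print(len(by_name))           # every later "Anna" record dropped
```

The fewer columns you select, the more rows are treated as duplicates, so it is worth checking the selection before confirming.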
As we have seen, there are many ways to identify and eliminate duplicates in your data, depending on your needs. Removing duplicates not only keeps your data organized, but also makes it easier to identify key patterns and create accurate reports, particularly when working with larger datasets.