Features of !SEMTools

Finding duplicate values in Excel columns is not an easy task. There are a couple of built-in tools, such as conditional formatting and the tool to remove duplicates, but they are not always suitable for real-world tasks.

Finding duplicates in Excel can be very different, and, depending on the input, will be done differently as well.

The key points are:

  • What exactly are the duplicates – repeating words in cells, repeating cells themselves, or repeating rows?
  • If cells, then:
    • Which cells are we willing to count as duplicates – all but the first or including it?
    • Do we count as duplicates rows that differ only by spaces before/after words or extra spaces between words?
    • Where do we look for duplicates – in one column, in two columns, or in several columns?
    • Or maybe we need to find implicit duplicates?

First, let’s look at some simple examples.

Find and highlight duplicates in Excel

The conditional formatting tool is good for highlighting duplicate cells. The procedure has a number of ready-made rules, including for repeating values.

You can find the tool in the “Home” tab of the program:

The procedure is intuitive:

  1. We select the range in which we want to find duplicates.
  2. Call the procedure.
  3. Select the formatting for the selected cells (there are preset formats or you can specify your own option).

It is important to understand that the procedure finds duplicates within the entire range and therefore may not be applicable for comparing two columns. It is enough to have duplicates inside one column – and the procedure will highlight them both, although there will be none in the second column:

This behavior is not obvious, and this fact is often forgotten. If you plan to delete duplicates next, you might lose both variants in the same column.

How can you avoid this situation if you want to find exactly the duplicates in the other column? The simplest solution: remove duplicates within each column before applying conditional formatting.

But there are other solutions. More about them below.

Formula to check for duplicates in ranges

Using your own formula to check for duplicates in a list or range has a number of advantages, the only problem is making up such a formula. But I’ll take it on myself.

Inside a range

To check if there are duplicate values in a range, you can use an array formula like this:

=SUMPRODUCT(COUNTIF(range,same-range)-1)>0

This is what application of the formula looks like in practice:

What is the advantage of this formula, because it completely duplicates the conditional formatting option, you may ask.

And the point is that the formula is easy to modify and improve.

For instance, you can improve the formula’s efficiency by adding the TRIM function to it. This will allow you to find duplicates that only differ by extra spaces:

=SUMPRODUCT(--(TRIM(cell)=TRIM(range)))>1

This formula is slightly different because it checks the occurrence of a single cell value in the range.

If you put it in as a conditional formatting selection rule, it will allow you to detect implicit duplicates. Below is a demonstration of how the formula works:

Note one point in this demonstration: the range is fixed ($A$1:$B$4), but the cell you are looking for (A1) is not. This is what allows conditional formatting to find all duplicates in the range.

SEMTools, search for duplicates within a range

I once spent a lot of time using the methods listed above to find duplicate values. I didn’t like all of them. The reason was the same: It was simply too time-consuming.

So, I decided to make several instruments for finding duplicates in my Excel add-in. Let me show you how they work.

Find duplicate cells in a column except the first one

The procedure allows you to highlight all the second, third, etc. duplicate values in the column.

Find duplicate cells in a column, including the first one

It is often necessary to find all duplicate cells in a column, including the first cell, in order to further filter them all out.

Finding duplicates in a column with no extra spaces

If we consider phrases that differ in the number of spaces between words or after them as duplicates, our task is first to get rid of extra spaces, and then do the same search for duplicates.

For the first operation, there is a separate tool “Remove extra spaces”:

This post is also available in KZ and RU.