Features of !SEMTools

Regular expressions in Excel | Find, extract and replace text

Regular expressions in Excel with !SEMTools

Many people have heard of regular expressions, but not everyone knows that Microsoft Excel supports them “under the hood.” Regular expressions allow you to work with text much faster, find complex patterns, and solve advanced analytical tasks. The only problem is that in standard Excel, you can use them only through VBA programming.

Why Microsoft hasn’t added them as worksheet functions — and whether they ever will — remains unknown.

With the !SEMTools add-in, you don’t need to know VBA at all. A basic understanding of regular expression syntax is enough to solve tasks that are nearly impossible with standard Excel functions, or that would require extremely long formulas. Examples of such “mega formulas” can be found here:

Functions available with regular expressions

With !SEMTools installed, you can use three regular expression functions directly in Excel: REGEXMATCH, REGEXEXTRACT, and REGEXREPLACE.

Their syntax and behavior match Google Sheets, so formulas will be fully compatible between the two.

REGEXMATCH

Returns TRUE or FALSE depending on whether the text matches a pattern.

=REGEXMATCH("text","pattern")

REGEXEXTRACT

Extracts the first piece of text matching the pattern. In Google Sheets, if no match is found, it returns an error. In !SEMTools, it returns a blank cell instead.

=REGEXEXTRACT("text","pattern")

REGEXREPLACE

Replaces all matches of the pattern with the specified text.

=REGEXREPLACE("text","pattern","replacement")

Real-life examples of using regular expressions

Below are practical examples, from simple to more advanced, to show the real power of regular expressions in Excel when using !SEMTools.

Extracting content before or after the first number

Example patterns:

  • Before and including first number: .+\d
  • From first number onward: \d.+

Here, “.” means any character, “+” means “repeat as many as possible,” and “\d” means any digit.

Extracting all digits

Pattern for a single digit: \d or [0-9].

Pattern for sequences of digits: \d+ or [0-9]+.

Extracting only 4-digit numbers (years)

Several options:

  • \d{4}
  • [0-9]{4}

The curly braces specify exactly how many characters must match.

Extracting Latin letters

Pattern: [a-zA-Z] — matches all lowercase and uppercase English letters.

Checking if a cell starts with a number

=REGEXMATCH(A1,"^\d.*")

“^” means start of the cell, “\d” means a digit, “.*” means any characters after that.

Replacing by pattern

Common use cases:

  • Remove all numbers from text
  • Remove punctuation
  • Remove all characters except letters and numbers

!SEMTools also supports replacing accented or special characters with English equivalents, which is useful for creating clean URLs from multilingual text.

Splitting text by characters or by letters/numbers

Pattern to split into individual characters: .

Pattern to split numbers from letters: ([^\d.]+|[\d.]+)

Inserting text after the first word

You can replace only the first match of a space with your chosen text. This is not possible in Excel’s standard Find and Replace, but is available in !SEMTools RegEx Replace.

Working with whole words

Example: extract words containing both letters and numbers.

([a-z]\d|\d[a-z])

Here, “|” means “or,” so the pattern matches either a letter followed by a digit or a digit followed by a letter.

Cleaning cells that don’t match a pattern

You can remove any cell content that doesn’t meet your pattern requirements, and leave intact:

  • Only cells with a specific number of words
  • Only cells containing an email address ending in .com/.org
  • Only cells with numbers

Want to use regular expressions in Excel without learning VBA?
!SEMTools lets you find, extract, and replace text patterns in just a couple of clicks!

This post is also available in RU.