Features of !SEMTools

SUBSTITUTE

Function categoryText
VolatilityNon-volatile
Similar functionsREPLACE

What does the SUBSTITUTE function do?

The SUBSTITUTE function allows replacing all occurrences or the Nth occurrence of a substring in text with another text. The substring can be either a single character or several.

The function is somewhat analogous to the Find and Replace tool in Excel, with some differences:

  • What is obvious and is the basic difference, it is a function, i.e., it recalculates automatically, unlike the one-time nature of procedures, and outputs the result in another cell, while the procedure changes the result in place;
  • The SUBSTITUTE function is always case-sensitive. In the “find and replace” procedure, case sensitivity is optional;
  • The function, oddly enough, does not support wildcard characters (“*” and “?”);
  • However, it allows performing mass replacement of several (up to 64 in the latest Excel version) values simultaneously;
  • Another advantage of the function is that it allows replacing not all occurrences, but a specific one by order in the string, which the procedure doesn’t allow.

SUBSTITUTE syntax

=SUBSTITUTE(Cell_or_text, Old_text, New_text, [Instance_num])

The last argument is optional (therefore indicated in square brackets in the example), if not specified, all occurrences will be replaced.

Formatting

Characteristic features of the SUBSTITUTE function:

  • An empty cell is treated as a zero-length string (“”).
  • Logical values are converted to text preserving case.
  • All variations of numeric formats (date, time, date-time, percentages, financial format) are converted to their corresponding natural number in text format, so to preserve the original formatting the TEXT function may be needed.

Formula examples with the SUBSTITUTE function

The SUBSTITUTE function is extremely popular and can be used:

  • for its direct purpose, when the necessary result is the string itself with replaced fragments
  • as an auxiliary function in complex formulas, when the result of calculations is not even a string, but a number or logical value

Let’s look at several examples of situations where I personally used the function.

SUBSTITUTE to count words

This is the case when the function is used not for its intended purpose, because the target value is a number, not text. The lifehack is as old as the world but doesn’t lose its relevance.

How many words are in a cell? One more than the spaces between them. This will always be true if only space is considered as the word separator, and if there are no more than 1 space between every two words. To protect yourself from extra spaces, you can use the TRIM function.

How to count spaces in a string? Like any character, you can count them by calculating the length of the string with them and without them, and finding the difference. How to get a string without them? Of course, using the SUBSTITUTE function.

This is what the formula for cell A1 will look like, taking into account all these nuances. LEN measures string length:

=LEN(TRIM(A1)) - LEN(SUBSTITUTE(TRIM(A1), " ", "")) + 1

The formula has a side effect – for an empty cell it returns 1, so if empty cells may be present, the formula above will need to be “wrapped” with an expression using the IF function:

=IF(A1 = "", 0, LEN(TRIM(A1)) - LEN(SUBSTITUTE(TRIM(A1), " ", "")) + 1)

Replace spaces with line breaks

Often there is a task to replace spaces in a string with line breaks, and vice versa. Here, the combination of SUBSTITUTE and CHAR functions will be useful. This is what the formula replacing all spaces in a string with line breaks will look like.

=SUBSTITUTE(A1, " ", CHAR(10))

As can be seen from the formula, the line break character code is 10.

Note that the result may display as a continuous line if the “Wrap Text” option is not enabled.

Line breaks in cells after replacement with SUBSTITUTE function
Display of line breaks in cells after replacement with SUBSTITUTE function

SUBSTITUTE to count occurrences in text

Finding out how many times a fragment appears in text can be done quite simply using a combination of SUBSTITUTE and LEN functions.

  1. The first (SUBSTITUTE) allows creating a string without the specified fragment, replacing it with emptiness.
  2. The second (LEN) – to measure the length of the result.
  3. If you subtract the resulting length from the original length, and divide by the length of this fragment, you can get its frequency.
  4. If it’s known in advance that it’s a single character, the division process can be omitted, since the result won’t change when divided by 1.

This is what the formula looks like that counts in cell A1 the number of occurrences of a 1-character fragment from cell B1:

=LEN(A1) - LEN(SUBSTITUTE(A1, B1, ""))

This – for counting the occurrence of a specific character in a cell, in this case – a forward slash, which allows, for example, counting the nesting level of a directory URL by its address:

=LEN(A1) - LEN(SUBSTITUTE(A1, "/", ""))

And this – for a fragment longer than 1 character, for example, if you need to count how many times a specific word appears in text:

=(LEN(A1) - LEN(SUBSTITUTE(A1, B1, ""))) / LEN(B1)

SUBSTITUTE for counting digits in text

Let’s imagine our task is to validate a list of phone numbers by the number of digits in them. We’ll consider a number with 11 digits as correct.

Meanwhile, the number can be specified in any format – with brackets, hyphens, plus at the beginning, with spaces and without. Besides the number itself, other notes may be present in the string, for example, contact person, whether the phone is mobile or not.

Obviously, the LEN function won’t work in this case, nor will it be possible to immediately remove all extra characters to leave only digits in the cells.

The combination of SUBSTITUTE and LEN functions in an array formula with the SUM function will solve this task:

=SUM(LEN(A1) - LEN(SUBSTITUTE(A1, {1,2,3,4,5,6,7,8,9,0}, "")))
Counting digits in strings with mixed content using SUBSTITUTE
Counting digits in strings with mixed content. Column B has formulas, Column C has their text.

Other examples

Remove digits from cells

Remove Latin characters from text

Add character to each word in cell

Remove last words from cells in Excel

Like the article? Help its author! Buy !SEMTools, it has lots of useful instruments to process text data.

This post is also available in RU.