Features of !SEMTools

N-gram analysis in Excel

For search advertising and SEO, n-gram analysis is one of the most effective approaches. For a long time, implementing reliable n-gram algorithms was practical only for large agencies with in-house developers or power users with programming skills.

How to build a frequency dictionary in Excel

To make this method accessible to anyone with Windows and Excel, n-gram tools are built into the !SEMTools for Excel add-in. Below are several analysis approaches with simple, schematic examples.

In all cases, results are written to a separate worksheet. Your source data is never altered.

Simple n-gram analysis (occurrence analysis)

This approach is the simplest: choose an n-gram size, and measure how often each n-gram appears in the text.

Word frequency: count unique words and list them with their occurrences

If you only need to check whether a word appears in any rows of an Excel range, a basic formula like COUNTIF works well. The examples below count rows that contain the sequence of characters “word”, ignoring case.

=COUNTIF(A1:A100, "*word*")

The asterisk wildcard means any characters may appear before or after the sequence. Because of this, “word” will also match “wording”, etc. If you need exact whole-word matches, one quick trick is to temporarily pad each cell with a leading and trailing space and search with spaces:

=COUNTIF(A1:A100, "* word *")

Keep in mind: these formulas count rows containing the word, not the total number of times the word appears within a single row. To count every occurrence across text, use the dedicated n-gram tools in !SEMTools, which handle per-cell frequencies correctly and output a complete frequency dictionary.

N-gram analysis in Excel with !SEMTools

If you prefer video instructions, a quick demo is available on our channels. Otherwise, read on for the text version.

Frequency dictionary of words (1-gram)

Select your text (typically many Excel cells). In the add-in menu, choose Semantic analysis → Build frequency dictionary → N-grams → Words (1-gram).
Whether you have 5 rows or 1,000,000 rows, the procedure is highly optimized and completes in seconds, even on modest PCs. The only practical limit is Excel’s row capacity when outputting unique words (no more than 1,048,575 unique words can be listed on a single sheet), which is rarely an issue in real projects.

1-gram frequency dictionary demo

If you want to ignore different word forms (e.g., “run”, “running”, “ran”), lemmatize the text first and then build a frequency dictionary of lemmas rather than surface forms.

Bigrams (2-grams) occurrence analysis

Same idea, but for bigrams — sequences of two words. Extending this to trigrams and higher n-grams follows the same pattern.

2-gram frequency demo

N-gram analysis weighted by impressions (or another metric)

When your dataset consists of phrases with a known metric per phrase (in search marketing, this is often impressions and clicks), you’ll get better signal by weighting n-gram counts by that metric. This reveals which word forms or lemmas matter most in real demand.

In !SEMTools, this is built in: simply select two columns — the text and the metric — and run the procedure. You can do the same for bigrams, trigrams, and beyond.

Weighted n-gram analysis demo

N-gram analysis across multiple metrics

This approach is especially useful for PPC specialists analyzing derived metrics such as CTR, CPC, CPA, CR, AOV, and ROAS. Since these depend on several base metrics, !SEMTools can compute n-gram splits that aggregate — and then derive — performance indicators by n-gram.

Rank single words by performance metrics
Ranking individual words by efficiency metrics.

This analysis can surface high-converting word combinations for focused bidding, as well as low-converting combinations to exclude—even when their individual words don’t look weak “on average.”

Derived metrics by n-gram across base metrics
Measure derived efficiency metrics for phrases based on underlying absolute metrics and their values.

Conclusion

The methods above work not only for search queries and keyword lists, but for any text you provide —regardless of length. For best results, remove extra spaces, convert all text to lowercase, and then run your analysis.

Run n-gram analysis, build weighted frequency dictionaries, and rank phrases by PPC metrics — all directly in Excel. Install !SEMTools to save time and focus on higher-impact work.

This post is also available in RU.