How to Categorize Text With Keywords in Excel

Learn multiple Excel methods to categorize text with keywords with step-by-step examples and practical applications.

excelformulaspreadsheettutorial
13 min read • Last updated: 7/2/2025

How to Categorize Text With Keywords in Excel

Why This Task Matters in Excel

In today’s data-driven organizations, large volumes of unstructured or semi-structured text pour in from multiple sources: customer support tickets, product reviews, social-media mentions, research notes, marketing campaign logs, and even internal project descriptions. Decision-makers rarely have the time to read every single line. Instead, they need high-level insights such as “Which department should handle this ticket?”, “Is the review positive or negative?”, or “Which product line does this feedback reference?” That is exactly where keyword-based text categorization becomes mission-critical.

Imagine a customer-support manager who receives hundreds of ticket descriptions each day. By scanning descriptions for keywords like “refund”, “technical error”, or “login”, Excel can automatically route tickets to Finance, Engineering, or IT—reducing response time and improving customer satisfaction. In marketing analytics, you might classify social-media posts with keywords like “feature request”, “pricing complaint”, or “shipping delay” to identify trending issues. Human-resource teams commonly tag employee comments with categories such as “benefits”, “work-life”, or “leadership”.

Excel is uniquely positioned for this kind of quick classification for several reasons:

  1. Accessibility: Almost every business user already has Excel installed and knows the basics of filters and formulas.
  2. Flexibility: You can build anything from a one-off ad-hoc sheet that classifies ten entries to an enterprise-wide model that evaluates millions of rows via Power Query or dynamic array formulas.
  3. Integration: Excel connects natively to Outlook, SharePoint, and Power BI, letting you feed categorized text directly into dashboards or automated workflows.
  4. Transparency: Unlike opaque machine-learning models, a keyword rule built in Excel is auditable. Stakeholders can see exactly why a specific piece of text was flagged.

Failing to master this task leads to tedious manual work, inconsistent decision-making, and missed trends because valuable signals remain buried in text. Becoming proficient at keyword categorization also bolsters related Excel competencies: logical functions, text manipulation, dynamic arrays, lookup strategies, and data-quality controls. In short, it is a foundational skill that underpins broader analytics and reporting workflows.

Best Excel Approach

The most broadly compatible and transparent method is to store your category-keyword rules in a lookup table and use a dynamic array formula that searches for each keyword inside the text. The approach leverages three powerful functions:

  1. SEARCH (or FIND for case sensitivity) – locates a substring anywhere in a cell.
  2. XLOOKUP (or INDEX/MATCH in older Excel) – retrieves the category that corresponds to the first keyword match.
  3. LET – simplifies the formula by defining intermediate variables, improving readability and performance.

The logic:

  • Evaluate every keyword against the target text.
  • Identify the first keyword that appears (or all keywords, depending on requirement).
  • Return the category linked to that keyword.

You will need two named ranges:

  • [Keywords] – the list of keyword strings.
  • [Categories] – the parallel list of corresponding category labels.

Recommended dynamic-array formula (Office 365 or Excel 2021):

=LET(
     txt, A2,
     hit, IF(ISNUMBER(SEARCH(TRANSPOSE(Keywords), txt)), Keywords, ""),
     first, INDEX(hit, MATCH(TRUE, hit<>"", 0)),
     XLOOKUP(first, Keywords, Categories, "Uncategorized")
)

Alternative for users without LET or dynamic arrays (Excel 2016 and earlier) involves an array-entered INDEX/MATCH:

=IFERROR(
   INDEX(Categories, MATCH(TRUE, ISNUMBER(SEARCH(Keywords, A2)), 0)),
   "Uncategorized"
)

(Confirm with Ctrl + Shift + Enter.)

When to choose each:

  • Dynamic-array + LET — modern Excel, cleaner logic, spills automatically.
  • Array-entered INDEX/MATCH — backward compatibility.
  • Power Query or VBA — massive datasets, complex logic, or scheduled automation.

Prerequisites: ensure your keyword table has no blanks, and put the most specific keywords earlier if you care about precedence. Persist your categorization rules on a separate sheet for easy maintenance.

Parameters and Inputs

Before jumping into formulas, prepare your components carefully.

Target Text (Required)

  • Data type: string, often stored in column A.
  • Must be plain text; strip line breaks or non-printing characters where possible.
  • Use TRIM and CLEAN if necessary.

Keywords (Required)

  • Data type: string list in a single column, e.g., [D2:D20].
  • Avoid duplicates; each keyword should map to only one category unless you plan on multi-category tagging.

Categories (Required)

  • Text labels in a parallel list, e.g., [E2:E20].
  • Ensure the order mirrors Keywords exactly.

Optional Inputs

  • “Default Category” overriding the “Uncategorized” literal. This can be a cell reference for easy change.
  • Case sensitivity: choose SEARCH (case-insensitive) or FIND (case-sensitive).
  • Multiple keyword hits: decide whether to return the first, last, or all matches concatenated with TEXTJOIN.
  • Stop words: you may want to ignore trivial keywords like “the”, “and”, etc.

Validation Rules

  • Confirm every keyword appears only once to prevent ambiguous mapping.
  • Check Categories column for blanks that could return empty results.
  • Ensure Target Text column is not formatted as numbers or dates.

Edge Cases

  • Overlapping keywords (e.g., “ship” vs “shipping”) can trigger unintended matches. Use word boundaries or order your list from longest to shortest.
  • Very long text (over 32,767 characters) will truncate in Excel; split into multiple cells or use Power Query.

Step-by-Step Examples

Example 1: Basic Scenario

You run a small online store and collect customer feedback in column A:

[A2] “I want a refund, the shirt ripped after one wash”
[A3] “Excellent quality and fast shipping”
[A4] “Cannot log in, password reset not working”

Your keyword table lives on Sheet “Rules”:

[D2] refund | [E2] Finance
[D3] shipping | [E3] Logistics
[D4] login | [E4] Support
[D5] password | [E5] Support

Step 1 – Name the ranges

  • Select [D2:D5] and create the name Keywords.
  • Select [E2:E5] and name it Categories.

Step 2 – Enter the formula
In B2, enter:

=LET(
     txt, A2,
     hit, IF(ISNUMBER(SEARCH(TRANSPOSE(Keywords), txt)), Keywords, ""),
     first, INDEX(hit, MATCH(TRUE, hit<>"", 0)),
     XLOOKUP(first, Keywords, Categories, "Uncategorized")
)

The formula spills down automatically, returning:

B2 Finance
B3 Logistics
B4 Support

Why it works

  • SEARCH returns a number when it finds a keyword.
  • ISNUMBER converts that to TRUE/FALSE.
  • IF returns the keyword text where TRUE.
  • INDEX/MATCH picks the first non-blank.
  • XLOOKUP translates the keyword into a category.

Common Variations

  • Swap SEARCH with FIND if you need case sensitivity.
  • Append a DEFAULT category cell reference in place of “Uncategorized” for easy updates.

Troubleshooting

  • If every result is “Uncategorized”, confirm the named ranges are spelled correctly and that your keywords have no leading/trailing spaces.
  • Use LEN(Keyword) to sort from longest to shortest to prevent “ship” matching “shipping”.

Example 2: Real-World Application

A global electronics company tracks 50,000 Amazon reviews in a worksheet. They need to classify each review into “Battery”, “Screen”, “Performance”, or “Customer Service” based on multiple language variants of keywords because reviews come in English, Spanish, French, and German.

Data Setup

  • Reviews in [A2:A50001].
  • Multilingual keyword table on sheet “Dictionary”:

Column D (Keywords) | Column E (Category)
batería | Battery
batterie | Battery
battery | Battery
pantalla | Screen
screen | Screen
écran | Screen
slow | Performance
lento | Performance
lent | Performance
warranty | Customer Service
garantie | Customer Service

Walkthrough

  1. Name [Dictionary!D2:D100] as Keywords and [Dictionary!E2:E100] as Categories.
  2. Add a named constant DefaultCat = \"Other\".
  3. Use the LET formula in column B:
=LET(
     txt, A2,
     kw, Keywords,
     cat, Categories,
     hits, IF(ISNUMBER(SEARCH(TRANSPOSE(kw), txt)), kw, ""),
     firstHit, INDEX(hits, MATCH(TRUE, hits<>"", 0)),
     XLOOKUP(firstHit, kw, cat, _DefaultCat_)
)
  1. Copy the formula down (or use dynamic spill if supported).

Business Impact

  • Within minutes, 50k reviews are tagged and ready for pivot-table analysis showing that Battery issues dominate in Germany, while Performance complaints spike in Spanish markets.
  • Management prioritizes a firmware update accordingly.

Integration

  • Push the results into Power Pivot to build Power BI dashboards.
  • Use conditional formatting to highlight “Customer Service” reviews for immediate attention.

Performance Considerations

  • On 50k rows x 100 keywords, calculation time remains acceptable in modern Excel thanks to vectorized SEARCH calls.
  • If performance degrades, limit the keyword range with dynamic arrays that automatically shrink to the last keyword entered or offload to Power Query.

Example 3: Advanced Technique

Scenario: A fintech startup receives chat transcripts containing multi-sentence messages. They need multi-label classification: a single message can belong to several categories (e.g., “Payment Failure” and “Password Reset”). They also want to maintain a log of which specific keywords matched.

Data Setup

  • Transcripts in [A2:A20000].
  • Keywords and Categories in [Keywords] and [Categories].

Step 1 – Return all matching keywords with TEXTJOIN

=LET(
     txt, A2,
     kw, Keywords,
     matches, IF(ISNUMBER(SEARCH(TRANSPOSE(kw), txt)), kw, ""),
     TEXTJOIN(", ", TRUE, FILTER(matches, matches<>""))
)

This provides a comma-separated list of keywords: “payment,error” for example.

Step 2 – Map keywords to categories and return unique list

=LET(
     txt, A2,
     kw, Keywords,
     cat, Categories,
     kMatch, FILTER(kw, ISNUMBER(SEARCH(TRANSPOSE(kw), txt))),
     cMatch, UNIQUE(FILTER(cat, ISNUMBER(SEARCH(TRANSPOSE(kw), txt)))),
     TEXTJOIN(", ", TRUE, cMatch)
)

Step 3 – Capture both keyword and category audit trail

=LET(
     txt, A2,
     kw, Keywords,
     cat, Categories,
     idx, ISNUMBER(SEARCH(TRANSPOSE(kw), txt)),
     kList, FILTER(kw, idx),
     cList, FILTER(cat, idx),
     "Keywords: " & TEXTJOIN(", ", TRUE, kList) & CHAR(10) &
     "Categories: " & TEXTJOIN(", ", TRUE, UNIQUE(cList))
)

(Enable wrap text to see line breaks.)

Performance Optimization

  • SEARCH repeated twice could be expensive. Store idx in a variable (as above) and reuse.
  • If working with 100,000 transcripts, process chunks or use Power Query’s Text.ContainsAny function with the keyword list.

Error Handling

  • If no keywords match, FILTER will spill a #CALC! error. Wrap the FILTER call in IFERROR to return “Uncategorized”.
  • Long TEXTJOIN results may exceed 32k character cell limit; write to separate rows or trimmed output.

Professional Tips

  • Maintain the keyword table as an Excel Table object so dynamic range names expand automatically.
  • Use a column for “Active/Inactive” flag in the keyword table and FILTER(Keywords, Active=\"Y\") to turn rules on or off without deleting rows.
  • Record a macro to refresh classification after importing new transcript files.

Tips and Best Practices

  1. Store the keyword and category table on its own sheet, convert it to an Excel Table, and name the columns; this keeps formulas dynamic as rules grow.
  2. Order keywords from most specific to most generic when you are returning only the first hit. That prevents a generic word like “error” from capturing “payment error” entries that should map to Payment.
  3. Add helper columns such as “Last Updated”, “Priority”, or “Language” in the rule table for richer analytics later.
  4. Use LET to reduce duplicate SEARCH calls; every redundant call on 100,000 rows can noticeably slow a workbook.
  5. Combine formula-based categorization with conditional formatting to visually flag high-priority categories, improving team responsiveness.
  6. Document the sheet: add comments or a legend explaining how to maintain the keyword list so future users do not break the logic.

Common Mistakes to Avoid

  1. Duplicate Keywords: Having the same keyword linked to multiple categories causes ambiguous returns. Always deduplicate and decide precedence.
  • Fix by using COUNTIF on the Keyword column; investigate any count above one.
  1. Out-of-Sync Lists: If Categories list becomes shorter than Keywords, INDEX/MATCH returns incorrect labels.
  • Always resize both columns together or convert to a Table so they stay aligned.
  1. Missing Word Boundaries: The keyword “ship” matches “worship”.
  • Enclose keywords in wildcard spaces like \" ship \" or use REGEX if available.
  1. Array Formula Not Entered Correctly (legacy Excel): Forgetting Ctrl + Shift + Enter causes only the first row to calculate.
  • Confirm by looking for curly-brace indicators in the formula bar after entry.
  1. Case Sensitivity Confusion: Using FIND when you expected SEARCH (or vice versa) leads to missed matches.
  • Decide your policy early, document it, and stick to one function consistently.

Alternative Methods

Although dynamic array formulas are the most straightforward, several other options may serve better in specific contexts.

MethodProsConsBest For
Formula with LET + SEARCHNo add-ins, transparent, dynamicNeeds modern Excel; formula complexity grows with requirementsDay-to-day tagging up to a few hundred thousand rows
Power Query Text.ContainsAnyHandles millions of rows, refreshable, no formulas visible to breakRequires query knowledge; slightly less interactiveScheduled ETL, data lakes, corporate BI pipelines
VBA User-Defined FunctionUnlimited flexibility, can use Regular ExpressionsRequires macro-enabled file, security prompts, maintenance burdenSpecialized rules needing word boundaries or context windows
Flash FillFast one-off manual pattern extractionNot rules-based, can be inconsistentAd-hoc classifications on small lists
Office Script / Power AutomateAutomates end-to-end cloud workflowsNeeds Microsoft 365 E3/E5 or similar licenseIntegrating Excel classification into web services or email routing

When to switch: If your sheet size crosses one million rows or rules require context beyond simple keyword search (like negation or proximity), consider Power Query or VBA. For heavily audited environments where macros are banned, stick to formula or Power Query only.

FAQ

When should I use this approach?

Use keyword categorization when you have unstructured text that maps neatly to predefined buckets, such as routing support tickets, summarizing survey free-text, or tagging compliance notes. It excels when the categories are clear and new keywords do not appear every day.

Can this work across multiple sheets?

Yes. Store your keyword table on a separate Config sheet and name the ranges. The formula referencing those names will work no matter which sheet the target text lives on. For multi-sheet results, convert your formula to an Excel Table and reference structured names so it spills correctly.

What are the limitations?

Formulas struggle with negation (e.g., “not working”) and phrases requiring context. They also cannot handle more characters than Excel’s cell limit of 32,767. Power Query or VBA with Regular Expressions is required for sophisticated logic like word boundaries or phrase distance.

How do I handle errors?

Wrap the main formula in IFERROR, returning \"Uncategorized\" or a blank. Use LEN and CLEAN on incoming text to avoid non-printing characters that break SEARCH. For legacy array formulas, remember to recalc with Ctrl + Shift + Enter after editing.

Does this work in older Excel versions?

Array-entered INDEX/MATCH with SEARCH functions back to Excel 2007. However, LET, FILTER, UNIQUE, TEXTJOIN, and dynamic spill require 365/2021. Adapt by replacing these with helper columns or VBA if stuck on older versions.

What about performance with large datasets?

Performance hinges on the number of keyword-text combinations evaluated. Reduce the keyword list with filters (e.g., by language), process data in batches, or move heavy lifting to Power Query. In modern Excel, vectorized SEARCH inside LET often outperforms repeated row-by-row UDFs.

Conclusion

Categorizing text with keywords in Excel transforms chaotic narrative data into actionable insights. By learning how to structure a clean keyword table, apply dynamic LET formulas, and manage edge cases, you unlock faster ticket routing, sharper sentiment analysis, and richer BI reporting—all without leaving the familiar spreadsheet environment. Mastering this technique not only saves hours of manual tagging but also strengthens your command over logical, text, and lookup functions. Keep experimenting with dynamic arrays, Power Query, and automation tools to extend these skills into even larger and more complex datasets. Your future analytics projects will thank you.

We use tracking cookies to understand how you use the product and help us improve it. Please accept cookies to help us improve.