How to Categorize Text With Keywords in Excel
Learn multiple Excel methods to categorize text with keywords with step-by-step examples and practical applications.
How to Categorize Text With Keywords in Excel
Why This Task Matters in Excel
In today’s data-driven organizations, large volumes of unstructured or semi-structured text pour in from multiple sources: customer support tickets, product reviews, social-media mentions, research notes, marketing campaign logs, and even internal project descriptions. Decision-makers rarely have the time to read every single line. Instead, they need high-level insights such as “Which department should handle this ticket?”, “Is the review positive or negative?”, or “Which product line does this feedback reference?” That is exactly where keyword-based text categorization becomes mission-critical.
Imagine a customer-support manager who receives hundreds of ticket descriptions each day. By scanning descriptions for keywords like “refund”, “technical error”, or “login”, Excel can automatically route tickets to Finance, Engineering, or IT—reducing response time and improving customer satisfaction. In marketing analytics, you might classify social-media posts with keywords like “feature request”, “pricing complaint”, or “shipping delay” to identify trending issues. Human-resource teams commonly tag employee comments with categories such as “benefits”, “work-life”, or “leadership”.
Excel is uniquely positioned for this kind of quick classification for several reasons:
- Accessibility: Almost every business user already has Excel installed and knows the basics of filters and formulas.
- Flexibility: You can build anything from a one-off ad-hoc sheet that classifies ten entries to an enterprise-wide model that evaluates millions of rows via Power Query or dynamic array formulas.
- Integration: Excel connects natively to Outlook, SharePoint, and Power BI, letting you feed categorized text directly into dashboards or automated workflows.
- Transparency: Unlike opaque machine-learning models, a keyword rule built in Excel is auditable. Stakeholders can see exactly why a specific piece of text was flagged.
Failing to master this task leads to tedious manual work, inconsistent decision-making, and missed trends because valuable signals remain buried in text. Becoming proficient at keyword categorization also bolsters related Excel competencies: logical functions, text manipulation, dynamic arrays, lookup strategies, and data-quality controls. In short, it is a foundational skill that underpins broader analytics and reporting workflows.
Best Excel Approach
The most broadly compatible and transparent method is to store your category-keyword rules in a lookup table and use a dynamic array formula that searches for each keyword inside the text. The approach leverages three powerful functions:
- SEARCH (or FIND for case sensitivity) – locates a substring anywhere in a cell.
- XLOOKUP (or INDEX/MATCH in older Excel) – retrieves the category that corresponds to the first keyword match.
- LET – simplifies the formula by defining intermediate variables, improving readability and performance.
The logic:
- Evaluate every keyword against the target text.
- Identify the first keyword that appears (or all keywords, depending on requirement).
- Return the category linked to that keyword.
You will need two named ranges:
- [Keywords] – the list of keyword strings.
- [Categories] – the parallel list of corresponding category labels.
Recommended dynamic-array formula (Office 365 or Excel 2021):
=LET(
txt, A2,
hit, IF(ISNUMBER(SEARCH(TRANSPOSE(Keywords), txt)), Keywords, ""),
first, INDEX(hit, MATCH(TRUE, hit<>"", 0)),
XLOOKUP(first, Keywords, Categories, "Uncategorized")
)
Alternative for users without LET or dynamic arrays (Excel 2016 and earlier) involves an array-entered INDEX/MATCH:
=IFERROR(
INDEX(Categories, MATCH(TRUE, ISNUMBER(SEARCH(Keywords, A2)), 0)),
"Uncategorized"
)
(Confirm with Ctrl + Shift + Enter.)
When to choose each:
- Dynamic-array + LET — modern Excel, cleaner logic, spills automatically.
- Array-entered INDEX/MATCH — backward compatibility.
- Power Query or VBA — massive datasets, complex logic, or scheduled automation.
Prerequisites: ensure your keyword table has no blanks, and put the most specific keywords earlier if you care about precedence. Persist your categorization rules on a separate sheet for easy maintenance.
Parameters and Inputs
Before jumping into formulas, prepare your components carefully.
Target Text (Required)
- Data type: string, often stored in column A.
- Must be plain text; strip line breaks or non-printing characters where possible.
- Use TRIM and CLEAN if necessary.
Keywords (Required)
- Data type: string list in a single column, e.g., [D2:D20].
- Avoid duplicates; each keyword should map to only one category unless you plan on multi-category tagging.
Categories (Required)
- Text labels in a parallel list, e.g., [E2:E20].
- Ensure the order mirrors Keywords exactly.
Optional Inputs
- “Default Category” overriding the “Uncategorized” literal. This can be a cell reference for easy change.
- Case sensitivity: choose SEARCH (case-insensitive) or FIND (case-sensitive).
- Multiple keyword hits: decide whether to return the first, last, or all matches concatenated with TEXTJOIN.
- Stop words: you may want to ignore trivial keywords like “the”, “and”, etc.
Validation Rules
- Confirm every keyword appears only once to prevent ambiguous mapping.
- Check Categories column for blanks that could return empty results.
- Ensure Target Text column is not formatted as numbers or dates.
Edge Cases
- Overlapping keywords (e.g., “ship” vs “shipping”) can trigger unintended matches. Use word boundaries or order your list from longest to shortest.
- Very long text (over 32,767 characters) will truncate in Excel; split into multiple cells or use Power Query.
Step-by-Step Examples
Example 1: Basic Scenario
You run a small online store and collect customer feedback in column A:
[A2] “I want a refund, the shirt ripped after one wash”
[A3] “Excellent quality and fast shipping”
[A4] “Cannot log in, password reset not working”
Your keyword table lives on Sheet “Rules”:
[D2] refund | [E2] Finance
[D3] shipping | [E3] Logistics
[D4] login | [E4] Support
[D5] password | [E5] Support
Step 1 – Name the ranges
- Select [D2:D5] and create the name Keywords.
- Select [E2:E5] and name it Categories.
Step 2 – Enter the formula
In B2, enter:
=LET(
txt, A2,
hit, IF(ISNUMBER(SEARCH(TRANSPOSE(Keywords), txt)), Keywords, ""),
first, INDEX(hit, MATCH(TRUE, hit<>"", 0)),
XLOOKUP(first, Keywords, Categories, "Uncategorized")
)
The formula spills down automatically, returning:
B2 Finance
B3 Logistics
B4 Support
Why it works
- SEARCH returns a number when it finds a keyword.
- ISNUMBER converts that to TRUE/FALSE.
- IF returns the keyword text where TRUE.
- INDEX/MATCH picks the first non-blank.
- XLOOKUP translates the keyword into a category.
Common Variations
- Swap SEARCH with FIND if you need case sensitivity.
- Append a DEFAULT category cell reference in place of “Uncategorized” for easy updates.
Troubleshooting
- If every result is “Uncategorized”, confirm the named ranges are spelled correctly and that your keywords have no leading/trailing spaces.
- Use LEN(Keyword) to sort from longest to shortest to prevent “ship” matching “shipping”.
Example 2: Real-World Application
A global electronics company tracks 50,000 Amazon reviews in a worksheet. They need to classify each review into “Battery”, “Screen”, “Performance”, or “Customer Service” based on multiple language variants of keywords because reviews come in English, Spanish, French, and German.
Data Setup
- Reviews in [A2:A50001].
- Multilingual keyword table on sheet “Dictionary”:
Column D (Keywords) | Column E (Category)
batería | Battery
batterie | Battery
battery | Battery
pantalla | Screen
screen | Screen
écran | Screen
slow | Performance
lento | Performance
lent | Performance
warranty | Customer Service
garantie | Customer Service
Walkthrough
- Name [Dictionary!D2:D100] as Keywords and [Dictionary!E2:E100] as Categories.
- Add a named constant DefaultCat = \"Other\".
- Use the LET formula in column B:
=LET(
txt, A2,
kw, Keywords,
cat, Categories,
hits, IF(ISNUMBER(SEARCH(TRANSPOSE(kw), txt)), kw, ""),
firstHit, INDEX(hits, MATCH(TRUE, hits<>"", 0)),
XLOOKUP(firstHit, kw, cat, _DefaultCat_)
)
- Copy the formula down (or use dynamic spill if supported).
Business Impact
- Within minutes, 50k reviews are tagged and ready for pivot-table analysis showing that Battery issues dominate in Germany, while Performance complaints spike in Spanish markets.
- Management prioritizes a firmware update accordingly.
Integration
- Push the results into Power Pivot to build Power BI dashboards.
- Use conditional formatting to highlight “Customer Service” reviews for immediate attention.
Performance Considerations
- On 50k rows x 100 keywords, calculation time remains acceptable in modern Excel thanks to vectorized SEARCH calls.
- If performance degrades, limit the keyword range with dynamic arrays that automatically shrink to the last keyword entered or offload to Power Query.
Example 3: Advanced Technique
Scenario: A fintech startup receives chat transcripts containing multi-sentence messages. They need multi-label classification: a single message can belong to several categories (e.g., “Payment Failure” and “Password Reset”). They also want to maintain a log of which specific keywords matched.
Data Setup
- Transcripts in [A2:A20000].
- Keywords and Categories in [Keywords] and [Categories].
Step 1 – Return all matching keywords with TEXTJOIN
=LET(
txt, A2,
kw, Keywords,
matches, IF(ISNUMBER(SEARCH(TRANSPOSE(kw), txt)), kw, ""),
TEXTJOIN(", ", TRUE, FILTER(matches, matches<>""))
)
This provides a comma-separated list of keywords: “payment,error” for example.
Step 2 – Map keywords to categories and return unique list
=LET(
txt, A2,
kw, Keywords,
cat, Categories,
kMatch, FILTER(kw, ISNUMBER(SEARCH(TRANSPOSE(kw), txt))),
cMatch, UNIQUE(FILTER(cat, ISNUMBER(SEARCH(TRANSPOSE(kw), txt)))),
TEXTJOIN(", ", TRUE, cMatch)
)
Step 3 – Capture both keyword and category audit trail
=LET(
txt, A2,
kw, Keywords,
cat, Categories,
idx, ISNUMBER(SEARCH(TRANSPOSE(kw), txt)),
kList, FILTER(kw, idx),
cList, FILTER(cat, idx),
"Keywords: " & TEXTJOIN(", ", TRUE, kList) & CHAR(10) &
"Categories: " & TEXTJOIN(", ", TRUE, UNIQUE(cList))
)
(Enable wrap text to see line breaks.)
Performance Optimization
- SEARCH repeated twice could be expensive. Store idx in a variable (as above) and reuse.
- If working with 100,000 transcripts, process chunks or use Power Query’s Text.ContainsAny function with the keyword list.
Error Handling
- If no keywords match, FILTER will spill a #CALC! error. Wrap the FILTER call in IFERROR to return “Uncategorized”.
- Long TEXTJOIN results may exceed 32k character cell limit; write to separate rows or trimmed output.
Professional Tips
- Maintain the keyword table as an Excel Table object so dynamic range names expand automatically.
- Use a column for “Active/Inactive” flag in the keyword table and FILTER(Keywords, Active=\"Y\") to turn rules on or off without deleting rows.
- Record a macro to refresh classification after importing new transcript files.
Tips and Best Practices
- Store the keyword and category table on its own sheet, convert it to an Excel Table, and name the columns; this keeps formulas dynamic as rules grow.
- Order keywords from most specific to most generic when you are returning only the first hit. That prevents a generic word like “error” from capturing “payment error” entries that should map to Payment.
- Add helper columns such as “Last Updated”, “Priority”, or “Language” in the rule table for richer analytics later.
- Use LET to reduce duplicate SEARCH calls; every redundant call on 100,000 rows can noticeably slow a workbook.
- Combine formula-based categorization with conditional formatting to visually flag high-priority categories, improving team responsiveness.
- Document the sheet: add comments or a legend explaining how to maintain the keyword list so future users do not break the logic.
Common Mistakes to Avoid
- Duplicate Keywords: Having the same keyword linked to multiple categories causes ambiguous returns. Always deduplicate and decide precedence.
- Fix by using COUNTIF on the Keyword column; investigate any count above one.
- Out-of-Sync Lists: If Categories list becomes shorter than Keywords, INDEX/MATCH returns incorrect labels.
- Always resize both columns together or convert to a Table so they stay aligned.
- Missing Word Boundaries: The keyword “ship” matches “worship”.
- Enclose keywords in wildcard spaces like \" ship \" or use REGEX if available.
- Array Formula Not Entered Correctly (legacy Excel): Forgetting Ctrl + Shift + Enter causes only the first row to calculate.
- Confirm by looking for curly-brace indicators in the formula bar after entry.
- Case Sensitivity Confusion: Using FIND when you expected SEARCH (or vice versa) leads to missed matches.
- Decide your policy early, document it, and stick to one function consistently.
Alternative Methods
Although dynamic array formulas are the most straightforward, several other options may serve better in specific contexts.
| Method | Pros | Cons | Best For |
|---|---|---|---|
| Formula with LET + SEARCH | No add-ins, transparent, dynamic | Needs modern Excel; formula complexity grows with requirements | Day-to-day tagging up to a few hundred thousand rows |
| Power Query Text.ContainsAny | Handles millions of rows, refreshable, no formulas visible to break | Requires query knowledge; slightly less interactive | Scheduled ETL, data lakes, corporate BI pipelines |
| VBA User-Defined Function | Unlimited flexibility, can use Regular Expressions | Requires macro-enabled file, security prompts, maintenance burden | Specialized rules needing word boundaries or context windows |
| Flash Fill | Fast one-off manual pattern extraction | Not rules-based, can be inconsistent | Ad-hoc classifications on small lists |
| Office Script / Power Automate | Automates end-to-end cloud workflows | Needs Microsoft 365 E3/E5 or similar license | Integrating Excel classification into web services or email routing |
When to switch: If your sheet size crosses one million rows or rules require context beyond simple keyword search (like negation or proximity), consider Power Query or VBA. For heavily audited environments where macros are banned, stick to formula or Power Query only.
FAQ
When should I use this approach?
Use keyword categorization when you have unstructured text that maps neatly to predefined buckets, such as routing support tickets, summarizing survey free-text, or tagging compliance notes. It excels when the categories are clear and new keywords do not appear every day.
Can this work across multiple sheets?
Yes. Store your keyword table on a separate Config sheet and name the ranges. The formula referencing those names will work no matter which sheet the target text lives on. For multi-sheet results, convert your formula to an Excel Table and reference structured names so it spills correctly.
What are the limitations?
Formulas struggle with negation (e.g., “not working”) and phrases requiring context. They also cannot handle more characters than Excel’s cell limit of 32,767. Power Query or VBA with Regular Expressions is required for sophisticated logic like word boundaries or phrase distance.
How do I handle errors?
Wrap the main formula in IFERROR, returning \"Uncategorized\" or a blank. Use LEN and CLEAN on incoming text to avoid non-printing characters that break SEARCH. For legacy array formulas, remember to recalc with Ctrl + Shift + Enter after editing.
Does this work in older Excel versions?
Array-entered INDEX/MATCH with SEARCH functions back to Excel 2007. However, LET, FILTER, UNIQUE, TEXTJOIN, and dynamic spill require 365/2021. Adapt by replacing these with helper columns or VBA if stuck on older versions.
What about performance with large datasets?
Performance hinges on the number of keyword-text combinations evaluated. Reduce the keyword list with filters (e.g., by language), process data in batches, or move heavy lifting to Power Query. In modern Excel, vectorized SEARCH inside LET often outperforms repeated row-by-row UDFs.
Conclusion
Categorizing text with keywords in Excel transforms chaotic narrative data into actionable insights. By learning how to structure a clean keyword table, apply dynamic LET formulas, and manage edge cases, you unlock faster ticket routing, sharper sentiment analysis, and richer BI reporting—all without leaving the familiar spreadsheet environment. Mastering this technique not only saves hours of manual tagging but also strengthens your command over logical, text, and lookup functions. Keep experimenting with dynamic arrays, Power Query, and automation tools to extend these skills into even larger and more complex datasets. Your future analytics projects will thank you.
Related Articles
How to Show the 10 Most Common Text Values in Excel
Learn multiple Excel methods to list the 10 most frequent text values—complete with step-by-step examples, business use cases, and expert tips.
How to Abbreviate Names Or Words in Excel
Learn multiple Excel methods to abbreviate names or words with step-by-step examples and practical applications.
How to Abbreviate State Names in Excel
Learn multiple Excel methods to abbreviate state names with step-by-step examples, professional tips, and real-world applications.