How to Get Page From URL in Excel

Learn multiple Excel methods to get the page from a URL, with step-by-step examples and practical applications.

12 min read • Last updated: 7/2/2025

How to Get Page From URL in Excel

Why This Task Matters in Excel

A modern spreadsheet is far more than a static table of numbers. Product, marketing, and data teams routinely pull website analytics, SEO reports, e-commerce feeds, or cloud-based logs straight into Excel. Almost every one of those datasets contains a full web address in at least one column. If your analysis only needs the host or the top-level path, carrying around the entire string is wasteful and slows you down. More importantly, most business questions revolve around pages, not whole URLs:

  • Marketing wants to know which landing page generated the most leads.
  • E-commerce managers compare revenue by product page.
  • UX analysts test page performance to drive site optimizations.
  • Support teams look for the knowledge-base article that has the highest exit rate.

All of these scenarios require splitting each URL into parts so the page (often the final segment of the path) can be grouped, counted, or compared. Without the ability to isolate the page quickly, every pivot table, chart, or summarizing formula becomes cluttered, forcing extra manual cleanup that introduces errors.

Because Excel has powerful text and data-transformation features, it can do this parsing without external scripts or paid add-ins. Functions such as TEXTAFTER, TEXTBEFORE, FIND, and MID, combined with dynamic array behavior, give you one-cell solutions that automatically spill results for thousands of rows. Even if you are on an older version of Excel lacking the newest text functions, you can still get the job done with classic techniques like RIGHT and SUBSTITUTE, Flash Fill, or Power Query. Mastering the “get page from URL” task strengthens your general ability to manipulate strings, one of the most transferable skills in Excel.

Failing to learn this skill has tangible consequences. Colleagues will copy-paste raw URLs into separate columns, slowing refresh times and making models fragile. Dashboards will mis-label pages, under-report conversions, or duplicate metrics. Cross-functional data governance breaks down when every team slices URLs differently. By investing a few minutes to understand best-practice approaches, you safeguard data integrity and become faster at every report that depends on web addresses.

Best Excel Approach

The optimal, future-proof approach uses the newer TEXTAFTER and TEXTBEFORE functions, available in Microsoft 365 and perpetual Excel 2024. They are readable, short, and robust against common URL variations such as trailing slashes, query strings, or fragments.

Recommended logic:

  1. Strip off any query string or fragment. Cutting at the first question mark (?) is usually enough, because a fragment (#) almost always follows the query string; add a second cut at “#” if your URLs carry fragments on their own.
  2. From the remaining text, return everything after the last forward slash (/).
  3. Optionally remove an extension like “.html” or “.php” if you need only the slug.

The single-cell, spill-friendly formula looks like this:

=LET(
    url,A2,
    clean, TEXTBEFORE(url,"?",,,,url),
    page, TEXTAFTER(clean,"/",-1,,,clean),
    TEXTBEFORE(page,".",,,,page)
)

Explanation of parameters

  • url – full URL string coming from [A2].
  • clean – string up to, but not including, any query string; the if_not_found argument (the final url) returns the whole URL when there is no “?”.
  • page – last segment after the final slash; falls back to the whole cleaned string when there is no slash.
  • The final TEXTBEFORE drops an optional dot extension. If there is no dot, its if_not_found argument returns the whole segment.

Alternative for users without TEXTAFTER / TEXTBEFORE:

=TRIM(RIGHT(SUBSTITUTE(LEFT(A2,FIND("?",A2&"?")-1),"/",REPT(" ",99)),99))

That SUBSTITUTE-and-RIGHT trick mimics “last item after slash”: each slash becomes a block of 99 spaces, RIGHT grabs the last 99 characters, and TRIM strips the padding.
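
Assuming [A2] holds the hypothetical URL https://example.com/blog/my-post?utm=ad, the pieces evaluate like this (intermediate results shown for illustration):

=LEFT(A2,FIND("?",A2&"?")-1)        →  https://example.com/blog/my-post
=SUBSTITUTE(...,"/",REPT(" ",99))   →  "https:   example.com   blog   my-post" (each slash now a 99-space block)
=TRIM(RIGHT(...,99))                →  my-post

Appending “?” to A2 before FIND guarantees a match even when the URL has no query string.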

When to choose which method

  • Use TEXTAFTER / TEXTBEFORE if you have Microsoft 365 (or perpetual Excel 2024) because the formula is shorter, faster, and self-documenting.
  • Choose the SUBSTITUTE+RIGHT technique on earlier Excel versions, but be aware it is harder to debug.
  • Opt for Power Query if you must repeatedly refresh very large datasets and need a point-and-click UI.

Parameters and Inputs

  1. Source URL column
  • Data type: Text. Excel treats the entire URL, protocol prefix (http:// or https://) included, as a plain string.
  • Length: Up to 32,767 characters (Excel cell limit), but practical URLs are far shorter.
  2. Optional delimiter removal
  • Characters like “?” or “#” mark the start of query strings or fragments. Decide whether to preserve or strip them.
  • Path delimiter “/” is mandatory and cannot be changed inside the formula.
  3. Extension removal
  • TEXTBEFORE(page,".",,,,page) assumes a dot separates the slug from its extension.
  • If your URLs lack extensions (common with CMS-generated pages), you can omit this step.
  4. Validation rules
  • URL must contain at least one slash. If not, the if_not_found fallback returns the entire string as the page.
  • Blank cells propagate a blank result, ensuring spill ranges line up.
  5. Edge-case handling
  • Trailing slash “/” after the page: TEXTAFTER(clean,"/",-1) returns an empty segment, so supply a default label or strip the trailing slash first (see the troubleshooting tips in Example 1).
  • Multiple dots inside the slug (e.g., “my.page.name.html”): TEXTBEFORE(page,".") keeps only “my”; use the instance argument -1 – TEXTBEFORE(page,".",-1) – to keep the full slug except the last extension, as in the sketch after this list.
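
Putting that last bullet into practice, here is a sketch, assuming the same [A2] input, that keeps a multi-dot slug intact apart from its final extension (the unused note binding is just an inline comment trick for LET):

=LET(
    note,"sketch: keeps multi-dot slugs, drops only the final extension",
    url,A2,
    clean, TEXTBEFORE(url,"?",,,,url),
    page, TEXTAFTER(clean,"/",-1,,,clean),
    TEXTBEFORE(page,".",-1,,,page)
)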

Step-by-Step Examples

Example 1: Basic Scenario

Imagine a small marketing list containing five landing pages copied from Google Analytics:
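
For illustration, assume the five URLs look like this (sample addresses invented to match the results below):

  • https://www.example.com/careers/apply.html?ref=ad
  • https://www.example.com/recipes/how-to-cook
  • https://www.example.com/shop/product-123.html
  • https://www.example.com/about?utm=footer
  • https://www.example.com/contact.html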

Step-by-step:

  1. Enter the list in cells [A2:A6].
  2. In [B2], type the recommended formula, pointed at the whole input range:
=LET(
    url,A2:A6,
    clean, TEXTBEFORE(url,"?",,,,url),
    page, TEXTAFTER(clean,"/",-1,,,clean),
    TEXTBEFORE(page,".",,,,page)
)
  3. Confirm with Enter. Excel spills the result into [B2:B6]:
  • apply
  • how-to-cook
  • product-123
  • about
  • contact

Why it works

  • The first TEXTBEFORE removes parameters like “?ref=ad” and “?utm=footer”.
  • TEXTAFTER(clean,"/",-1) grabs the segment after the last slash, ignoring earlier folder levels.
  • The final TEXTBEFORE deletes “.html” while leaving extension-less slugs intact (“how-to-cook”).

Common variations

  • If some URLs do not contain a question mark, the if_not_found argument returns the whole text, so the logic still holds.
  • To keep the extension, remove the final TEXTBEFORE.
  • Add UPPER() or PROPER() around the last step to normalize text for presentation.

Troubleshooting tips

  • An error on empty input: wrap the reference in IF(A2="","",...) so blank cells return a blank instead of an error value.
  • Result is blank for “https://example.com/” because the page segment is empty. Provide default text with IF(result="","home page",result), as shown in the sketch below.
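
Both troubleshooting tips can live in one cell. A sketch, assuming the same [A2] input; the “home page” default is an arbitrary label:

=LET(
    note,"sketch: blank input stays blank, empty page segment gets a default label",
    url,A2,
    clean, TEXTBEFORE(url,"?",,,,url),
    page, TEXTAFTER(clean,"/",-1,,,clean),
    slug, TEXTBEFORE(page,".",,,,page),
    IF(url="","",IF(slug="","home page",slug))
)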

Example 2: Real-World Application

Scenario: An e-commerce analyst imports a CSV containing 20,000 checkout sessions. Column [D] holds full URLs that include category and product page information:

https://shop.online-store.com/collections/sale/hooded-sweatshirt-blue?variant=12345&utm_source=summer-campaign

Goal: Build a pivot table that aggregates revenue by product slug (“hooded-sweatshirt-blue”) to discover which garments sell the most during a promotion.

Data setup

  • Session sheet: [A] session_id, [B] customer_id, [C] revenue, [D] page_url.
  • Sheet contains mixed domain variants (http and https) and occasional trailing slashes.

Walkthrough

  1. Insert a new column [E] called page_slug.
  2. Because the workbook runs on Microsoft 365, paste this dynamic array formula in [E2]:
=LET(
    url,D2:D20001,
    clean, TEXTBEFORE(url,"?",,,,url),
    slug, TEXTAFTER(clean,"/",-1,,,clean),
    TEXTBEFORE(slug,".",,,,slug)
)
  3. Press Enter – Ctrl+Shift+Enter is not required; Excel spills results instantly for all 20,000 rows.
  4. Create a pivot table from the data range, selecting page_slug as Rows and SUM of revenue as Values (a formula alternative follows below).
  5. Sort descending to reveal top-grossing products.
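
If you prefer a live formula summary to a pivot table, a dynamic-array aggregation can rank slugs by revenue. A sketch, assuming the slugs spill from [E2] and revenue sits in [C2:C20001]:

=LET(
    note,"sketch: total revenue per slug, highest first",
    slugs, UNIQUE(E2#),
    rev, SUMIFS(C2:C20001,E2#,slugs),
    SORTBY(HSTACK(slugs,rev),rev,-1)
)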

Business impact

  • Manual text-to-columns step eliminated, enabling a single refresh button for future CSVs.
  • Analyst delivers weekly promotion reports twenty minutes faster.
  • Data team uses the same formula logic in Power Query to maintain consistency between BI dashboards and the Excel file.

Integration points

  • Conditional formatting highlights slugs with revenue below target thresholds.
  • A slicer tied to a campaign field allows stakeholders to filter results without touching formulas.

Performance considerations

  • TEXTAFTER and TEXTBEFORE are native, compiled functions and handle 20k rows almost instantly.
  • Keep formulas in a single spill range to avoid recalculating multiple duplicates across columns.

Example 3: Advanced Technique

Edge case: URLs from a content management system append language codes and timestamped cache-busters, for example:

https://portal.global.com/en-us/news/earnings-call.asp?v=20230915T2310Z#section2

Additional requirement: Return the page path that follows the language segment, not just the final slug, producing “news/earnings-call.asp”.

Advanced dynamic-array solution:

=LET(
    url,A2,
    clean, TEXTBEFORE(url,"?",,,,url),
    lang, TEXTAFTER(clean,"/",3),
    TEXTAFTER(lang,"/",1)
)

Explanation

  • TEXTAFTER with the instance argument 3 skips the first three slashes – the two in “https://” plus the one after the domain – leaving “en-us/news/earnings-call.asp”.
  • The second TEXTAFTER then drops the language segment (“en-us/”), yielding “news/earnings-call.asp”.

Professional tips

  • Wrap the language extraction in IFERROR to fall back gracefully when a URL lacks a language segment (see the sketch below).
  • Cache-buster variations using an ampersand (&v=) are dropped by the earlier TEXTBEFORE cut at “?”.
  • For huge log files, offload to Power Query: split the URL column by the “/” delimiter at each occurrence, remove the protocol and domain columns, then merge the remaining path columns back together.
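
One way to apply that IFERROR tip. A sketch, assuming the same [A2] input and that the language code is the first path segment; URLs without one fall back to the plain last-slug logic:

=LET(
    note,"sketch: language-aware path with a graceful fallback",
    url,A2,
    clean, TEXTBEFORE(url,"?",,,,url),
    IFERROR(TEXTAFTER(TEXTAFTER(clean,"/",3),"/",1), TEXTAFTER(clean,"/",-1,,,clean))
)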

Optimization

  • Store recurring logic in the Name Manager as a reusable LAMBDA function GETSLUG(). Your formula becomes simply =GETSLUG(A2), making sheets more readable; a sketch follows below.
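
A definition you could paste into Name Manager under the name GETSLUG – a sketch reusing the recommended logic; the name and fallback behavior are this article’s conventions, not anything built in:

=LAMBDA(url,
    LET(
        note,"sketch: same logic as the recommended formula",
        clean, TEXTBEFORE(url,"?",,,,url),
        page, TEXTAFTER(clean,"/",-1,,,clean),
        TEXTBEFORE(page,".",,,,page)
    )
)

Once defined, =GETSLUG(A2) returns one slug and =GETSLUG(A2:A6) spills one result per row.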

Tips and Best Practices

  1. Use LET to name sub-steps – Improves readability and performance by calculating each intermediate string once rather than multiple times.
  2. Normalize URLs before parsing – Force lowercase, remove “www.”, and standardize trailing slashes with LOWER and SUBSTITUTE to avoid duplicate slug variants (see the sketch after this list).
  3. Combine with dynamic arrays – Point the formula at a structured-table column (e.g., Table1[page_url]) so results auto-extend as rows grow, eliminating the need to copy formulas down.
  4. Turn parsing into a custom function – With LAMBDA, create GET_PAGE(URL) so non-technical teammates can use it without seeing the internal formula.
  5. Document assumptions – Add a comment or a note next to the formula explaining why extensions are stripped or parameters ignored. Future maintainers will thank you.
  6. Test edge cases – Paste unusual URLs such as those lacking extensions, having trailing slashes, or containing double slashes to verify robust results.
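
For tip 2, one way to normalize before parsing. A sketch, assuming URLs in [A2]; the cleanup rules (lowercase, drop “www.”, drop a trailing slash) are illustrative choices, not requirements:

=LET(
    note,"sketch: normalize a URL before extracting the slug",
    raw, LOWER(A2),
    nw, SUBSTITUTE(raw,"www.","",1),
    IF(RIGHT(nw,1)="/", LEFT(nw,LEN(nw)-1), nw)
)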

Common Mistakes to Avoid

  1. Forgetting to strip query strings – Pages like “/landing?utm=ad” and “/landing” will be treated as two different slugs, fragmenting metrics. Always cut everything after the question mark when your analysis does not need parameters.
  2. Double-calculating inside SUBSTITUTE chains – Repeating FIND inside multiple nested MID or RIGHT calls slows large sheets. Name the FIND result once with LET (see the sketch after this list).
  3. Ignoring trailing slashes – “/about/” versus “/about” look identical to humans but not to Excel. Use TEXTAFTER(clean,"/",-1) instead of a hard-coded position, and decide how to label the empty segment a trailing slash produces.
  4. Assuming every URL has an extension – Many modern CMSs omit “.html”. A plain TEXTBEFORE(page,".") returns #N/A when there is no dot; supply the if_not_found argument – TEXTBEFORE(page,".",,,,page) – so extension-less slugs pass through unchanged.
  5. Overusing volatile functions – INDIRECT or OFFSET might tempt you for dynamic ranges but recalculate constantly. Prefer structured tables and spill ranges for better performance.
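
To illustrate the fix for mistake 2, here is the classic padding trick refactored with LET so FIND runs only once. A sketch, assuming the URL in [A2]:

=LET(
    note,"sketch: FIND is evaluated once, then reused by name",
    cut, FIND("?",A2&"?")-1,
    path, LEFT(A2,cut),
    TRIM(RIGHT(SUBSTITUTE(path,"/",REPT(" ",99)),99))
)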

Alternative Methods

Below is a comparison of other viable approaches:

| Method | Pros | Cons | Best For |
| --- | --- | --- | --- |
| Power Query split columns | Point-and-click, transforms huge datasets efficiently, refreshable | Requires loading to Data Model or table, extra interface learning | Scheduled report refreshes, millions of rows |
| Flash Fill | One-click pattern recognition, no formulas visible | Static; must redo when source data updates, limited to obvious patterns | One-off cleaning tasks on small sets |
| VBA custom function | Works in all Excel versions, fully customizable | Requires macro-enabled workbook, potential security prompts | Legacy environments, advanced edge cases |
| Classic MID/FIND/RIGHT formula | Compatible with Excel 2010–2019 | Longer, harder to read, less flexible | Users on perpetual licenses lacking TEXTAFTER |

When to switch methods

  • Move to Power Query if you exceed roughly 100k rows or need multiple consecutive text transformations.
  • Use Flash Fill only if the list is static or will not refresh from external links.
  • VBA is a last resort for very complex parsing not covered by built-in functions or when you need to apply business rules like ignoring numeric suffixes.

FAQ

When should I use this approach?

Use it any time your analysis relies on grouping or filtering by page rather than full URL. Typical triggers include web analytics exports, marketing UTM reports, or server log reviews.

Can this work across multiple sheets?

Yes. Reference the source sheet explicitly, e.g., =GETSLUG('Raw Logs'!A2:A5000). The spill result will appear on the current sheet, allowing you to centralize parsing in one place.

What are the limitations?

TEXTAFTER and TEXTBEFORE require Microsoft 365 or perpetual Excel 2024. Older versions need alternative formulas or Power Query. Also, these text functions treat URLs purely as strings; they do not validate protocol syntax or punycode.

How do I handle errors?

Wrap the entire formula in IFERROR, providing a default like “invalid URL” so that #VALUE! messages do not break downstream pivot tables. Example: =IFERROR(GETSLUG(A2),"invalid URL").

Does this work in older Excel versions?

Yes, but you must replace TEXTAFTER and TEXTBEFORE with combinations of SUBSTITUTE, FIND, and RIGHT. Performance will be slower, and formulas harder to maintain.

What about performance with large datasets?

The dynamic-array functions are highly optimized. For very large logs (hundreds of thousands of rows), switch calculation to manual or migrate the heavy lifting to Power Query or the Data Model. Avoid volatile functions and keep all parsing in one spill range to reduce recalculation overhead.

Conclusion

Extracting the page from a URL is a deceptively small task that unlocks enormous analytical power. Whether you rely on modern dynamic arrays or classic text functions, mastering this technique lets you clean data faster, build more accurate reports, and collaborate consistently across teams. Continue exploring related skills—such as splitting domains, isolating query parameters, and normalizing protocols—to deepen your command of Excel’s text toolkit. With clean page slugs at your fingertips, your dashboards will be clearer, your insights sharper, and your workflows leaner.
