perf(util): memoize parseUrl to avoid redundant REGEX_URL runs by caffodian · Pull Request #246 · cloudinary-community/cloudinary-util

caffodian · 2026-06-05T17:30:55Z

Description

parseUrl runs REGEX_URL — a large URL-parsing regex — and the same src is commonly parsed many times:

Image loaders call the loader once per srcset candidate width (typically 6–8×) on an identical src, on every render — e.g. next-cloudinary's <CldImage> rendering through next/image. getPublicId / getTransformations each call parseUrl again on the same URL too.
The same image URLs recur across renders and across requests.

Under server-side-render load this made REGEX_URL the single largest JS CPU consumer we observed in production profiling of an image-heavy page — it was being re-run on identical strings with no caching.

parseUrl is pure (src → parts), so this memoizes it with a bounded LRU (Map keyed by src, 5000-entry cap, least-recently-used eviction). On a representative image-heavy SSR workload this collapsed the per-render srcset duplicates to a single regex execution per distinct URL — ~97% of parseUrl CPU eliminated — with no change to output or error behavior.

Behavior is preserved

Each call returns a fresh object, including freshly-copied transformations and queryParams, so callers can mutate the result without affecting the cache (matches the previous "new object per call" contract).
Invalid input throws exactly as before, and failures are never cached.
A guarded undefined branch is never cached or copied.
Only parseUrl changes internally; the existing logic is preserved verbatim as a private parseUrlUncached. Public API is unchanged.

Tests: added a parseUrl memoization block (cache hit serves repeats without re-parsing — verified via a decodeURIComponent spy seam; deep-copy independence so a mutated result can't poison the cache; failures re-throw and aren't cached). All existing parseUrl tests pass unchanged. tsc, eslint, tsup build, and vitest all green locally.

Issue Ticket Number

No existing issue — happy to open a tracking issue if you'd prefer. Filed as a performance fix surfaced by production profiling.

Type of change

Bug fix (non-breaking change which fixes an issue) — specifically a performance fix; no API or behavior change
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to not work as expected)
Fix or improve the documentation
This change requires a documentation update

Checklist

I have followed the contributing guidelines of this project (conventional-commit message for semantic-release)
I have created an issue ticket for this PR (see note above — happy to file one)
I have checked to ensure there aren't other open Pull Requests for the same update/change
I have performed a self-review of my own code
I have run tests locally to ensure they all pass
I have commented my code, particularly in hard-to-understand areas
I have made corresponding changes needed to the documentation (none needed — no API change)

🤖 This PR was written by Claude Code on behalf of @caffodian.

parseUrl runs REGEX_URL, a large URL-parsing regex, and the same `src` is commonly parsed many times: image loaders (e.g. next-cloudinary's <CldImage> via next/image) call the loader once per srcset candidate width — typically 6–8x — on an identical src, on every render, and the same URLs recur across renders/requests. Under server-side-render load this made REGEX_URL the single largest JS CPU consumer we observed in production profiling. parseUrl is pure (src -> parts), so memoize it with a bounded (LRU, 5000-entry) Map keyed by src. Output and error behavior are unchanged: - each call returns a fresh object, including freshly-copied `transformations` and `queryParams`, so callers can mutate the result without affecting the cache (matches the previous "new object per call" contract); - invalid input throws exactly as before and failures are never cached; - a guarded undefined branch is never cached or copied. Collapses the per-render srcset duplicates to a single regex execution per distinct URL. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

vercel · 2026-06-05T17:30:59Z

@caffodian is attempting to deploy a commit to the Cloudinary DevX Team on Vercel.

A member of the Team first needs to authorize it.

caffodian · 2026-06-05T17:37:06Z

Disclosure: the PR description above is an AI-generated summary of an investigation, so please treat the specific figures as indicative rather than a rigorous upstream benchmark.

Long story short: we applied a local patch for this and it seems promising. On pages with a lot of Cloudinary React components — especially under Next.js SSR — this one regex was responsible for a massive amount of CPU time. Memoizing it helped a lot in our testing.

That said, I'm genuinely not sure whether this belongs here in @cloudinary-util/util vs. branching the logic higher up in the React/Next component (or the loader) layer, where the per-srcset-width duplication actually originates. Happy to move it if you'd prefer that shape — wanted to open the discussion with a working, tested version of the in-library approach.

🤖 This comment was written by Claude Code on behalf of @caffodian.

caffodian marked this pull request as draft June 5, 2026 17:33

caffodian marked this pull request as ready for review June 5, 2026 17:40

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

perf(util): memoize parseUrl to avoid redundant REGEX_URL runs#246

perf(util): memoize parseUrl to avoid redundant REGEX_URL runs#246
caffodian wants to merge 1 commit into
cloudinary-community:mainfrom
caffodian:perf/memoize-parseurl

caffodian commented Jun 5, 2026

Uh oh!

vercel Bot commented Jun 5, 2026

Uh oh!

caffodian commented Jun 5, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

caffodian commented Jun 5, 2026

Description

Behavior is preserved

Issue Ticket Number

Type of change

Checklist

Uh oh!

vercel Bot commented Jun 5, 2026

Uh oh!

caffodian commented Jun 5, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant