perf(util): memoize parseUrl to avoid redundant REGEX_URL runs#246
perf(util): memoize parseUrl to avoid redundant REGEX_URL runs#246caffodian wants to merge 1 commit into
Conversation
parseUrl runs REGEX_URL, a large URL-parsing regex, and the same `src` is commonly parsed many times: image loaders (e.g. next-cloudinary's <CldImage> via next/image) call the loader once per srcset candidate width — typically 6–8x — on an identical src, on every render, and the same URLs recur across renders/requests. Under server-side-render load this made REGEX_URL the single largest JS CPU consumer we observed in production profiling. parseUrl is pure (src -> parts), so memoize it with a bounded (LRU, 5000-entry) Map keyed by src. Output and error behavior are unchanged: - each call returns a fresh object, including freshly-copied `transformations` and `queryParams`, so callers can mutate the result without affecting the cache (matches the previous "new object per call" contract); - invalid input throws exactly as before and failures are never cached; - a guarded undefined branch is never cached or copied. Collapses the per-render srcset duplicates to a single regex execution per distinct URL. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
|
@caffodian is attempting to deploy a commit to the Cloudinary DevX Team on Vercel. A member of the Team first needs to authorize it. |
|
Disclosure: the PR description above is an AI-generated summary of an investigation, so please treat the specific figures as indicative rather than a rigorous upstream benchmark. Long story short: we applied a local patch for this and it seems promising. On pages with a lot of Cloudinary React components — especially under Next.js SSR — this one regex was responsible for a massive amount of CPU time. Memoizing it helped a lot in our testing. That said, I'm genuinely not sure whether this belongs here in 🤖 This comment was written by Claude Code on behalf of @caffodian. |
Description
parseUrlrunsREGEX_URL— a large URL-parsing regex — and the samesrcis commonly parsed many times:srcsetcandidate width (typically 6–8×) on an identicalsrc, on every render — e.g.next-cloudinary's<CldImage>rendering throughnext/image.getPublicId/getTransformationseach callparseUrlagain on the same URL too.Under server-side-render load this made
REGEX_URLthe single largest JS CPU consumer we observed in production profiling of an image-heavy page — it was being re-run on identical strings with no caching.parseUrlis pure (src→ parts), so this memoizes it with a bounded LRU (Mapkeyed bysrc, 5000-entry cap, least-recently-used eviction). On a representative image-heavy SSR workload this collapsed the per-rendersrcsetduplicates to a single regex execution per distinct URL — ~97% ofparseUrlCPU eliminated — with no change to output or error behavior.Behavior is preserved
transformationsandqueryParams, so callers can mutate the result without affecting the cache (matches the previous "new object per call" contract).undefinedbranch is never cached or copied.parseUrlchanges internally; the existing logic is preserved verbatim as a privateparseUrlUncached. Public API is unchanged.Tests: added a
parseUrl memoizationblock (cache hit serves repeats without re-parsing — verified via adecodeURIComponentspy seam; deep-copy independence so a mutated result can't poison the cache; failures re-throw and aren't cached). All existingparseUrltests pass unchanged.tsc,eslint,tsupbuild, andvitestall green locally.Issue Ticket Number
No existing issue — happy to open a tracking issue if you'd prefer. Filed as a performance fix surfaced by production profiling.
Type of change
Checklist
🤖 This PR was written by Claude Code on behalf of @caffodian.