Why storefront pages fetch the CMS before commerce, how caching makes that cheap, and when it's worth reordering the calls.
Remember waterfalls?
You ask for one thing. Then, only once that comes back, you ask for the next thing. Then the next. Each request politely waits in line for the one ahead of it, even when it has no reason to.
On a single page, in a single component tree, waterfalls are easy to create by accident. You await something, and everything below that line now lives after it in time — whether it needed to or not.
This post is about one specific waterfall. The one that shows up on almost every storefront built on a federation layer like the Alokai Middleware. It's small. It's defensible. And it's worth understanding precisely, because the moment one assumption changes, it stops being small.
Let's look at a product listing page.
Here's the shape of a category page. I'll strip it down to the part that matters:

    export default connectCmsPage(async function CategoryPage(props) {
      const { categoryData, productCatalog } = await buildCategoryPageData(props);
      return <CategoryGrid category={categoryData} products={productCatalog} />;
    });

Two things are happening here, and they're easy to read right past.
The first is connectCmsPage. It's a higher-order component — a wrapper. Before your CategoryPage body ever runs, the wrapper awaits the CMS. It fetches the page definition (getPage, or getPersonalizedPage when personalization is on), maps the locale, sets up preview-mode rerendering, wraps everything in the live-preview shell. All of that cross-cutting machinery lives in one place, and every page type — category, product, home, a plain CMS page — gets it for free by wrapping.
The second is buildCategoryPageData. Inside it, the commerce data is fetched in parallel, the way you'd want:

    const [categoryData, productCatalog] = await Promise.all([
      getCategory(categoryId),
      sdk.unified.searchProducts(searchProductsQuery),
    ]);

So getCategory and searchProducts race each other. Good. Nobody's waiting on nobody.
Except — look at where the two pieces sit relative to each other. The CMS fetch is in the wrapper. The commerce fetch is in the body. And the body is the wrapper's child. A child can't start until its parent has resolved.
[Figure: Promise.all is nested under the awaited CMS call, so it can't start early.]

So even though the product search doesn't need a single byte from the CMS, it can't begin until the CMS call is done. The products wait on the CMS, not because they depend on it, but because of where they live in the tree.
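Here's a minimal sketch of that nesting, so you can see the ordering rather than take it on faith. Nothing in it is the real Alokai code: `connectCmsPageSketch`, `fakeFetch`, and the `events` log are hypothetical stand-ins, and the "network" is a single microtask tick.

```typescript
// Hypothetical stand-ins: none of these are the real Alokai helpers.
const events: string[] = [];

// Simulates a network call: logs when it starts and when it resolves.
async function fakeFetch<T>(name: string, value: T): Promise<T> {
  events.push(`${name}:start`);
  await Promise.resolve(); // yield once, like a round-trip would
  events.push(`${name}:done`);
  return value;
}

type PageBody<R> = (cmsPage: { slug: string }) => Promise<R>;

// The wrapper awaits the CMS first, then runs the page body.
function connectCmsPageSketch<R>(body: PageBody<R>) {
  return async (slug: string): Promise<R> => {
    const cmsPage = await fakeFetch("cms", { slug });
    return body(cmsPage); // the body cannot start before this line
  };
}

const categoryPage = connectCmsPageSketch(async () => {
  // Parallel with each other, but both start after the CMS has resolved.
  const [category, products] = await Promise.all([
    fakeFetch("category", { id: "shoes" }),
    fakeFetch("search", ["sku-1", "sku-2"]),
  ]);
  return { category, products };
});

const waterfallRun = categoryPage("/category/shoes");
```

Once `waterfallRun` settles, `events` shows `cms:done` ahead of both `category:start` and `search:start`. The two commerce calls overlap each other, but neither overlaps the CMS.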
It would be tempting to call this a bug. It isn't. It's a tradeoff someone made on purpose, and to know whether it's the right one, you have to know how much that first hop actually costs.
Here's the thing about the CMS call: content doesn't change very often. A category's merchandised layout — the hero, the banners, the promo blocks — gets edited by a marketer maybe a few times a week. Which means getPage is almost always a cache hit. Something close and warm. Single-digit milliseconds.
So picture the timeline with real numbers. CMS is a cache hit at ~10ms. The product search is the slow one, because it's a real query against a search engine — call it ~200ms.
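The arithmetic behind that timeline fits in two small functions. This is a sketch under the post's illustrative numbers, not a measurement; the function names are made up for the example.

```typescript
// Illustrative latency arithmetic; the millisecond figures are the post's examples.
function cmsFirstLatency(cmsMs: number, commerceMs: number[]): number {
  // The CMS resolves first, then the commerce calls race each other.
  return cmsMs + Math.max(...commerceMs);
}

function fullyParallelLatency(cmsMs: number, commerceMs: number[]): number {
  // Everything starts at once, so the slowest call sets the total.
  return Math.max(cmsMs, ...commerceMs);
}

const cachedCmsFirst = cmsFirstLatency(10, [200]); // 210
const cachedParallel = fullyParallelLatency(10, [200]); // 200
const savedMs = cachedCmsFirst - cachedParallel; // 10
```

With a 10ms cache hit in front of a 200ms search, the waterfall costs 210ms against 200ms fully parallel. The whole prize for reordering is 10ms.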
I said getPage is "almost always a cache hit" and waved at "something warm." Let me not wave. Which cache, and why it's warm, is the whole ballgame — because personalization is about to switch some of these off, and you want to know exactly which lights go dark.
There isn't one cache in front of the CMS call. There are four, stacked, each catching what the one above it missed:
cache(): one render. Just stops the same render from fetching the same thing twice.

And there's a layer you don't even run: the CMS's own delivery API is already a CDN. Contentful's Content Delivery API is, in their words, "available via a globally distributed content delivery network (CDN)" that purges on publish; Contentstack serves its CDA from edge caches too, purging "only the changed content" when you publish. So for published pages, that "origin" at the bottom of the ladder is rarely a cold read. It's another edge hit, and it's very likely the real reason getPage is cheap in this demo even with no Redis installed. The documented exceptions line up exactly with the slow case: preview (a separate, non-CDN endpoint) and per-user content (nothing shared to cache) don't get that benefit.
The layer that makes "~10ms getPage" cheap on your own infrastructure — once you outgrow the provider's CDN, or go per-user — is Layer 3, Redis, and it's the one people forget. Alokai's Redis integration is an SDK module — sdk.redis.getOrSetCache(key, fetcher, { tags }). Unlike the edge caches, it's explicit (you wrap the call) and tag-based (you invalidate by meaning, not by clock). Wrap the CMS read in it and the first render populates Redis; every later render — across every SSR instance — reads from Redis until a CMS publish webhook invalidates the page: tag. You drop exactly the page that changed, the instant it changes.
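To make the populate, hit, invalidate cycle concrete, here's an in-memory stand-in. It is not Alokai's Redis module and makes no claim about how that module is implemented; it only borrows the `getOrSetCache(key, fetcher, { tags })` shape mentioned above. `TagCacheSketch`, `invalidateTag`, `fetchPageFromCms`, and the `page:/category/shoes` tag are all hypothetical.

```typescript
// In-memory stand-in for a tag-based cache. This is NOT Alokai's Redis module;
// it only mirrors the getOrSetCache(key, fetcher, { tags }) shape.
class TagCacheSketch {
  private entries = new Map<string, { value: unknown; tags: string[] }>();

  async getOrSetCache<T>(
    key: string,
    fetcher: () => Promise<T>,
    options: { tags: string[] },
  ): Promise<T> {
    const hit = this.entries.get(key);
    if (hit) return hit.value as T; // warm: the fetcher never runs
    const value = await fetcher(); // cold: go to the CMS once
    this.entries.set(key, { value, tags: options.tags });
    return value;
  }

  // What a CMS publish webhook would call: drop only entries with this tag.
  invalidateTag(tag: string): void {
    this.entries.forEach((entry, key) => {
      if (entry.tags.includes(tag)) this.entries.delete(key);
    });
  }
}

let cmsOriginReads = 0;
const fetchPageFromCms = async (slug: string) => {
  cmsOriginReads += 1; // counts real CMS round-trips
  return { slug, blocks: ["hero", "banner"] };
};

async function tagCacheDemo(): Promise<number> {
  const cache = new TagCacheSketch();
  const slug = "/category/shoes";
  const read = () =>
    cache.getOrSetCache(`page:${slug}`, () => fetchPageFromCms(slug), {
      tags: [`page:${slug}`],
    });
  await read(); // miss, populates the cache
  await read(); // hit, no CMS call
  cache.invalidateTag(`page:${slug}`); // publish webhook fires
  await read(); // miss again, exactly once
  return cmsOriginReads;
}
```

Three reads, two origin round-trips: one to warm the entry, one after the publish. Every read in between is served from the cache, which is the behavior the CMS-first ordering quietly depends on.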
Which is exactly the setup for what comes next. Because the moment you personalize, you start switching these layers off, from the top down.
Now turn on personalization. But "personalization" isn't one thing — and the difference is the whole story for caching.
This isn't a hand-wave — it's how the products actually work. Contentful's edge rendering says the list of assigned Experiences "can serve as a cache key so that subsequent visitors can be performantly served the same combination of Experience content." Contentstack likewise "caches the personalized web page against the request URL and the applied variants." The key is the segment, not the person — so a thousand shoppers in the same audience share one cached page.
So the real lever isn't "personalized or not." It's the cardinality of the cache key. Anonymous is one key. Segmented is N keys — fine while N stays small. Per-user is roughly one key per visitor, and that is the only case where the cache genuinely can't help. (Watch for combinatorial blow-up, though: stack enough audiences × geo × A/B tests and N quietly explodes, thinning the traffic per variant until the hit rate sags back toward the 1:1 cost.)
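A rough sketch of what "cardinality of the cache key" means in practice. The key format, the `aud:` / `geo:` labels, and the dimension sizes are invented for illustration; real products build their keys their own way.

```typescript
// Hypothetical key builder: the key is the variant combination, not the visitor.
function segmentCacheKey(path: string, variants: string[]): string {
  // Sort so that the same combination always yields the same key.
  return `${path}|${[...variants].sort().join(",")}`;
}

// Upper bound on distinct keys per page: the product of each dimension's size.
function variantCount(dimensionSizes: number[]): number {
  return dimensionSizes.reduce((total, size) => total * size, 1);
}

// Two shoppers in the same audience and region resolve to the same key.
const shopperA = segmentCacheKey("/category/shoes", ["geo:de", "aud:runners"]);
const shopperB = segmentCacheKey("/category/shoes", ["aud:runners", "geo:de"]);

const smallSpace = variantCount([3, 2]); // 3 audiences x 2 tests = 6 keys
const blownUpSpace = variantCount([8, 30, 4]); // audiences x geo x tests = 960 keys
```

Six keys per page stay warm on modest traffic. Nine hundred and sixty keys split the same traffic so thinly that most requests land on a cold entry, which is the blow-up the parenthetical warns about.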
The expensive case below is that last one — true 1:1. When the page is a function of this user's identity, cart, and history, there's nothing shared to cache, so getPersonalizedPage becomes an origin round-trip with a session attached. Call it ~180ms, and it sits fully in front of the commerce calls.
Same code. Same tree. Same waterfall. But the cost went from ~10ms to ~180ms, because the assumption underneath it — "the CMS call is basically free" — quietly stopped being true. The page didn't change. The cache behavior did.
The per-user page can't sit on a shared edge — layer 1, dark. The per-user data can't either — layer 2, dark. And Redis can't share a per-user entry; at best it caches per user, a sliver of the hit rate — layer 3, mostly dark. You fall all the way through to the origin. That isn't a metaphor; it's the literal mechanism.
And there's a sharper point hiding here, one that should make you more cautious about parallelizing, not less. On a personalized page you may not even know you're fetching the right products yet. With B2B, contract pricing, or customer-group catalogs, a commerce call that carries a session can return different prices, and a different set of visible products, per shopper. If the personalized CMS is what decides the merchandising, then firing searchProducts in parallel risks fetching the wrong catalog, not merely a catalog you discard. Now you're not paying for wasted work, you're paying for incorrect work, plus the refetch, which spends the exact latency you parallelized to save. (In the code we looked at, the product query is built from props, not the CMS payload, so this only bites once personalization actually reshapes the product context. But the moment it does, it's the best argument on the page for staying CMS-first.)
The federation layer's whole job is to sit between the storefront and a pile of backends — CMS here, commerce there, search somewhere else — and decide how to talk to all of them. Which means it's exactly the layer that could fetch them together. If the three independent calls ran in one batch — by hoisting the commerce promises above the CMS await, or by adding a middleware endpoint that orchestrates all three server-side — you'd get this:
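As a sketch of the first option, hoisting the commerce promises above the CMS await, assuming the product query really is independent of the CMS payload. `tracedCall`, `renderCategoryHoisted`, and `hoistLog` are hypothetical names, and the round-trip is again a single microtask tick rather than a real request.

```typescript
// Hypothetical stand-ins; a sketch of hoisting, not the Alokai implementation.
const hoistLog: string[] = [];

async function tracedCall<T>(name: string, value: T): Promise<T> {
  hoistLog.push(`${name}:start`);
  await Promise.resolve(); // stand-in for the network round-trip
  hoistLog.push(`${name}:done`);
  return value;
}

async function renderCategoryHoisted(categoryId: string) {
  // 1. Start the commerce calls first, without awaiting them.
  const commerce = Promise.all([
    tracedCall("category", { id: categoryId }),
    tracedCall("search", ["sku-1", "sku-2"]),
  ]);
  // Avoid an unhandled rejection if we return before awaiting `commerce`.
  commerce.catch(() => undefined);

  // 2. Await the CMS while the commerce calls are already in flight.
  const cmsPage = await tracedCall("cms", { notFound: false });
  if (cmsPage.notFound) return null; // note: the search has already been fired

  // 3. Collect the commerce results, which have been running all along.
  const [category, products] = await commerce;
  return { cmsPage, category, products };
}

const hoistedRun = renderCategoryHoisted("shoes");
```

In `hoistLog`, `search:start` now appears before `cms:start`, so the total is bounded by the slowest call instead of the sum. The early `return null` is the catch: by the time the CMS says "not found", the product search has already been sent.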
So why isn't it just always done this way? Because that saving is free only in latency. It is not free in everything else.
Here's the whole trade laid out, because the latency win is real but it is not the only column.
| Dimension | Parallelizing wins | Parallelizing costs |
|---|---|---|
| TTFB, cached CMS | ~10ms saved — i.e. nothing | Complexity for no real payoff |
| TTFB, uncached / personalized | The whole CMS hop (~180ms) | — |
| Preview / personalization machinery | — | connectCmsPage centralizes preview rerender, the personalized→default fallback, locale mapping. Parallelizing means duplicating that per route or threading un-awaited promises through the wrapper |
| Consistency | — | Every page fetches CMS the same way today. The PLP would now diverge from the pure CMS page — which must stay CMS-first, because its SKUs come out of the CMS payload (a real dependency) |
| CMS short-circuit | — | Today, CMS can return notFound, or a fully custom override, and you skip the commerce fetch. Fetch in parallel and you've already fired searchProducts for a page you might not render |
| Wasted backend load | — | On that 404 / override, the product search ran for nothing |
| Correctness under personalization | — | If pricing or catalog visibility depends on the session/CMS context, a parallel searchProducts can fetch the wrong products, forcing a refetch that erases the latency win |
| Cache granularity (merged-endpoint variant) | One page, one cache key | The merged response inherits the shortest TTL — volatile price/stock — so you lose the cheap, independent, long-lived CMS cache (e.g. a tag-invalidated Redis entry that outlives the commerce data) |
| Failure isolation | — | CMS failure and commerce failure are handled separately today. Merge them and you need explicit partial-failure handling or it's all-or-nothing |
| Reusability | Fix it once in the shared wrapper, every page benefits | Only PLP-type pages qualify; the pure CMS page can't be parallelized at all |
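The failure-isolation row is the easiest one to underestimate, so here's a hedged sketch of what "explicit partial-failure handling" could look like once the calls run together. It uses the standard `Promise.allSettled`; the loader name, the types, and the fallback choices (empty layout, empty grid) are assumptions made for the example, not what the demo does.

```typescript
type CmsPageSketch = { blocks: string[] };
type ProductSketch = { sku: string };

// Hypothetical loader: the two fetchers are passed in so the sketch stays self-contained.
async function loadWithPartialFailure(
  fetchCms: () => Promise<CmsPageSketch>,
  fetchProducts: () => Promise<ProductSketch[]>,
) {
  // allSettled never rejects, so one failing backend cannot sink the other.
  const [cms, products] = await Promise.allSettled([fetchCms(), fetchProducts()]);

  const errors: string[] = [];
  if (cms.status === "rejected") errors.push(`cms: ${String(cms.reason)}`);
  if (products.status === "rejected") errors.push(`commerce: ${String(products.reason)}`);

  return {
    // Assumed fallback: an empty layout rather than a failed page.
    cmsPage: cms.status === "fulfilled" ? cms.value : { blocks: [] },
    // Assumed fallback: render the content with an empty grid.
    products: products.status === "fulfilled" ? products.value : [],
    errors,
  };
}
```

Swap `allSettled` for `Promise.all` and a commerce outage takes the CMS content down with it. That is the all-or-nothing behavior the table warns about.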
A design that's correct under an assumption is only as correct as the assumption. The CMS-first ordering is right because CMS reads are cheap and cacheable — and we now know "cheap" means a warm, shared, tag-invalidated cache is sitting behind the federation layer. So the thing to actually do is not "parallelize everything" and not "leave it alone forever." It's: measure the assumption.
A cold or absent Redis layer will make the CMS hop look expensive no matter how you order the calls — and so will a page that's truly per-user when it didn't need to be. Check the cache-key cardinality before you blame the waterfall: if the page is segmented, it's cacheable and the hop should be cheap; if it's genuinely 1:1, no ordering trick brings the shared caches back. Only once the cache is warm, the page is as shared as it can be, and the hop is still slow does reordering start to pay. Watch getPage / getPersonalizedPage latency at p75 and p95, and let the numbers tell you which world you're in.
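If your tracing tool doesn't already report percentiles, a nearest-rank calculation is enough to tell the two worlds apart. The sample timings below are made up to show the shape of the problem; they are not from any real trace.

```typescript
// Nearest-rank percentile over a sample of latencies in milliseconds.
function percentile(samplesMs: number[], p: number): number {
  if (samplesMs.length === 0) throw new Error("no samples");
  const sorted = [...samplesMs].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length); // 1-based rank
  return sorted[Math.max(rank, 1) - 1];
}

// Made-up getPage timings: mostly cache hits, with a couple of origin reads.
const getPageSamplesMs = [8, 9, 9, 10, 10, 11, 12, 12, 175, 190];

const p75 = percentile(getPageSamplesMs, 75); // 12
const p95 = percentile(getPageSamplesMs, 95); // 190
```

A p75 of 12ms next to a p95 of 190ms says the cache is doing its job for most requests and a minority are falling through to the origin. An average would have hidden that split.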
Most cache problems come down to something small — a single header or key that quietly flips a shareable response into a per-user one. Here's what keeps the layers warm, and what silently turns them off.
| ✅ Do | ❌ Don't | Why it matters |
|---|---|---|
| Set cookies on a separate, uncached request, and strip Set-Cookie from pages you want shared-cached. | Return Set-Cookie on a page you expect the CDN to cache. | Most CDNs treat Set-Cookie as a cache-killer and return a BYPASS on every request. |
| Keep Vary minimal (e.g. Accept-Encoding); use Cache-Control: private for genuinely personal pages. | Add Vary: Cookie (or Vary: Authorization). | Every unique cookie value becomes its own cache entry, a well-known hit-rate killer. |
| Key shared pages on the segment / variant combo, and route authenticated traffic to no-store (the demo already does this for /cart, /checkout, /my-account). | Put session ids or auth tokens into the URL or cache key. | Responses to requests carrying Authorization must not be reused from a shared cache unless they are explicitly marked shareable; per-user keys collapse the hit rate to ~1 entry per visitor. |
| Strip or normalize non-semantic params (utm_*, fbclid, gclid) and round volatile inputs before they reach the key. | Let unbounded query strings flow into the cache key untouched. | Each unique string is a separate entry. Contentful notes even exact geo coordinates "can't take advantage of our caching layer"; round them to 2–3 decimals. |
| Invalidate by tag / surrogate key: Redis getOrSetCache({ tags }) plus the CMS's purge-on-publish. | Rely on short TTLs alone to paper over staleness. | Tag purge drops exactly what changed; Contentful and Contentstack purge only the changed entry on publish, leaving the rest warm. |
| Add stale-while-revalidate (and stale-if-error) so refreshes happen in the background. | Ship hot pages as max-age=0, must-revalidate with no SWR. | Otherwise every expiry blocks a real user on the origin instead of serving slightly-stale instantly. |
| Prefer bounded segments / "experiences" so the page stays shared-cacheable, and keep the variant space small. | Make a page truly 1:1 unless it has to be, or let audiences × geo × tests multiply unchecked. | Segment variants share one cached entry (key = variant combo); true per-user, or a combinatorial blow-up of variants, forfeits the shared layers. |
| Serve shoppers from the delivery API (CDN-backed); keep preview for editors only. | Route live shopper traffic through the Preview API. | Preview is a separate, non-CDN endpoint (preview.contentful.com) and won't cache. It's for drafts, not scale. |
| Verify real cache status (X-Cache: HIT/MISS, Age) at p95 in production. | Assume "it's cached" because you set a header. | An upstream cookie or Vary can silently bypass the cache; CDNs expose HIT/MISS headers precisely so you can confirm. |
| Tie the cache-busting id to the release (GIT_SHA) so deploys rotate keys cleanly. | Reuse a static busting id across deploys. | A stale id serves yesterday's federated data after a release; a per-release id invalidates it the moment you ship. |
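The "strip, normalize, round" row is the one most worth seeing in code. This is a minimal sketch using the standard `URL` API; the `lat` / `lng` parameter names, the two-decimal rounding, and the `shop.example` host are assumptions for the example, and the tracking-param list is not exhaustive.

```typescript
// Params that never change the response; the list is illustrative, not exhaustive.
const NON_SEMANTIC_PARAMS = new Set(["fbclid", "gclid"]);

function normalizeCacheKey(rawUrl: string): string {
  const url = new URL(rawUrl);
  const kept: [string, string][] = [];

  url.searchParams.forEach((value, name) => {
    if (name.startsWith("utm_") || NON_SEMANTIC_PARAMS.has(name)) return;
    // Round coordinates to 2 decimals so nearby shoppers share an entry.
    const isCoordinate = name === "lat" || name === "lng";
    kept.push([name, isCoordinate ? Number(value).toFixed(2) : value]);
  });

  // Sort so that parameter order does not create duplicate entries.
  kept.sort(([a], [b]) => a.localeCompare(b));
  const query = kept.map(([k, v]) => `${k}=${encodeURIComponent(v)}`).join("&");
  return query ? `${url.pathname}?${query}` : url.pathname;
}

const keyA = normalizeCacheKey(
  "https://shop.example/category/shoes?utm_source=mail&sort=price&lat=52.520008&lng=13.404954",
);
const keyB = normalizeCacheKey(
  "https://shop.example/category/shoes?lng=13.40&gclid=abc&lat=52.52&sort=price",
);
```

Both URLs collapse to `/category/shoes?lat=52.52&lng=13.40&sort=price`, so a newsletter click and an ad click from the same neighborhood hit the same cache entry instead of minting two.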
Vary: sneaking per-user identity into a key that should have been shared.

Some commerce data should never sit in a shared cache, not because of latency, but because caching it is wrong: it serves one shopper another's data, oversells stock you don't have, or quotes a price the customer was never entitled to. Alokai's own guidance splits the offenders into two buckets, time-sensitive and session-specific data, and both are commerce-specific.
| ❌ Never shared-cache | Commerce risk if you do | ✅ Do instead |
|---|---|---|
| Cart contents | The classic leak: shopper B is served shopper A's cached cart, or a stale total. A privacy breach, not just a bug. | private, no-store; fetch the cart client-side after load. |
| Checkout & payment pages | Contain addresses, payment details, and user-specific tax/shipping. Caching them risks exposing data and breaking PCI compliance. | private, no-store — always origin. (The demo marks /checkout exactly this way.) |
| Account, order history, wishlist | Pure per-user PII; one cache key per shopper at best, cross-user leakage at worst. | no-store; render from the session on the client. |
| Live stock / inventory | Over-cache and you show "in stock" on a sold-out item — overselling, cancellations, angry customers. | Very short TTL, or fetch fresh at the PDP/cart; don't bake stock into long-cached HTML. |
| Customer-group / contract (B2B) pricing & promotions | The wrong shopper sees a price or discount they were never entitled to — a margin and, sometimes, legal problem. | Cache the shared shell with base price; resolve the entitled price per-segment key or client-side. |
| Personalized recommendations / recently viewed | Per-user by definition; a shared cache serves the last visitor's "Products you may like." | Defer to a client-side fetch with a skeleton to hold the layout. |
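One way to keep that table enforceable is to decide the Cache-Control header in a single function rather than per page. A sketch, assuming the three private route prefixes the demo uses; the TTL values are placeholders, and `cacheControlFor` is a hypothetical helper, not part of any SDK.

```typescript
// Routes the post lists as never shared-cacheable; prefixes are from the demo.
const PRIVATE_PREFIXES = ["/cart", "/checkout", "/my-account"];

function cacheControlFor(pathname: string): string {
  const isPrivate = PRIVATE_PREFIXES.some(
    (prefix) => pathname === prefix || pathname.startsWith(`${prefix}/`),
  );
  if (isPrivate) return "private, no-store";
  // Shared shell: the TTL values are placeholders, tune them from your own traces.
  return "public, s-maxage=300, stale-while-revalidate=60, stale-if-error=86400";
}
```

`cacheControlFor("/checkout/payment")` returns `private, no-store`, while `/category/shoes` gets the shared policy. Matching on the prefix plus a slash keeps an unrelated route like `/cartoon` from being swept into the private bucket.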
And here's the part that's really about federation, not just caching: the cleanest rule isn't to maintain a per-endpoint blocklist — it's to not federate session- or time-sensitive data into the cached page at all. Keep the SSR payload identical for every shopper, and pull the personal, volatile bits in a second step on the client (Alokai's docs show exactly this — a useEffect fetch behind a skeleton). The shared shell caches beautifully; the per-user fragment never touches the cache. The same logic kills the tempting "one merged endpoint for the whole page" idea: fuse volatile price/stock onto stable content and the combined entry inherits the shortest TTL, so you've cached the slow-changing content for as long as the fastest-changing field allows — which is to say, barely.
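The "inherits the shortest TTL" point is just a minimum over the parts. A tiny sketch with invented TTLs (none of these numbers come from the demo):

```typescript
// TTLs in seconds; the values are illustrative, not measured.
const fragmentTtlSeconds = {
  cmsLayout: 86400, // content edited a few times a week
  price: 60,
  stock: 15,
};

// A merged response can only be cached as long as its most volatile field.
function mergedTtl(ttls: Record<string, number>): number {
  return Math.min(...Object.values(ttls));
}

const mergedEntryTtl = mergedTtl(fragmentTtlSeconds); // 15
```

Kept separate, the layout could live for a day. Fused with stock, it gets fifteen seconds.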
Built on the Alokai Middleware. The numbers here are illustrative — yours are in your traces. Go look at them.