How We Built a Zero-API Online Visit for Super Bowl Traffic

13 min read

Written by: Andy Dybionka

Updated: Apr 29, 2026


This is Part 2 of our Super Bowl scaling series. If you haven't read Part 1: Where do you start when you need to scale everything?, start there for context on our scaling strategy and the "Async All The Things" philosophy that drove this work.


When I first heard about the “No-V” idea, it sounded crazy.

Our entire online visit (OV) application was designed around an authenticated user created early in the flow. Every piece of information was submitted to the server the moment a user answered a question. Some steps performed multiple API calls - eligibility checks, saving profile information, fetching calculated prices, performing phone verification, and more. The whole system assumed it could talk to the backend freely, on every single interaction.

And now someone was suggesting we build a version that talks to the backend... almost never. Not the OV… the “No-V”. 

Given a strict and immovable deadline - Super Bowl 2026 - we decided to start with one condition and potentially expand to others later. We also wanted the ability to toggle No-V on and off via feature flags, and we wanted to ship incrementally to the main branch rather than dropping one massive pull request when we were ready to flip the switch.

These constraints shaped three core challenges:

  1. Two online visit modes had to coexist on the same branch

  2. No-V work couldn't break any existing functionality or distract other teams

  3. Other developers had to keep shipping features for the regular visit without caring about No-V

The solution we landed on is an overlay system. It doesn't touch the existing internals much, but fundamentally changes how data is fetched, saved, and processed - all transparently. Thanks to this approach, we built the entire No-V system with near-zero production incidents and almost no disruption to other teams' workflows. They didn't have to care about how we were rebuilding the data flow, because we hadn't changed theirs. They were able to operate as usual.

In this article, I'll describe how we achieved all of this - how we delivered a working No-V application before the Super Bowl that worked like a charm, protected all the relevant data, and ran so smoothly that developers not involved in the project could barely tell anything had changed.

The Low-Hanging Fruit: API Snapshots

Before tackling the hard problem of intercepting requests, we started with an easier one: many of the API calls the application makes don't return personalized data at all. They return configuration - supported regions per condition, product catalogs, pricing structures, "how did you hear about us" choices, requirement configs. This data is the same for every user and changes infrequently.

In the old world, every single user who started an online visit would trigger these same GET requests, and the server would return the same JSON every time. Multiply that by 7,000 visit starts in one minute during the Super Bowl, and you've got tens of thousands of redundant calls to endpoints returning identical data.

The solution was straightforward: fetch this data once, commit it as static JSON files, and load them from the bundle instead of the network. We built three packages to support this:

  • A prebuild CLI tool - fetches real API responses from our staging and production environments and writes them to disk as JSON files

  • A snapshots package - stores the generated snapshot files and configuration for which endpoints to snapshot

  • Runtime loaders - functions that load snapshot data instead of making API calls (e.g., loadSupportedRegionInfoFromSnapshots(conditionId, region))

The configuration defines which endpoints to snapshot and where to store the results:

// Per-condition resources (generated for each active condition)
resources: [
  { urlPath: `/api/public/conditions/${id}/supported-regions`,
    outFilePath: `conditions/${id}/supported-regions.json` },
  { urlPath: `/api/public/conditions/${id}/offerings`,
    outFilePath: `conditions/${id}/offerings.json` },
  { urlPath: `/api/public/conditions/${id}/products`,
    outFilePath: `conditions/${id}/products.json` },
  // ... more per-condition endpoints
]
// Plus global resources
resources: [
  { urlPath: '/api/public/medication-tags',
    outFilePath: 'medication-tags.json' },
  // ...
]
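
The runtime loaders are then just thin wrappers around the bundled JSON. As a rough sketch - the @snapshots import alias and the exact shape of the snapshot file are illustrative assumptions, not our real file layout - a loader looks something like this:

// Hypothetical sketch of a runtime loader
export async function loadSupportedRegionInfoFromSnapshots(
  conditionId: string,
  region: string,
) {
  // Dynamic import lets the bundler split per-condition snapshots into chunks
  const snapshot = await import(
    `@snapshots/conditions/${conditionId}/supported-regions.json`
  );
  // Return the same shape the original API endpoint would have returned
  const match = snapshot.default.regions.find(
    (entry: { code: string }) => entry.code === region,
  );
  return { isSupported: Boolean(match), ...match };
}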

Importantly, this wasn't a No-V-only optimization. We replaced these API calls for the entire application - all conditions, not just the one we were launching with. Every user benefits from loading bundled data instead of waiting for a network round-trip.

To keep the snapshots fresh, we set up a GitHub Actions workflow that runs daily on a cron schedule. It calls every configured endpoint and regenerates the snapshot files. If git diff detects any changes, the workflow commits the updated files to a new branch and opens a PR. Because these configurations change so infrequently, we rarely have to push application updates for snapshot changes.

Even before we built out any other architecture, this change alone eliminated around 7-10 API calls per user session. At peak Super Bowl traffic, that is tens of thousands of calls per minute that never need to hit the server at all.

Intercepting What's Left

API snapshots handled the static, configuration-like endpoints elegantly. But what about the rest? What about POST, PATCH, and DELETE requests that write data? What about endpoints that return personalized information - user profiles, questionnaire state, payment details? What about all the operations where the server doesn't just return data, but processes logic and creates records?

These couldn't be replaced with static JSON files. We needed a different approach.

When we started discussing the frontend approach, we considered several ideas.

Option 1: Redux action replay. The idea was to rebuild the entire application to rely fully on Redux actions - simplify the existing ones and type them properly - so the final payload could be composed from the action history and session resume could be done by replaying actions. But the amount of work and the unclear benefits terrified nearly every frontend developer on the project. It would have meant rewriting core parts of the application, with no guarantee the result would be simpler.

Option 2: Inline conditionals everywhere. The alternative was to go through every place the regular visit makes an API call and add a conditional: if No-V is enabled, write to in-memory storage instead of calling the API. The problem: No-V-specific logic would be scattered across dozens of components and modules, requiring deep prop drilling in some cases. This would lead to merge conflicts, spaghetti code, and - most critically - a high risk of breaking production code by touching it directly. On top of that, our application powers not just one condition but many others that weren't in scope for No-V. In a codebase this large, with shared components serving multiple conditions, reliably finding every place that needed modification would have been a nightmare.

Option 3: Service Worker with MSW. This idea was simpler in concept: add an artificial server-like layer that intercepts all requests, captures outbound data, stores it in memory, and returns fake responses in a backward-compatible shape. The standard code wouldn't even know it's running in No-V mode - exactly the way we mock API requests in tests. One option for this was MSW (Mock Service Worker). Its API is clean, and it's HTTP-client agnostic. We even built a proof of concept. But there was a nagging problem: service workers have their own lifecycle, managed by the browser independently from the application. The worker script needs to be registered, it can be cached separately from the app bundle, and after a new deployment the old worker stays active until all tabs are closed. Even with mitigations like skipWaiting(), there's a window where the service worker shim and the application code can be out of sync. In the end, the risk of stale code wasn't worth pushing this concept forward.

Option 4: Custom Axios Adapter. This was a variation of the service worker idea: intercept all requests and handle them with custom logic, but at the Axios level instead of the service worker level. Since our entire application already uses Axios for HTTP requests, we could replace its adapter - the component responsible for actually sending requests over the network - with our own. Our custom adapter would match incoming requests against registered handlers, and each handler could decide whether to fully handle the request on the client side, pass it through to the real server, or do a hybrid of both. This is what we built.

Why We Chose the Custom Axios Adapter

The custom Axios adapter gave us the same interception model as the service worker approach - catch requests, handle them locally, return fake responses - but without the caching risk. All our API calls already go through a shared Axios instance, which is an integrated part of the application, built and deployed together with the rest of the code. There's no separate worker with its own lifecycle that the browser might cache. When we deploy a new version, the adapter code updates along with everything else.

For readers who aren't deep in the JavaScript ecosystem: Axios is one of the most popular HTTP client libraries for JavaScript. When your frontend code needs to talk to a server, you typically use Axios to make requests like GET /api/users or POST /api/orders.

Under the hood, Axios has a concept called an adapter. The adapter is the lowest-level component that actually sends the request - whether that's using the browser's XMLHttpRequest, the fetch API, or Node.js's http module. Axios doesn't care which one it uses; the adapter abstracts that away.

Here's the key insight: you can replace the default adapter with your own. When you do, every single HTTP request that goes through Axios passes through your code first. You get full control over what happens - you can inspect the request, return a fake response, modify it, or let it pass through to the real network. The application code making the request has no idea anything has changed.
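
To make this concrete, here's roughly what installing a custom adapter looks like on a shared Axios instance. The module paths and the feature-flag helper below are illustrative, not our actual code:

import axios from 'axios';
import { interceptAdapter } from './intercept-adapter';
import { isFeatureEnabled } from './feature-flags';

// The shared instance the rest of the application already uses for every call
export const apiClient = axios.create({ baseURL: '/api' });

// With the flag on, every request made through apiClient runs through our
// adapter instead of the browser's XHR/fetch transport; callers are unchanged.
if (isFeatureEnabled('no-v')) {
  apiClient.defaults.adapter = interceptAdapter;
}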

We built a custom Axios adapter package with an MSW-inspired API for registering request handlers. The core idea is simple: you register handlers for specific HTTP methods and URL patterns, and when a matching request comes in, your handler runs instead of the real network call.

The API looks like this:

import { http, HttpResponse } from './intercept-adapter';
// Register a handler for creating a user
http.post('/api/users', (params) => {
  return HttpResponse.json({ id: 'generated-uuid', status: 'draft' });
});
// Register a handler with URL parameters
http.get('/api/users/:userId', (params) => {
  const { userId } = params.params;
  return HttpResponse.json({ id: userId, firstName: 'John' });
});

If you've ever used MSW for testing, this syntax will feel familiar - that's intentional. We wanted the developer experience of writing handlers to be approachable and well-understood.

When a request comes in, the adapter extracts the HTTP method and full URL, then searches the registry for a matching handler. URL patterns support parameterized segments (:userId, :questionnaireId, etc.), which get converted to regex patterns and cached for performance. When a match is found, the adapter calls the handler function and converts whatever it returns into an Axios response that the calling code can't distinguish from a real server response.
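
As a rough illustration of that matching step - simplified, and not lifted from our actual adapter package - a parameterized pattern can be compiled into a regex with named capture groups and cached by pattern string:

// Hypothetical sketch: ':param' segments become named capture groups
const patternCache = new Map<string, RegExp>();

function matchUrl(pattern: string, url: string): Record<string, string> | null {
  let regex = patternCache.get(pattern);
  if (!regex) {
    const source = pattern.replace(/:([A-Za-z0-9_]+)/g, '(?<$1>[^/]+)');
    regex = new RegExp(`^${source}$`);
    patternCache.set(pattern, regex);
  }
  const match = url.match(regex);
  // matchUrl('/api/users/:userId', '/api/users/abc-123') -> { userId: 'abc-123' }
  return match ? { ...match.groups } : null;
}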

Every handler receives a params object with three things:

  • axiosConfig - the full Axios configuration object (URL, method, headers, data, everything). Useful when you need to pass the request through to the real server.

  • request - a normalized request object with method, url, headers, and data. The data is automatically converted from snake_case to camelCase, so handlers work with the same conventions as the rest of our frontend code.

  • params - the extracted URL parameters, inspired by how Remix handles route params. For a pattern like /api/users/:userId, if the request URL is /api/users/abc-123, then params.userId will be "abc-123".

The First Handler: Creating a User

Let’s walk through what happens when a user starts the online visit in No-V mode, starting with the very first API call.

In standard mode, when a user submits their first answer, the application sends a POST request to create a draft account on the server. In No-V, this request gets intercepted by our handler:

export async function createUserHandler(params: HandlerParams) {
  const payload = params.request.data;
  validateUserData(payload, params);
  // Store metadata
  setData('visitType', payload.visitType);
  // Store user data in account storage
  const { visitType, ...accountData } = payload;
  setAccountData(accountData);
  // Generate a UUID and fake token data locally
  const visitorId = crypto.randomUUID();
  const tokenData = createTokenData();
  const userData = {
    id: visitorId,
    status: 'draft',
    firstName: '',
    lastName: '',
    email: '',
    dateOfBirth: null,
    // ... all other fields with default values
    tokenData,
  };
  setAuthData({ tokenData, userData });
  return HttpResponse.json(userData);
}

Instead of calling the backend, the handler generates a UUID locally using crypto.randomUUID(), creates fake token data, composes a response with the exact same shape as what the backend would return, stores all submitted information in memory, and returns it. The component that made the API call receives a response immediately - a response identical in shape to what the real server would have sent. All the downstream logic that processes the response (storing the token, setting up the user session, rendering the next question) runs exactly as it would in standard mode. We haven't touched any of that code. As far as the application is concerned, a draft user was just created on the server. It has no idea everything happened locally.

In-Memory Storage

At this point, I should explain what "storing in memory" actually means.

The storage is a plain JavaScript object - a key-value store scoped to a module:

const storage: Storage = {};
export function getData<T>(key: string): T | undefined {
  return storage[key] as T | undefined;
}
export function setData<T>(key: string, data: T): void {
  storage[key] = data;
}

That's it. No database, no localStorage, no IndexedDB. Just a JavaScript object that lives in the application's memory for the duration of the session.

Why not use React context or Redux? Because the purpose of this data is fundamentally different from UI state. When we save a questionnaire answer to storage, we don't want to trigger a re-render anywhere. The purpose of storing this data isn't to display it on the page - it's to accumulate a payload that will be submitted at the end of the visit. React state and Redux are designed to drive what users see on the screen, and coupling our data collection to the rendering lifecycle would create unnecessary complexity and performance overhead. Scoping the storage object to the ES module that defines it also means it's naturally isolated from the rest of the application, which is a useful property when handling sensitive health data.

Collecting Answers

Once the draft account is "created" (in memory), the visit starts collecting answers. In standard mode, every answer triggers a POST request to save it to the database. In No-V, our handler catches it:


export async function addAnswersHandler(params: HandlerParams) {
  const payload = params.request.data;
  validateAddAnswersPayload(payload, params);
  // Append this answer to the list of all answers collected so far
  const existingPayloads = getData<AnswerPayload[]>('questionnaire') || [];
  setData('questionnaire', [...existingPayloads, payload]);
  // Store state snapshot for session resume
  if (payload.stateSnapshot) {
    setData('stateSnapshot', payload.stateSnapshot);
  }
  return HttpResponse.json({});
}

Each answer is appended to an array in memory. When a user clicks the back button, standard mode sends a rollback request. Our handler simply pops the last answer off the array:

export async function rollbackAnswersHandler() {
  const existingPayloads = getData<AnswerPayload[]>('questionnaire') || [];
  if (existingPayloads.length > 0) {
    existingPayloads.pop();
    setData('questionnaire', existingPayloads);
    // Restore state from previous answer, or clear if none remain
    if (existingPayloads.length === 0) {
      setData('stateSnapshot', undefined);
    } else {
      const lastPayload = existingPayloads[existingPayloads.length - 1];
      setData('stateSnapshot', lastPayload?.stateSnapshot);
    }
  }
  return HttpResponse.json(null, { status: 204 });
}

Updating User Data

Next, several components save user data during the flow. When a user provides their date of birth, shipping information, or other personal details, the application sends PATCH requests from many different places in the codebase. In No-V, all of these are caught by a single handler that merges the new data into the existing account object in memory (see the sketch below). If any logic triggers a refetch of user data from the server, another handler takes over and returns whatever has been collected so far.
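
A simplified sketch of that merge handler - getAccountData and http.patch are assumed counterparts of the setters and registration helpers shown earlier, not code copied from our repo:

export async function updateUserHandler(params: HandlerParams) {
  const updates = params.request.data;
  // Merge the PATCH body into whatever account data has been collected so far
  const account = { ...getAccountData(), ...updates };
  setAccountData(account);
  // Respond with the merged object, shaped like the real server response
  return HttpResponse.json(account);
}
// One registration covers PATCH calls made from many different components
http.patch('/api/users/:userId', updateUserHandler);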

This is one of the beauties of the handler approach: dozens of components across the application all make the same type of API call, and we handle all of them in one place.

Handlers That Talk to the Server

Most API requests in No-V are mocked - intercepted, processed locally, fake response returned. But some handlers have more nuanced logic. Not everything can or should be handled entirely on the client side.

Conditional Pass-Through

When a user is authenticated in our application, we may need to pull back additional data from their existing account, even if they are in an “onboarding” flow. We handle these cases with conditional logic that makes a real API call when the user is authenticated and returns an empty object or a static default configuration otherwise. Here’s what this looks like in some of our region eligibility logic:


async function validateRegionSupport(payload, params) {
  if (userHasRealAuthToken()) {
    // Authenticated user: call the real API
    const response = await defaultAdapter({
      ...params.axiosConfig,
      url: `/api/users/verify-region-change`,
      method: 'POST',
      data: JSON.stringify({ region: payload.region }),
    });
    if (!response.data.regionChangePossible) {
      throw new HttpError("Sorry, we're not in your area yet.", 400);
    }
    return;
  }
  // New user: check locally against region support snapshots
  const regionInfo = await loadSupportedRegionInfoFromSnapshots(
    payload.conditionId,
    payload.region
  );
  if (!regionInfo.isSupported) {
    throw new HttpError("Sorry, we're not in your area yet.", 400);
  }
}

Pure Pass-Through

There are also requests we always want to forward to the server. For example, allergen search and medication search - we didn't want to replicate this search logic on the client side. If these requests fail, the user isn't blocked; they can still enter information manually without autocomplete. For these, we use a simple passThrough function:

// Always pass through to real API
http.get('/api/public/search/allergens', passThrough);
http.get('/api/public/search/medications', passThrough);
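
The passThrough function itself is tiny - a minimal sketch, assuming defaultAdapter is the original Axios adapter that our custom adapter wraps (the same defaultAdapter that appears in the other handlers in this article):

// Forward the request to the real network, untouched
export async function passThrough(params: HandlerParams) {
  return defaultAdapter(params.axiosConfig);
}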

Progressive Enhancement: Best-Effort Server Calls

Some operations simply cannot be faked. Phone number verification is the obvious example - you need a real backend service to send an SMS code to a real human’s phone and validate the response. We can't mock that in the browser.

We call these "best effort" handlers. They make real API calls when possible, but none of them block the user from completing the online visit if the call fails. The flow is designed so that a user can always finish, even if one of these services is down - because we know we have alternative means for the patient to complete that specific step (such as verifying a phone number) later on. 

export async function smsVerificationHandler(params: HandlerParams) {
  try {
    return await defaultAdapter(params.axiosConfig);
  } catch (error) {
    // On 400/404/422 - rethrow (validation errors should surface)
    if ([400, 404, 422].includes(error.status)) {
      throw error;
    }
    // On all other errors - return fallback "doesn't exist"
    return HttpResponse.json({
      data: { /* ... hardcoded mock SMS provider response ... */ },
    });
  }
}

Strict Mode: Catching What You Don't Know You're Missing

Our custom adapter has a strictMode option. When enabled, it throws an error if any API request is made that doesn't have a matching handler. We made the decision early on to run with strict mode on, and it turned out to be one of the most valuable aspects of the entire approach. In the alternative world where we modify every place that makes an API call, we would inevitably miss some - especially calls that only fire under specific conditions, edge cases we might not even think to test, or new code added while we were working on this project in parallel. With the interceptor approach and strict mode, every unhandled request immediately throws an error. There's no silent leaking of requests to the server.
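
As a rough sketch of the dispatch step - findHandler, options, and defaultAdapter stand in for the adapter package's internals, which look a bit different in practice:

import type { AxiosRequestConfig, AxiosResponse } from 'axios';

async function dispatch(config: AxiosRequestConfig): Promise<AxiosResponse> {
  const handler = findHandler(config.method, config.url);
  if (!handler) {
    if (options.strictMode) {
      // Fail loudly: an unhandled request is a call path we didn't know about
      throw new Error(`Unhandled request in No-V mode: ${config.method} ${config.url}`);
    }
    // Non-strict mode would silently let the request reach the real server
    return defaultAdapter(config);
  }
  return handler(config);
}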

When we launched No-V for the first time in a real environment, strict mode immediately told us which endpoints were still missing handlers. We could see clearly: "this URL pattern has no handler." We'd discuss the ideal outcome - whether it should be a static configuration, a real live call, or not a required call at all - then add the new handler if needed. Without strict mode, those requests could have silently gone to the server and caused unexpected load on an area not intended to be on the critical onboarding path at Super Bowl-level scale.

The Final Payload

At the end of the visit, everything the user entered - dozens of answers, personal information, payment details - is sitting safely in a JavaScript object in the browser. The composeFinalPayload() function extracts only the fields the backend worker needs:

export function composeFinalPayload(): FinalPayload {
  const finalPayload: FinalPayload = {};
  if (storage.account !== undefined) finalPayload.account = storage.account;
  if (storage.visitType !== undefined) finalPayload.visitType = storage.visitType;
  // Strip the state snapshot from each answer to reduce payload size
  if (Array.isArray(storage.questionnaire)) {
    finalPayload.questionnaire = storage.questionnaire.map((answer) => {
      const { stateSnapshot, ...answerWithoutSnapshot } = answer;
      return answerWithoutSnapshot;
    });
  }
  // ... other collected data
  return finalPayload;
}

This payload is then submitted using a multi-region submission strategy with hedging. The first request fires immediately, and if it hasn't responded within 500ms, a second request fires to another endpoint - with up to 3 retries per endpoint. This isn't about minimizing calls anymore - this is the one call that matters, and it has to succeed.

const { allSettled } = await resilientFetch(
  endpoints.map((baseURL) => ({
    path: '/api/intake/submit',
    baseURL,
  })),
  {
    hedgeDelay: 500,
    maxRetries: 3,
    axiosConfig: {
      method: 'POST',
      data: payload,
      withCredentials: true,
    },
  },
);
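
resilientFetch comes from our shared tooling, but the hedging idea itself is small enough to sketch. Roughly - and this is an illustration of the concept, not the actual implementation, with per-endpoint retries omitted - it works like this: fire the first endpoint, give it a head start of hedgeDelay milliseconds, fire the next endpoint if nothing has succeeded yet, and let the first fulfilled request win.

// Hypothetical sketch of hedged submission
async function hedgedSubmit(
  attempts: Array<() => Promise<unknown>>,
  hedgeDelay: number,
): Promise<unknown> {
  const inFlight: Promise<unknown>[] = [];
  for (const fire of attempts) {
    inFlight.push(fire());
    // Wait up to hedgeDelay for any attempt to fulfill before hedging further
    const fulfilled = await Promise.race([
      Promise.any(inFlight).then(() => true, () => false),
      new Promise<boolean>((resolve) => setTimeout(() => resolve(false), hedgeDelay)),
    ]);
    if (fulfilled) break;
  }
  // Resolve with the first request that succeeds; reject only if all of them fail
  return Promise.any(inFlight);
}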

The entire No-V session - which in standard mode would have made 50+ API calls over the course of 5-10 minutes - collapses into one payload submission at the end.

Would We Do It Again?

At our post-Super-Bowl retrospective, someone asked: if you could start over, would you choose the same approach?

Our answer: definitely yes.

Usually when you start a project, your initial idea seems brilliant, but the more you work with it, the more you hate it. The handler approach was the opposite experience. Multiple times during development and testing, we were genuinely happy we'd made this choice.

Flows we hadn't explicitly covered would just work, because we already had handlers for the underlying endpoints with proper logic. The application doesn't know that it talks to an interceptor rather than a server - so all its built-in behavior, error handling, and edge cases work exactly as designed. And whenever we needed something different for No-V - different data, additional parameters, different behavior - it was far easier to modify a handler than to scan through every component that makes a specific API call and drill new props down to it.

Is the custom Axios adapter with request handlers the ultimate architecture to solve this problem? Probably not - but it allowed us to achieve our goals and meet our deadline without creating risk or cruft in the system. Over time, we plan to evolve this into the core architecture for all of our onboarding online visits, not just our entrypoint to the Body Program. The handler pattern sets us up for an incremental migration - we can have components read and write directly to storage and remove handlers one by one as their corresponding components are updated, until the adapter is empty and can be deleted.

But for now? The overlay approach was exactly the right call. It let us ship a system that handled Super Bowl-scale traffic, kept the existing application untouched, and let other teams keep shipping without knowing or caring that No-V existed underneath.