Cutting the initial bundle by 25%, without lazy-loading everything

A measured approach to bundle auditing — webpack-bundle-analyzer and source-map-explorer, where the bytes actually hide, route- vs component-level splitting, and the over-splitting trap.

The initial JavaScript bundle on the Mail.ru Cloud file manager had grown the way every bundle grows: one reasonable import at a time, none of them obviously wrong. By the time I profiled it, the app shell shipped 412 KB of compressed JS — brotli, the bytes a real user actually downloads — before a single file appeared on screen. I got it to 306 KB, a hair over 25% off. Almost none of that came from React.lazy. It came from three measured changes I’ll break down at the end, and from one rule I held to the whole way through: don’t touch anything you haven’t profiled first.

Measure before you cut

Profiling comes first. Every bundle-size story that opens with “I added React.lazy everywhere” ends with a slower app and a developer who can’t explain why. Before changing a line, get a picture of where the bytes are.

Two tools, two questions. webpack-bundle-analyzer tells you what is in each chunk and how big — it reads the stats output and draws the treemap where a 90 KB date library is impossible to miss. source-map-explorer answers a sharper question: what actually shipped. It works off the production source maps, after minification and tree-shaking, so it counts the bytes that survived rather than the ones you imported.

# treemap from the real production build, not a dev bundle
npx webpack --config webpack.prod.js --profile --json > stats.json
npx webpack-bundle-analyzer stats.json -m static -r report.html

# what actually shipped, per module, from the emitted source maps
npx source-map-explorer dist/assets/*.js

Both tools can report raw and gzip sizes (source-map-explorer needs --gzip; the analyzer has to find the emitted files, so pass the output directory as a second argument if it only shows stat sizes). Your CDN almost certainly serves brotli, which is usually smaller again, so treat the analyzer’s number as a ranking signal for which modules are fat, and take the real baseline from the Network panel on a production build. On the file manager that baseline was 412 KB brotli, and four items owned most of it: a PDF rendering stack, an in-browser image editor, a date library carrying every locale, and a polyfill that showed up twice.
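
If the Network panel is not handy, the same comparison can be approximated locally. The following is a minimal sketch, assuming Node.js and its built-in zlib module; the function names and the file path in the usage comment are illustrative, and the CDN may compress at a different brotli quality than the maximum used here.

```typescript
import { readFileSync } from 'node:fs';
import { brotliCompressSync, constants, gzipSync } from 'node:zlib';

interface CompressedSizes {
  raw: number;    // bytes on disk
  gzip: number;   // bytes after gzip (Node's default level)
  brotli: number; // bytes after brotli at maximum quality
}

// Compress a buffer both ways and report the byte counts.
export function compressedSizes(source: Buffer): CompressedSizes {
  const brotli = brotliCompressSync(source, {
    // Assumption: maximum quality; a real CDN may use a lower setting.
    params: { [constants.BROTLI_PARAM_QUALITY]: constants.BROTLI_MAX_QUALITY },
  });
  return {
    raw: source.length,
    gzip: gzipSync(source).length,
    brotli: brotli.length,
  };
}

// Convenience wrapper for one emitted file.
export function fileSizes(path: string): CompressedSizes {
  return compressedSizes(readFileSync(path));
}

// Hypothetical usage; the path is an assumption about the build output:
// console.log(fileSizes('dist/assets/main.js'));
```

This only ranks files against each other; the number that counts is still the one the production CDN actually serves.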

Where the bytes actually hide

The treemap almost always tells three kinds of story, and the file manager had one of each.

Duplicated dependencies. Two copies of core-js rode in through different transitive ranges: one injected by the Babel preset, another pinned by a UI-kit dependency that had never bumped its range. You ship the polyfills twice and run them once. Forcing a single version with an overrides entry in package.json needed no application code change and gave back 28 KB.
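
Duplicates like this can also be spotted without squinting at the treemap. The sketch below is one possible approach, not the analyzer's own logic: it assumes you have already pulled a flat list of module paths out of stats.json, and it flags any package that resolves from more than one node_modules location. The function names and the `ui-kit` package in the usage note are hypothetical.

```typescript
interface PackageRef {
  name: string; // e.g. "core-js" or "@scope/pkg"
  root: string; // install location, e.g. "./node_modules/ui-kit/node_modules/core-js"
}

// Read the package name from the LAST node_modules segment of a module path.
function packageOf(modulePath: string): PackageRef | null {
  const marker = 'node_modules/';
  const normalized = modulePath.replace(/\\/g, '/');
  const idx = normalized.lastIndexOf(marker);
  if (idx === -1) return null; // application code, not a dependency
  const parts = normalized.slice(idx + marker.length).split('/');
  const name = parts[0].startsWith('@') ? `${parts[0]}/${parts[1]}` : parts[0];
  return { name, root: normalized.slice(0, idx + marker.length) + name };
}

// Map each package name to its install locations, keeping only names with 2+.
export function findDuplicatePackages(modulePaths: string[]): Map<string, string[]> {
  const roots = new Map<string, Set<string>>();
  modulePaths.forEach((path) => {
    const pkg = packageOf(path);
    if (!pkg) return;
    const seen = roots.get(pkg.name) ?? new Set<string>();
    seen.add(pkg.root);
    roots.set(pkg.name, seen);
  });
  const duplicates = new Map<string, string[]>();
  roots.forEach((locations, name) => {
    if (locations.size > 1) duplicates.set(name, Array.from(locations).sort());
  });
  return duplicates;
}
```

Fed the paths `./node_modules/core-js/...` and `./node_modules/ui-kit/node_modules/core-js/...`, it reports core-js with both install locations, which is the cue to add the override.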

One heavy library doing a small job. That date library carried every locale it ships with for exactly one feature: the “modified 2 minutes ago” labels in file lists. Swapping it for date-fns plus the platform Intl.RelativeTimeFormat covered the same feature and took 47 KB off the entry. The biggest single wins tend to live here, and they’re usually a replacement or a narrower import — not a split.
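
To make the replacement concrete, here is a minimal sketch of a “modified 2 minutes ago” label built on Intl.RelativeTimeFormat alone. It is an illustration rather than the code that shipped: the function name, the unit table, and the choice of `numeric: 'auto'` are assumptions, and it leaves out the date-fns half of the swap.

```typescript
// Largest unit first; the first one that fits wins.
const UNITS: Array<[Intl.RelativeTimeFormatUnit, number]> = [
  ['day', 24 * 60 * 60 * 1000],
  ['hour', 60 * 60 * 1000],
  ['minute', 60 * 1000],
  ['second', 1000],
];

// Format how long ago (or how far ahead) modifiedAt is relative to now.
export function formatModified(modifiedAt: Date, now: Date, locale = 'en'): string {
  const rtf = new Intl.RelativeTimeFormat(locale, { numeric: 'auto' });
  const diffMs = modifiedAt.getTime() - now.getTime(); // negative means the past
  for (const [unit, unitMs] of UNITS) {
    if (Math.abs(diffMs) >= unitMs) {
      return rtf.format(Math.trunc(diffMs / unitMs), unit);
    }
  }
  return rtf.format(0, 'second'); // less than a second apart
}
```

The locale data lives in the browser, so none of it counts against the bundle.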

Modules that load eagerly but run rarely. The PDF viewer and the image editor were both top-level imports in the shell, yet well under a tenth of sessions ever opened either one. They had no business sitting in the bytes that block first paint.

Split by route first, by component second

Code splitting has two natural seams, and they are not equal.

Route-level splitting is the high-leverage one. Each top-level route becomes its own chunk, loaded when the user navigates to it, so the entry bundle carries only the shell plus the landing route. With a router this is mechanical:

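// route-level: the settings area and its deps never load until you go there
// (imports assume React Router v6; RouteSkeleton is the app's own placeholder component)
import { lazy, Suspense } from 'react';
import { Route, Routes } from 'react-router-dom';

// React.lazy expects each module to have a default export
const Settings = lazy(() => import('./routes/Settings'));
const PdfViewer = lazy(() => import('./routes/PdfViewer'));

export function AppRoutes() {
  return (
    <Suspense fallback={<RouteSkeleton />}>
      <Routes>
        <Route path="/settings/*" element={<Settings />} />
        <Route path="/file/:id/pdf" element={<PdfViewer />} />
      </Routes>
    </Suspense>
  );
}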

Component-level splitting is the scalpel — for a heavy thing inside an otherwise light route: a modal, a rich editor, a chart below the fold. The rule I hold to is that a component earns a split only when it is both heavy and off the critical path. A 120 KB editor that opens behind a button, yes. A 4 KB dropdown, never — the chunk overhead and the loading flicker cost more than you save.

React.lazy plus Suspense covers the React tree; for non-component code — a parser, a formatter, a worker payload — a bare dynamic import() does the same job and resolves to the module:

// defer the heavy validator until the user actually submits
async function validateUpload(file: File) {
  const { scan } = await import('./heavy/scan');
  return scan(file);
}

Every lazy boundary needs a real fallback and an error boundary around it. A chunk can fail to load on a flaky connection, and a split that white-screens on a network blip is worse than the bytes it saved.
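
An error boundary handles the failure once it has happened; a small retry wrapper can keep some failures from reaching it at all. The helper below is a hedged sketch, not part of React or webpack: `importWithRetry`, its defaults, and the linear backoff are all assumptions, and it only helps if the bundler's runtime actually re-requests a chunk that failed to load.

```typescript
// Retry a dynamic import a few times before giving up.
export async function importWithRetry<T>(
  load: () => Promise<T>,
  retries = 2,   // extra attempts after the first failure (assumed default)
  delayMs = 500, // base delay, multiplied by the attempt number (assumed default)
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      return await load();
    } catch (error) {
      lastError = error;
      if (attempt < retries) {
        // Wait a little longer after each failed attempt.
        await new Promise<void>((resolve) => setTimeout(resolve, delayMs * (attempt + 1)));
      }
    }
  }
  // Out of attempts: rethrow so the surrounding error boundary can take over.
  throw lastError;
}

// Hypothetical usage with a lazy route:
// const PdfViewer = lazy(() => importWithRetry(() => import('./routes/PdfViewer')));
```

The error boundary stays in place either way; the wrapper only reduces how often a user sees it.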

Tune splitChunks so caches survive

Import-level splitting decides what loads when. splitChunks decides how those bytes group into files, and good grouping is mostly about cache lifetime. Application code changes every deploy; third-party code changes monthly. Mix them in one file and every deploy busts the vendor cache for no reason.

// webpack.prod.js: separate vendor from app, isolate the big stable libraries
optimization: {
  runtimeChunk: 'single',           // keep the webpack runtime out of every chunk
  splitChunks: {
    chunks: 'all',
    cacheGroups: {
      // big, independently-versioned libs get their own long-lived file
      react: { test: /[\\/]node_modules[\\/](react|react-dom|scheduler)[\\/]/, name: 'react', priority: 20 },
      // everything else from node_modules that the initial load needs; 'initial' stops
      // lazy-only libraries (the PDF stack, the editor) from being pulled into this eager file
      vendor: { test: /[\\/]node_modules[\\/]/, name: 'vendor', chunks: 'initial', priority: 10 },
      // code imported by 2+ async chunks, hoisted so it loads once
      shared: { minChunks: 2, name: 'shared', chunks: 'async', priority: 5, reuseExistingChunk: true },
    },
  },
},

The react group is deliberate: it almost never changes, so it gets its own file with a stable hash that survives most deploys. The shared group catches utilities two lazy routes both pull in, so they download once instead of riding along in both chunks. Content-hashed filenames make the whole thing safe to cache for a year.
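
For reference, content hashing is an output setting rather than a splitChunks one. The fragment below is a sketch to merge into an existing production config, not a complete file; the `assets/` prefix is an assumption based on the paths used earlier, and the year-long lifetime itself comes from the server's Cache-Control header, which webpack does not set.

```typescript
// Fragment only: merge these keys into the existing production config.
const cachingConfig = {
  output: {
    // [contenthash] changes only when the bytes of that file change
    filename: 'assets/[name].[contenthash].js',
    chunkFilename: 'assets/[name].[contenthash].js',
  },
  optimization: {
    // Already the default in webpack 5 production mode; spelled out because
    // stable module ids are what keep untouched chunks' hashes from shifting.
    moduleIds: 'deterministic',
  },
};

export default cachingConfig;
```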

The over-splitting trap

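Here is the failure mode nobody warns you about until you cause it. Split too aggressively, with lazy boundaries nested inside lazy boundaries, and opening one route fires a waterfall of dependent requests: the route chunk loads and runs, only then does a lazy component inside it call import() for its own chunk, and that chunk in turn asks for the next. (Chunks that splitChunks carves out of a single import() are a different case: the webpack runtime knows the full list up front and requests them in parallel.) Each step in the chain is a round trip that can’t begin until the previous one has finished.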

It’s tempting to blame connection overhead, but that diagnosis is a decade out of date. On HTTP/2 and HTTP/3 the browser reuses a single multiplexed connection, so setup is not the cost. The cost is discovery order: a chunk deep in the import graph can’t be requested until its parent has downloaded and been parsed, so a deep chunk tree becomes a deep, serial waterfall no matter how many requests the connection carries in parallel. You’ve traded one 200 KB download for six chained 30 KB ones, and on a mid-range phone the route is slower than before you “optimized” it.
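
The arithmetic behind that trade can be sketched in a few lines. This is a deliberately crude model under stated assumptions: every request costs one round trip plus transfer time, parse and execute time are ignored, and the 150 ms round trip and 200 KB/s throughput are invented numbers for a slow mobile link, not measurements.

```typescript
interface Link {
  rttMs: number;       // one network round trip
  kbPerSecond: number; // sustained throughput
}

// One request: a round trip plus the time to move the bytes.
function fetchMs(sizeKb: number, link: Link): number {
  return link.rttMs + (sizeKb * 1000) / link.kbPerSecond;
}

// Chained chunks: each request starts only after the previous chunk has arrived.
export function serialChainMs(sizesKb: number[], link: Link): number {
  return sizesKb.reduce((total, kb) => total + fetchMs(kb, link), 0);
}

// Chunks requested together on one multiplexed connection: a single round trip,
// with the bandwidth shared across all of them.
export function parallelMs(sizesKb: number[], link: Link): number {
  const totalKb = sizesKb.reduce((sum, kb) => sum + kb, 0);
  return fetchMs(totalKb, link);
}

const slowPhone: Link = { rttMs: 150, kbPerSecond: 200 }; // assumed values
const sixChunks = [30, 30, 30, 30, 30, 30];

console.log(serialChainMs([200], slowPhone));     // 1150: one 200 KB download
console.log(serialChainMs(sixChunks, slowPhone)); // 1800: six chained 30 KB downloads
console.log(parallelMs(sixChunks, slowPhone));    // 1050: the same six, requested together
```

In this model the six chunks are only a problem when they are chained; requested together they beat the single large file, which is why the depth of the import graph matters more than the number of files.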

Did it actually get faster?

Bundle size is a proxy. The thing you are actually moving is how fast the page becomes usable, and that lives in Core Web Vitals and the loading milestones around them.

Measure them the way you measured the bytes: a production build, a throttled mid-range device profile, three runs, and the median of the three. Use lab numbers from Lighthouse for the before/after delta, then field data from CrUX or your own RUM to confirm the gain held for real users. A 25% smaller entry bundle that doesn’t move FCP on a real device didn’t help anyone.
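
The median-of-three comparison is simple enough to script. A minimal sketch follows, with hypothetical function names; the FCP values in the usage comment are made up to show the shape of the output and are not results from the file manager.

```typescript
// Median of a list of timings; the input array is left untouched.
export function median(values: number[]): number {
  if (values.length === 0) throw new Error('median of an empty list');
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Compare two sets of runs by their medians.
export function compareRuns(beforeMs: number[], afterMs: number[]) {
  const before = median(beforeMs);
  const after = median(afterMs);
  const deltaMs = after - before; // negative means faster
  return { before, after, deltaMs, deltaPct: (deltaMs * 100) / before };
}

// Illustrative only, with invented FCP timings in milliseconds:
// compareRuns([2100, 1900, 2000], [1500, 1700, 1600])
// yields before 2000, after 1600, deltaMs -400, deltaPct -20
```

The median drops the one run that got unlucky with a background task, which the mean would not.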

What I check before I call a bundle “audited”

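Before I trust a bundle result, I run down this list:

- The baseline is brotli bytes from the Network panel on a production build, not the analyzer’s number.
- source-map-explorer has been run against the emitted source maps, so I know what actually shipped.
- No dependency appears twice in the treemap.
- No heavy library is in the entry bundle for the sake of one small feature.
- Nothing that most sessions never open is imported eagerly by the shell.
- Every lazy boundary has a real fallback and an error boundary around it.
- Vendor and application code live in separate, content-hashed files.
- Opening a route does not trigger a chain of dependent chunk requests.
- FCP moved on a throttled mid-range device, and field data confirms it.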

So, the numbers. 412 KB brotli down to 306 on the Mail.ru Cloud entry — about a quarter. The deduplication gave back 28 KB, the date-library swap another 47, and moving the PDF viewer and the image editor into route chunks took 31 off the path to first paint. No single lazy() carried this — it was three measured changes, each checked against the baseline before and after. And I didn’t trust any of them until FCP moved on a throttled mid-range phone, because a smaller number in the analyzer that a real user can’t feel is just a smaller number.