Bundle Splitting with Webpacker

At Logikcull, we run a SPA on Rails with Webpacker as the glue that binds our frontend and backend together. When it comes to bundling our frontend assets, our approach has long been to ship a single, large JavaScript file. This keeps the HTTP request count low and, when the file is cached long-term, users only download the code once. That works well when assets change slowly, over days or weeks. We now deploy many times a day, though, so users almost always have to download a new asset bundle when they visit the site! But what if I told you that with a few lines of webpack(er) config, we were able to cut the typical amount of JavaScript downloaded after each release by 50%? Interested? Read on!

Code Splitting and Bundle Splitting

Our journey starts with one of Webpack’s coolest features: code splitting. There are a number of different ways to slice up your JavaScript code, and Webpack has some fantastic documentation outlining each. In this article, we’ll focus on “Bundle Splitting” (or “splitChunks” in webpack-lingo) because it lets us split our code into separately cacheable chunks based on how frequently each one changes.

Let's first look at how bundle splitting might help a traditional webapp. Most webapps have two different categories of code:

  • Vendor libraries (third-party code) -- these don’t change frequently
  • Application code (first-party code) -- these do change frequently

With a single-file asset bundle, when you change something in your application code, users must re-download all the vendor code as well. The React library, for example, didn't change and likely won't for months, so why make users download it again? That's where bundle splitting comes in. If we partition our asset files along those boundaries, we can keep our vendor code cached and, for most releases, ship only a new version of our application code:

  • Vendor.js (all our third-party library code in `node_modules/`; for us, about 1MB gzipped)
  • Application.js (all our code in `src/`; for us, about 1MB gzipped)
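
To make the potential saving concrete, here is a back-of-the-envelope sketch using the approximate sizes above. The helper name and the release scenario are illustrative assumptions, not measurements from our app.

```javascript
// Rough gzipped sizes from the list above, in MB. Real numbers will vary.
const VENDOR_MB = 1;
const APPLICATION_MB = 1;

// Hypothetical helper: MB a returning user downloads after one release.
// `vendorChanged` says whether any third-party library changed in that release.
function downloadAfterRelease({ split, vendorChanged }) {
  if (!split) {
    // Single bundle: any change invalidates the whole file.
    return VENDOR_MB + APPLICATION_MB;
  }
  // Split bundles: application code is re-downloaded on every release,
  // vendor code only when a library actually changed.
  return APPLICATION_MB + (vendorChanged ? VENDOR_MB : 0);
}

const singleMb = downloadAfterRelease({ split: false, vendorChanged: false });
const splitMb = downloadAfterRelease({ split: true, vendorChanged: false });

console.log(`single bundle: ${singleMb}MB, split bundles: ${splitMb}MB`);
console.log(`saving: ${Math.round((1 - splitMb / singleMb) * 100)}%`);
```

On a release that also bumps a library, both approaches download the same amount, so the saving only applies to application-only releases.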

Making the magic happen

So how do we tell Webpack/Webpacker to make this happen?  First we have to grok Webpacker, which provides a layer of abstraction on top of Webpack. I hear you, fellow developer, you say, “Webpack is already confusing enough, and we’re going to add more abstraction?”  And to that I say, “You’re not wrong.”  But hang in there, we'll step through it piece by piece so you know exactly what is going on.

To control bundle splitting, Webpack offers the SplitChunksPlugin. The documentation for that plugin over at the Webpack docs site is pretty good, and I encourage you to double-check for yourself all the settings that I propose in this article. The configuration we need belongs in the `optimization` section of the Webpack configuration object. But how do we do that in Webpacker?

Something that I’ve found useful when configuring Webpacker is to start by looking at the out-of-the-box configuration. Webpacker sets up a lot for you right after you run `rake webpacker:install`, but much of it hides inside the library's source code. A great place to start is package/environments/base.js in the `@rails/webpacker` npm package. In there, we find the splitChunks function:

The defaultConfig object is what we’re primarily interested in. It provides some sensible default values, but in most cases it will not chunk out your JavaScript the way the comment describes. By default, splitChunks tries to extract commonly included modules into separate bundles. In our experience, that wasn’t very reliable, so we use the `cacheGroups` property to define our own behavior. The sample below splits code under `/node_modules/` into a separate vendor.js file:

    splitChunks: {
      chunks: 'all',
      cacheGroups: {
        // Everything resolved from node_modules goes into one "vendor" chunk
        vendor: {
          test: /[\\/]node_modules[\\/]/,
          name: 'vendor',
          chunks: 'all',
        },
      },
    },

How it works:

As Webpack compiles your asset bundle, it tests every file’s path against the regular expression that you provide.  Files that pass go into a new file named `vendor`.  All other code goes into your main bundle file like normal.  Great!  We’re almost there. Now let’s figure out how to configure Webpacker to make this happen. 
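
To see the path test in isolation, here is a small sketch that runs the same regular expression against a few made-up module paths. It is a simplified model of the matching step only, not Webpack's actual chunking logic; the paths and the `chunkFor` helper are hypothetical.

```javascript
// The same pattern used in the cacheGroups `test`. [\\/] matches either a
// forward slash (macOS/Linux) or a backslash (Windows) path separator.
const VENDOR_PATTERN = /[\\/]node_modules[\\/]/;

// Hypothetical module paths, for illustration only.
const modulePaths = [
  '/app/node_modules/react/index.js',
  'C:\\app\\node_modules\\lodash\\lodash.js',
  '/app/src/components/Button.js',
  '/app/src/node_modules_helper.js',
];

// Returns the chunk a module would land in under this simplified model.
function chunkFor(path) {
  return VENDOR_PATTERN.test(path) ? 'vendor' : 'application';
}

for (const path of modulePaths) {
  console.log(`${chunkFor(path)} <- ${path}`);
}
```

The separators on both sides of `node_modules` are what keep a file like `node_modules_helper.js` in your own `src/` directory out of the vendor chunk.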

We can use the `splitChunks` function that was shown in the Webpacker source snippet. If you read through the source, you’ll see that you can pass a callback function whose return value is merged into the overall Webpack configuration object. That means you can add this configuration to your own config/webpack/environment.js file and you will be good to go:

In the sample above, I’ve listed a couple of extra properties that work well for us and I’d encourage you to read up on them in the Webpack docs.  With that, you’re 90% of the way there.  
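
For readers who want to see the wiring end to end, here is a minimal sketch of a `config/webpack/environment.js` using the callback form of `environment.splitChunks` described above. It assumes Webpacker 4, where the object returned by the callback is merged over the defaults. `vendorSplitChunks` is a name made up for this example, and the Webpacker wiring is left commented out so the snippet runs without the package installed. Check it against your own Webpacker version before relying on it.

```javascript
// config/webpack/environment.js (sketch)

// Hypothetical callback: receives Webpacker's default splitChunks config
// (unused here) and returns the options to merge on top of it.
function vendorSplitChunks(_defaultConfig) {
  return {
    optimization: {
      splitChunks: {
        chunks: 'all',
        cacheGroups: {
          // Route everything under node_modules into a "vendor" chunk.
          vendor: {
            test: /[\\/]node_modules[\\/]/,
            name: 'vendor',
            chunks: 'all',
          },
        },
      },
    },
  };
}

// Wiring, assuming the standard generated environment.js:
// const { environment } = require('@rails/webpacker');
// environment.splitChunks(vendorSplitChunks);
// module.exports = environment;

console.log(JSON.stringify(Object.keys(vendorSplitChunks({}).optimization)));
```

Keeping the callback as a named function makes it easy to inspect the returned object on its own before handing it to Webpacker.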

Almost there

Once you boot Webpack via `webpack-dev-server`, you should notice that it is now serving two JavaScript files! If you’re worried about race conditions, don’t be! Webpack ensures that all chunks are fully loaded before executing your code. You’re pretty close at this point to stamping “ShipIt” all over your config change and reaping the benefits of bundle splitting, but beware! After a deployment where you only changed application code, your vendor.js bundle will still have to be re-downloaded. What gives?

It turns out that, by default, Webpack determines the hash in your JavaScript bundle's filename (e.g. vendor-bcdfe242232.js) using an algorithm that depends on how it iterates over your JavaScript modules. In practice, this means Webpack will sometimes generate a new bundle even if none of the code in that bundle changed. The reasoning behind this default is unclear, but to fix it, you need to change how Webpack determines the module ids within your bundle. The proper fix can be rather complicated, but you can get most of the way there by setting `optimization.moduleIds: 'hashed'` in your Webpack config. For those looking for a complete fix, that might mean adding the reliable-module-ids-plugin.
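
To illustrate why iteration order matters, here is a toy model of the two id schemes. It is not Webpack's actual algorithm: the module lists are hypothetical, and the hashing (SHA-256 truncated to four characters) is chosen only to keep the example short.

```javascript
const crypto = require('crypto');

// Toy model of order-based numeric ids: each module's id is its position in
// the list, so inserting one module shifts every id after it.
function numericIds(paths) {
  return new Map(paths.map((path, index) => [path, index]));
}

// Toy model of hashed ids: each id is derived only from the module's own
// path, so other modules coming and going cannot affect it.
function hashedIds(paths) {
  return new Map(
    paths.map((path) => [
      path,
      crypto.createHash('sha256').update(path).digest('hex').slice(0, 4),
    ])
  );
}

// Hypothetical module lists before and after adding one application file.
const before = ['./src/App.js', './node_modules/react/index.js'];
const after = ['./src/App.js', './src/NewFeature.js', './node_modules/react/index.js'];

const react = './node_modules/react/index.js';
console.log('numeric:', numericIds(before).get(react), '->', numericIds(after).get(react));
console.log('hashed: ', hashedIds(before).get(react), '->', hashedIds(after).get(react));
```

In this model, adding a file under `src/` changes React's numeric id even though React itself is untouched, while its hashed id stays the same. Module ids are written into the emitted bundles, so a shifted id is enough to change a bundle's contents and therefore its filename hash.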

The good news is that once we get onto Webpack 5, this will all get substantially simpler: there’s a new `moduleIds: 'deterministic'` setting that was created to address this exact problem.

With that in mind, your final webpack splitChunks config might look something like this:

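    optimization: {
      // Derive module ids from module paths so they stay stable across builds
      moduleIds: 'hashed',
      // Emit the Webpack runtime as its own chunk
      runtimeChunk: 'single',
      splitChunks: {
        chunks: 'all',
        cacheGroups: {
          // Everything resolved from node_modules goes into one "vendor" chunk
          vendor: {
            test: /[\\/]node_modules[\\/]/,
            name: 'vendor',
            chunks: 'all',
          },
        },
      },
    },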

And that’s it!  You should be able to change code in one bundle and verify that the other bundle is served from your browser cache. Be sure to correctly configure asset caching headers too!

Wrap up

As mentioned earlier, this approach led to a 50% reduction in asset download size for returning users to our application after a release (assuming library code did not change).  Let’s recap the steps:

  1. Identify areas of your codebase that change more/less frequently than others.
  2. Use the `cacheGroups` configuration block (inside splitChunks) to instruct webpack to create separate javascript files along those boundaries.
  3. Make sure module ids are stable or your vendor code may still have to be redownloaded after a release.  You can use one of the following mechanisms to achieve this:
    - Specify `optimization.moduleIds` as `hashed`
    - Use the reliable-module-ids-plugin
  4. You’re done!
