Reproducible Platform Dev Environments With Nix Flakes And GitHub Runner Cache Keys

The problem I ran into

I built a “standard” dev environment for a team: same language runtimes, same system packages, same tooling versions. Locally it worked… until it didn’t.

The first time I tried to run the same environment on our CI (continuous integration: automated builds and tests), the runner (the ephemeral machine where jobs run) had to download everything again. That meant:

slower PRs (pull requests took longer),
higher bandwidth cost,
occasional “works on my machine” drift when caches or lockfiles were slightly off,
and a subtle failure mode: caching based on the wrong key.

The niche thing that ended up mattering for us: the cache key has to be derived from the resolved flake inputs recorded in flake.lock (not just from flake.nix); otherwise a stale cache can be restored without any warning. I’ll show the complete setup I landed on.

What I built

A reproducible dev environment using Nix Flakes (a way to declare dependencies and build outputs with every input pinned in a lock file). Then I wired GitHub Actions to use a cache whose key is derived from:

the flake.lock content,
the platform system (Linux x86_64 vs ARM, etc.), and
the exact output we build (the dev shell derivation).

That makes the runner cache deterministic and safe.

Repo layout

Here’s the layout I used:

.
├── flake.nix
├── flake.lock
└── .github
    └── workflows
        └── ci.yml

The Nix flake: pin everything and expose a dev shell

`flake.nix`

{
  description = "Platform Engineering example: reproducible dev shell with flake outputs";

  inputs = {
    # A stable nixpkgs source pinned by flake.lock
    nixpkgs.url = "github:NixOS/nixpkgs/nixos-24.05";

    # Optional but common: flake-utils for per-system outputs
    flake-utils.url = "github:numtide/flake-utils";
  };

  outputs = { self, nixpkgs, flake-utils }:
    flake-utils.lib.eachDefaultSystem (system:
      let
        pkgs = import nixpkgs { inherit system; };
      in
      {
        # A deterministic dev shell. Whatever your team uses,
        # it will always resolve to the same package versions.
        devShells.default = pkgs.mkShell {
          name = "platform-dev";

          # Example toolchain: node + python
          packages = [
            pkgs.nodejs_20
            pkgs.python312
            pkgs.git
          ];

          # A tiny bit of setup so tooling behaves predictably
          shellHook = ''
            export PYTHONUNBUFFERED=1
            export NODE_OPTIONS="--max-old-space-size=2048"
            echo "Entered reproducible dev shell on ${system}"
          '';
        };
      });
}

What this does (and why)

inputs pin sources. The important part: the actual versions resolve in flake.lock.
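eachDefaultSystem generates the same outputs for each of flake-utils’ default systems (x86_64-linux, aarch64-linux, x86_64-darwin, aarch64-darwin), so the shell ends up at devShells.<system>.default.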
devShells.default defines the environment. That “default” shell becomes addressable in CI.

Without a committed flake.lock, Nix re-resolves the inputs (for example, to the current tip of nixos-24.05), and two machines can end up on different nixpkgs revisions.
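
To see what is actually pinned, it helps to read the revisions straight out of the lock file. Here is a minimal sketch that needs neither Nix nor jq; locked_revs is a helper name I made up for illustration, and it assumes the usual layout where each locked input records its commit as a "rev" field (inputs without a git revision, such as path inputs, will not show up).

```shell
#!/usr/bin/env bash
# Sketch: list the git revisions recorded in a flake.lock.
# `locked_revs` is a hypothetical helper, not part of Nix or any other tool.
locked_revs() {
  local lock_file="${1:-flake.lock}"
  # flake.lock is JSON; locked inputs store their commit as "rev": "<hex sha>".
  # grep -o keeps only the matching fragment, sed strips everything but the sha.
  grep -o '"rev": *"[0-9a-f]*"' "$lock_file" | sed 's/.*"\([0-9a-f]*\)"$/\1/'
}
```

Running locked_revs flake.lock before and after a dependency update shows which revisions moved.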

The CI problem: caching Nix without safe cache keys

Nix keeps build outputs under /nix/store and records which store paths are valid in a database under /nix/var/nix/db. GitHub Actions caching can speed that up, but it is only safe if the cache key changes whenever the derivation inputs change.

The failure mode I saw once:

Someone updated flake.lock,
but the cache key was based on flake.nix only,
so the runner restored an old store cache,
and the build either failed weirdly or (worse) kept using mismatched artifacts.

So I made the cache key derived from flake.lock.
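
The failure mode is easy to reproduce without Nix at all. The sketch below uses throwaway files with made-up contents (the directory, the "aaaa"/"bbbb" revisions, and the file_key helper are all illustrative) to show that a key hashed from flake.nix does not move when only the lock file changes.

```shell
#!/usr/bin/env bash
# Sketch: compare a key derived from flake.nix with one derived from flake.lock.
file_key() {
  # Print the SHA-256 of a file (first field of sha256sum's output).
  sha256sum "$1" | awk '{print $1}'
}

demo_dir="$(mktemp -d)"
printf '{ description = "demo"; }\n' > "$demo_dir/flake.nix"
printf '{ "nodes": { "nixpkgs": { "locked": { "rev": "aaaa" } } } }\n' > "$demo_dir/flake.lock"

nix_key_before="$(file_key "$demo_dir/flake.nix")"
lock_key_before="$(file_key "$demo_dir/flake.lock")"

# Simulate a dependency update: only the lock file changes.
printf '{ "nodes": { "nixpkgs": { "locked": { "rev": "bbbb" } } } }\n' > "$demo_dir/flake.lock"

nix_key_after="$(file_key "$demo_dir/flake.nix")"
lock_key_after="$(file_key "$demo_dir/flake.lock")"

if [ "$nix_key_before" = "$nix_key_after" ]; then
  echo "flake.nix key unchanged: the old cache would be restored"
fi
if [ "$lock_key_before" != "$lock_key_after" ]; then
  echo "flake.lock key changed: cache miss, fresh build"
fi
```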

GitHub Actions workflow

`.github/workflows/ci.yml`

name: CI

on:
  pull_request:
  push:
    branches: [ "main" ]

jobs:
  nix-dev-shell-check:
    runs-on: ubuntu-latest

    steps:
      - name: Checkout
        uses: actions/checkout@v4

      # Install Nix (no magic: we need it to evaluate flakes).
      # No nix_path is set: flakes take nixpkgs from flake.lock, not from NIX_PATH.
      - name: Install Nix
        uses: cachix/install-nix-action@v27

      - name: Determine system and flake outputs
        id: nixmeta
        shell: bash
        run: |
          set -euo pipefail

          # GitHub runner is typically x86_64-linux for ubuntu-latest.
          # Nix uses a specific string for this.
          system="$(nix eval --raw --impure --expr 'builtins.currentSystem')"

          # Get a content-derived hash from flake.lock.
          # This is the anchor for reproducibility: if flake.lock changes, inputs changed.
          lock_hash="$(sha256sum flake.lock | awk '{print $1}')"

          # Evaluate what dev shell derivation Nix would build for the current system.
          # Flake outputs are keyed by system, so the attribute is devShells.<system>.default.
          # The drvPath changes whenever the shell's packages or shellHook change.
          dev_shell_drv="$(nix eval --raw ".#devShells.${system}.default.drvPath")"

          echo "system=$system" >> "$GITHUB_OUTPUT"
          echo "lock_hash=$lock_hash" >> "$GITHUB_OUTPUT"
          echo "dev_shell_drv=$dev_shell_drv" >> "$GITHUB_OUTPUT"

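      # Cache Nix store artifacts.
      # Key includes lock_hash + system + dev shell derivation path.
      #
      # Using drvPath avoids an accidental "same lock, different output" mismatch.
      # Caveat: /nix/store is root-owned and Nix tracks valid paths in /nix/var/nix/db,
      # so a plain directory restore may fail or be ignored depending on the runner setup.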
      - name: Cache Nix store
        uses: actions/cache@v4
        with:
          path: |
            /nix/store
          key: nix-store-${{ steps.nixmeta.outputs.system }}-${{ steps.nixmeta.outputs.lock_hash }}-${{ steps.nixmeta.outputs.dev_shell_drv }}

      # Build/check inside the dev shell.
      - name: Run commands in dev shell
        shell: bash
        run: |
          set -euo pipefail

          # `nix develop` enters the devShell defined in flake.nix.
          # It realizes the environment reproducibly from flake.lock.
          nix develop .#default -c bash -lc '
            node --version
            python --version
            git --version
          '

Walkthrough of the workflow (step by step)

1) Checkout

I pull the repo so flake.lock and flake.nix exist in the workspace.

2) Install Nix

I need Nix to evaluate flake outputs and run nix develop.

3) Determine system and flake outputs

This step creates three values:

system: the Nix system identifier. It matters because package closures differ by architecture/OS.
lock_hash: sha256sum flake.lock. If someone updates dependencies, this hash changes.
dev_shell_drv: the derivation path (drvPath) of the default dev shell for that system. Even with the same lock, if you change what the shell includes, the derivation changes.

These three values become the cache key.

4) Cache `/nix/store`

The cache key is:

nix-store-${system}-${lock_hash}-${dev_shell_drv}

That’s the safety guard. It ensures cache reuse only happens when the resolved environment should be identical.
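
Outside of the workflow file, the same key can be assembled by a small shell function. This is a sketch under my own assumptions: compose_cache_key is a hypothetical helper, and the refusal to continue on an empty component is something I added so that a failed nix eval cannot quietly produce a shorter, shared key.

```shell
#!/usr/bin/env bash
# Sketch: build the cache key from its three components.
# `compose_cache_key` is a hypothetical helper name.
compose_cache_key() {
  local system="$1" lock_file="$2" drv_path="$3"
  local lock_hash

  # Refuse to build a key from missing pieces; an empty component would
  # make unrelated environments share one cache entry.
  if [ -z "$system" ] || [ -z "$drv_path" ] || [ ! -f "$lock_file" ]; then
    echo "compose_cache_key: missing system, drv path, or lock file" >&2
    return 1
  fi

  lock_hash="$(sha256sum "$lock_file" | awk '{print $1}')"
  printf 'nix-store-%s-%s-%s\n' "$system" "$lock_hash" "$drv_path"
}
```

A call would look like compose_cache_key "$system" flake.lock "$dev_shell_drv", with both variables coming from the metadata step.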

5) Run inside the dev shell

nix develop .#default -c bash -lc '...' does two things:

materializes the dev shell reproducibly
runs the commands (node --version, etc.) in that environment

If the environment can’t be realized from the pinned inputs, the step exits non-zero and the job fails instead of falling back to whatever tools the runner image ships.

How I validated the cache correctness

I made two commits to test the “no silent mismatch” property:

Change only a comment in flake.nix (the lock shouldn’t change).
Update a dependency so flake.lock changes.

Expected behavior:

Case 1: cache should hit because neither flake.lock nor the dev shell’s drvPath changes.
Case 2: cache should miss because lock_hash changes, forcing a fresh store realization.

Then I confirmed that devShells.default.drvPath changed when the environment changed.
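
When comparing two runs, it is useful to know which part of the key moved. The sketch below assumes the three values from each run were saved as name=value lines (the same shape the workflow writes to $GITHUB_OUTPUT); the changed_components helper and the idea of saving those files are my own illustration, not something the workflow does on its own.

```shell
#!/usr/bin/env bash
# Sketch: report which cache-key components differ between two runs.
# Each input file holds lines such as `lock_hash=<sha256>`.
changed_components() {
  local before="$1" after="$2" name old new
  for name in system lock_hash dev_shell_drv; do
    # `|| true` keeps a missing entry from aborting under `set -e -o pipefail`.
    old="$(grep "^${name}=" "$before" | cut -d= -f2- || true)"
    new="$(grep "^${name}=" "$after" | cut -d= -f2- || true)"
    if [ "$old" != "$new" ]; then
      echo "$name"
    fi
  done
}
```

An empty result means the key is identical and the cache should hit; any printed name explains the miss.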

Common footguns I avoided

Using flake.nix as the cache anchor: it can change without changing the resolved inputs (needless misses), and it stays the same when only flake.lock is updated (stale hits).
Using only flake.lock: editing the package list or shellHook in flake.nix changes the environment without touching the lock; including dev_shell_drv catches that.
Not including system: caching an ARM store for an x86 runner (or vice versa) is a guaranteed mess.
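
For a sense of what the system component looks like on different runners, here is a rough sketch that maps uname output to the Nix system string. The nix_system_from_uname helper is hypothetical and only covers the four common systems; the authoritative value is still builtins.currentSystem as evaluated by Nix itself.

```shell
#!/usr/bin/env bash
# Sketch: approximate the Nix system string from `uname`.
# Arguments are optional and default to the current machine.
nix_system_from_uname() {
  local machine="${1:-$(uname -m)}" kernel="${2:-$(uname -s)}"
  local arch os
  case "$machine" in
    x86_64|amd64)  arch="x86_64" ;;
    aarch64|arm64) arch="aarch64" ;;
    *) echo "unsupported machine: $machine" >&2; return 1 ;;
  esac
  case "$kernel" in
    Linux)  os="linux" ;;
    Darwin) os="darwin" ;;
    *) echo "unsupported kernel: $kernel" >&2; return 1 ;;
  esac
  printf '%s-%s\n' "$arch" "$os"
}
```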

What I learned

I learned that the hard part of reproducible platform environments isn’t writing the Nix flake; it’s making CI caching deterministic without lying to yourself. By deriving the cache key from flake.lock plus the exact dev shell derivation (drvPath) and the Nix system, I got fast CI runs that remain safe and reproducible as dependencies evolve.