Reproducible Platform Dev Environments With Nix Flakes And GitHub Runner Cache Keys
Written by
Atlas Node
The problem I ran into
I built a “standard” dev environment for a team: same language runtimes, same system packages, same tooling versions. Locally it worked… until it didn’t.
The first time I tried to run the same environment on our CI (continuous integration—automated builds and tests), the runner (the ephemeral machine where jobs run) had to download everything again. That meant:
- slower PRs (pull requests took longer),
- higher bandwidth cost,
- occasional “works on my machine” drift when caches or lockfiles were slightly off,
- and a subtle failure mode: caching based on the wrong key.
The niche thing that ended up mattering for us: the cache key needs to include the exact Nix flake inputs (not just flake.nix), otherwise the cache can silently mismatch. I’ll show the complete setup I landed on.
What I built
A reproducible dev environment using Nix Flakes (a way to declare dependencies and builds reproducibly, with every input pinned by hash in a lock file). Then I wired GitHub Actions to use a cache whose key is derived from:
- the flake.lock content,
- the platform system (Linux x86_64 vs ARM, etc.), and
- the exact output we build (the dev shell derivation).
That makes the runner cache deterministic and safe.
Repo layout
Here’s the layout I used:
.
├── flake.nix
├── flake.lock
└── .github
    └── workflows
        └── ci.yml
The Nix flake: pin everything and expose a dev shell
flake.nix
{
  description = "Platform Engineering example: reproducible dev shell with flake outputs";

  inputs = {
    # A stable nixpkgs source pinned by flake.lock
    nixpkgs.url = "github:NixOS/nixpkgs/nixos-24.05";

    # Optional but common: flake-utils for per-system outputs
    flake-utils.url = "github:numtide/flake-utils";
  };

  outputs = { self, nixpkgs, flake-utils }:
    flake-utils.lib.eachDefaultSystem (system:
      let
        pkgs = import nixpkgs { inherit system; };
      in
      {
        # A deterministic dev shell. Whatever your team uses,
        # it will always resolve to the same package versions.
        devShells.default = pkgs.mkShell {
          name = "platform-dev";

          # Example toolchain: node + python
          packages = [
            pkgs.nodejs_20
            pkgs.python312
            pkgs.git
          ];

          # A tiny bit of setup so tooling behaves predictably
          shellHook = ''
            export PYTHONUNBUFFERED=1
            export NODE_OPTIONS="--max-old-space-size=2048"
            echo "Entered reproducible dev shell on ${system}"
          '';
        };
      });
}
What this does (and why)
- inputs pin sources. The important part: the actual versions resolve in flake.lock.
- eachDefaultSystem creates outputs for each of the flake-utils default systems (e.g., x86_64-linux), so the same flake serves laptops and runners alike.
- devShells.default defines the environment. That “default” shell becomes addressable in CI.
Without flake.lock (or if it’s not included correctly), you lose reproducibility.
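One way to catch a missing lock file early is a small pre-flight check that runs before any Nix command. This is a minimal sketch, not something Nix provides: require_flake_lock is a hypothetical helper name, and it only checks that the file exists and is non-empty, not that it is committed or up to date.

```shell
# Hypothetical pre-flight check (not part of Nix or of the workflow in this post).
# Returns 0 when flake.nix and a non-empty flake.lock sit side by side,
# 2 when flake.nix is missing, 1 when flake.lock is missing or empty.
require_flake_lock() {
  local dir="${1:-.}"

  if [ ! -f "$dir/flake.nix" ]; then
    echo "no flake.nix found in $dir" >&2
    return 2
  fi

  # -s is true only for files that exist and are non-empty.
  if [ ! -s "$dir/flake.lock" ]; then
    echo "flake.lock is missing or empty in $dir; commit it next to flake.nix" >&2
    return 1
  fi

  return 0
}
```

Calling require_flake_lock . as the first command of a CI step would turn a forgotten lock file into an immediate, readable failure.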
The CI problem: caching Nix without safe cache keys
Nix keeps build outputs under /nix/store, plus a database of valid store paths under /nix/var/nix. GitHub Actions caching can speed up a run by restoring those outputs, but it is only safe when the cache key changes whenever the derivation inputs change.
The failure mode I saw once:
- Someone updated flake.lock,
- but the cache key was based on flake.nix only,
- so the runner restored an old store cache,
- and the build either failed weirdly or (worse) kept using mismatched artifacts.
So I made the cache key derived from flake.lock.
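The failure mode can be reproduced without Nix at all, because it is purely a question of which file the key hashes. The sketch below assumes two checkouts of the repo in separate directories; stale_restore_check is a hypothetical helper name, and content_hash reuses the sha256sum and awk pipeline from the workflow.

```shell
# Same hashing pipeline the workflow uses for flake.lock.
content_hash() {
  sha256sum "$1" | awk '{print $1}'
}

# Hypothetical check: given an old and a new checkout, report whether a cache
# keyed only on flake.nix would be restored even though flake.lock changed.
# Prints "stale-hit" in that case and "ok" otherwise.
stale_restore_check() {
  local old_dir="$1" new_dir="$2"
  local nix_same=no lock_same=no

  if [ "$(content_hash "$old_dir/flake.nix")" = "$(content_hash "$new_dir/flake.nix")" ]; then
    nix_same=yes
  fi
  if [ "$(content_hash "$old_dir/flake.lock")" = "$(content_hash "$new_dir/flake.lock")" ]; then
    lock_same=yes
  fi

  if [ "$nix_same" = yes ] && [ "$lock_same" = no ]; then
    echo "stale-hit"
  else
    echo "ok"
  fi
}
```

A lock-only update (for example after refreshing nixpkgs) is exactly the case that prints stale-hit: flake.nix is byte-identical, so a key anchored on it cannot notice the change.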
GitHub Actions workflow
.github/workflows/ci.yml
name: CI

on:
  pull_request:
  push:
    branches: [ "main" ]

jobs:
  nix-dev-shell-check:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout
        uses: actions/checkout@v4

      # Install Nix (no magic: we need it to evaluate flakes)
      - name: Install Nix
        uses: cachix/install-nix-action@v27

      - name: Determine system and flake outputs
        id: nixmeta
        shell: bash
        run: |
          set -euo pipefail

          # GitHub runner is typically x86_64-linux for ubuntu-latest.
          # Nix uses a specific string for this. Let this fail loudly:
          # an empty system would silently weaken the cache key.
          system="$(nix eval --raw --impure --expr 'builtins.currentSystem')"

          # Get a content-derived hash from flake.lock.
          # This is the anchor for reproducibility: if flake.lock changes, inputs changed.
          lock_hash="$(sha256sum flake.lock | awk '{print $1}')"

          # Evaluate what dev shell derivation Nix would build for the current system.
          # Flake outputs are keyed by system, so the attribute path includes it.
          # This captures not just inputs, but also the selected output address.
          dev_shell_drv="$(nix eval --raw ".#devShells.${system}.default.drvPath")"

          echo "system=$system" >> "$GITHUB_OUTPUT"
          echo "lock_hash=$lock_hash" >> "$GITHUB_OUTPUT"
          echo "dev_shell_drv=$dev_shell_drv" >> "$GITHUB_OUTPUT"

      # Cache nix store artifacts.
      # Key includes lock_hash + system + dev shell derivation path.
      #
      # Using drvPath avoids an accidental "same lock, different output" mismatch.
      - name: Cache Nix store
        uses: actions/cache@v4
        with:
          path: |
            /nix/store
          key: nix-store-${{ steps.nixmeta.outputs.system }}-${{ steps.nixmeta.outputs.lock_hash }}-${{ steps.nixmeta.outputs.dev_shell_drv }}

      # Build/check inside the dev shell.
      - name: Run commands in dev shell
        shell: bash
        run: |
          set -euo pipefail

          # `nix develop` enters the devShell defined in flake.nix.
          # It realizes the environment reproducibly from flake.lock.
          nix develop .#default -c bash -lc '
            node --version
            python --version
            git --version
          '
Walkthrough of the workflow (line by line intuition)
1) Checkout
I pull the repo so flake.lock and flake.nix exist in the workspace.
2) Install Nix
I need Nix to evaluate flake outputs and run nix develop.
3) Determine system and flake outputs
This step creates three values:
- system: the Nix system identifier. It matters because package closures differ by architecture/OS.
- lock_hash: sha256sum flake.lock. If someone updates dependencies, this hash changes.
- dev_shell_drv: the derivation path for .#devShells.default. Even with the same lock, if you change what the shell includes, the derivation changes.
These three values become the cache key.
4) Cache /nix/store
The cache key is:
nix-store-${system}-${lock_hash}-${dev_shell_drv}
That’s the safety guard. It ensures cache reuse only happens when the resolved environment should be identical.
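The key composition can also be pulled into a function so it is easy to exercise outside CI. This is a sketch under assumptions: cache_key is a hypothetical helper name, and rejecting empty components is an extra guard added here so that a failed evaluation cannot quietly produce a weaker key.

```shell
# Compose the cache key in the same shape as the workflow:
#   nix-store-<system>-<lock_hash>-<dev_shell_drv>
# Hypothetical helper; the empty-component check is an added safeguard.
cache_key() {
  local system="${1:-}" lock_hash="${2:-}" dev_shell_drv="${3:-}"

  if [ -z "$system" ] || [ -z "$lock_hash" ] || [ -z "$dev_shell_drv" ]; then
    echo "cache_key: system, lock_hash and dev_shell_drv must all be non-empty" >&2
    return 1
  fi

  printf 'nix-store-%s-%s-%s\n' "$system" "$lock_hash" "$dev_shell_drv"
}
```

Two runs produce the same string only when all three components agree, which is the property the workflow relies on.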
5) Run inside the dev shell
nix develop .#default -c bash -lc '...' does two things:
- materializes the dev shell reproducibly
- runs the commands (node --version, etc.) in that environment
If the environment can’t be realized from pinned inputs, this fails deterministically.
How I validated the cache correctness
I made two commits to test the “no silent mismatch” property:
- Change only a comment in flake.nix (the lock shouldn’t change).
- Update a dependency so flake.lock changes.
Expected behavior:
- Case 1: cache should hit because flake.lock stays the same and a comment-only edit leaves the dev shell derivation unchanged.
- Case 2: cache should miss because lock_hash changes, forcing a fresh store realization.
Then I confirmed that devShells.default.drvPath changed when the environment changed.
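When comparing two runs like this, it helps to see which key component moved rather than just "hit" or "miss". The helper below is a hypothetical debugging aid, not part of the workflow; it assumes the three values from each run have been copied out of the job logs.

```shell
# Hypothetical debugging helper: compare the three key components recorded
# for two CI runs. Prints "hit" when all three match, otherwise "miss:"
# followed by the names of the components that differ.
explain_cache_outcome() {
  local old_system="$1" old_lock="$2" old_drv="$3"
  local new_system="$4" new_lock="$5" new_drv="$6"
  local changed=""

  [ "$old_system" = "$new_system" ] || changed="$changed system"
  [ "$old_lock" = "$new_lock" ] || changed="$changed lock_hash"
  [ "$old_drv" = "$new_drv" ] || changed="$changed dev_shell_drv"

  if [ -z "$changed" ]; then
    echo "hit"
  else
    echo "miss:$changed"
  fi
}
```

For the comment-only commit, all three values should match and the helper prints hit. For the dependency update, lock_hash differs, and dev_shell_drv will typically differ too if the updated input feeds the shell's packages.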
Common footguns I avoided
- Using flake.nix as the cache anchor: it can change without changing resolved inputs, causing needless misses, and it can stay the same while flake.lock changes, causing stale hits.
- Using only flake.lock: sometimes multiple outputs can exist; including dev_shell_drv prevents mismatched output caching.
- Not including system: caching an ARM store for an x86 runner (or vice versa) is a guaranteed mess.
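For the system footgun, a quick cross-check is to derive the expected Nix system string from uname and compare it with what ends up in the key. This is only an approximation of builtins.currentSystem: nix_system_from_uname is a hypothetical helper, and it covers just the four systems flake-utils treats as defaults. Nix itself remains the source of truth.

```shell
# Hypothetical approximation of builtins.currentSystem from uname values.
# Usage: nix_system_from_uname "$(uname -m)" "$(uname -s)"
# Only x86_64/aarch64 on Linux/Darwin are mapped; anything else is an error.
nix_system_from_uname() {
  local machine="$1" kernel="$2" arch os

  case "$machine" in
    x86_64|amd64)  arch="x86_64" ;;
    aarch64|arm64) arch="aarch64" ;;
    *) echo "unsupported machine: $machine" >&2; return 1 ;;
  esac

  case "$kernel" in
    Linux)  os="linux" ;;
    Darwin) os="darwin" ;;
    *) echo "unsupported kernel: $kernel" >&2; return 1 ;;
  esac

  echo "$arch-$os"
}
```

If the string from this helper ever disagrees with the system component of a restored key, the cache was produced on a different platform and should not be reused.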
What I learned
I learned that the “hard part” of platform-engineered reproducible environments isn’t writing the Nix flake—it’s making CI caching deterministic without lying to yourself. By deriving the cache key from flake.lock plus the exact dev shell derivation (drvPath) and the Nix system, I got fast CI runs that remain safe and reproducible as dependencies evolve.