Docker Compose standardization

Over two and a half years, my lab grew from a handful of containers to around forty self-hosted services spanning media, productivity, monitoring, authentication, and networking. Each service started life as a fresh docker-compose.yaml written at the moment of deployment – copy-pasted from whichever similar service I'd set up most recently, tweaked to fit, and forgotten about the moment it was running.

None of it was broken. All of it was drifting.

The problem

Every service I added was subtly different from the ones before it. The Traefik labels on my Jellyfin stack used one naming convention; the ones on Immich used another. My Homepage widget labels were consistent in spirit but inconsistent in detail – some had descriptions, some didn't, some referenced an external icon library, some used emoji. Adding a new service meant opening three existing compose files, mentally diffing them, and hoping I hadn't introduced a typo that would surface hours later as a broken route or a missing healthcheck.

The individual services weren't hard. The system of services was.

A few concrete signs of trouble:

  • Updating a shared pattern was a nightmare. When Traefik v3 changed how you reference services in router labels, I needed to update every one of my ~30 Traefik-exposed containers. There was no single place to make the change; it was thirty individual file edits, each of them a chance to introduce a typo.
  • Onboarding new services was slow and error-prone. Adding a service meant 15-20 minutes of copy-paste-tweak against a working template, plus a 30% chance I'd miss something and spend another 15 minutes debugging why the router wasn't picking up the container.
  • Config drift across services was impossible to detect. There was no canonical "what should a service look like" – just a growing collection of slightly different snowflakes, all nominally working.

None of this was urgent. It was the kind of slow decay that never forces you to fix it, but keeps costing you a little friction every time you touch anything.

The constraints

A few things made this harder than a greenfield rewrite:

  1. No downtime. These are services I actively use every day – AdGuard DNS, Vaultwarden, Immich, Paperless-ngx, the whole stack. I couldn't tear everything down and start over. The migration had to be incremental, one service at a time, on a running system.
  2. Had to stay with plain docker compose. I wanted to preserve the ability to deploy any service with docker compose up -d from its own directory. No Kubernetes yet (that's a separate project), no Ansible layer, no custom tooling. Whatever pattern I chose had to work inside Compose's own feature set.
  3. Preserve per-service flexibility. Services have different needs – Frigate needs GPU passthrough, Authentik needs a Postgres sidecar, Traefik needs the Docker socket. The standardization had to reduce duplication without forcing every service into a rigid mold.

The approach: YAML anchors

Docker Compose has a feature most people never touch: top-level extension fields. Any top-level key starting with x- is ignored by Compose itself, which makes it a natural home for YAML anchors. You can define reusable blocks at the top of a compose file, then merge them into a service with <<: [*anchor1, *anchor2].

This turned out to be exactly the tool for the job. I could define the shared label patterns once at the top of each compose file, reference them from the service definition, and keep the service block itself focused on the actually-unique parts (image, volumes, environment, networks).

Three anchors, one per labeling system:

x-traefik-labels: &traefik-template
  traefik.enable: true
  traefik.http.routers.jellyfin-http.rule: Host(`${APP_NAME}.${BASE_DOMAIN}`)
  traefik.http.routers.jellyfin-http.entrypoints: web
  traefik.http.routers.jellyfin-http.middlewares: redirect-to-https
  traefik.http.middlewares.redirect-to-https.redirectscheme.scheme: https
  traefik.http.routers.jellyfin-https.rule: Host(`${APP_NAME}.${BASE_DOMAIN}`)
  traefik.http.routers.jellyfin-https.entrypoints: websecure
  traefik.http.routers.jellyfin-https.tls: true
  traefik.http.routers.jellyfin-https.tls.certresolver: cloudflare
  traefik.http.routers.jellyfin-https.service: ${APP_NAME}
  traefik.http.services.jellyfin.loadbalancer.server.port: ${APP_PORT}

x-homepage-labels: &homepage-template
  homepage.group: ${HOMEPAGE_GROUP}
  homepage.name: ${HOMEPAGE_NAME}
  homepage.icon: ${HOMEPAGE_ICON}
  homepage.href: https://${APP_NAME}.${BASE_DOMAIN}
  homepage.description: ${HOMEPAGE_DESCRIPTION}

x-kuma-labels: &kuma-template
  kuma.jellyfin.http.name: ${HOMEPAGE_NAME}
  kuma.jellyfin.http.url: https://${APP_NAME}.${BASE_DOMAIN}
  kuma.jellyfin.http.max_retries: 3
  kuma.jellyfin.http.interval: 60

services:
  jellyfin:
    container_name: ${APP_NAME}
    image: jellyfin/jellyfin:${IMAGE_TAG:-latest}
    restart: unless-stopped
    volumes:
      - ./config:/config
      - /mnt/media:/media:ro
    networks:
      - frontend
    labels:
      <<: [*traefik-template, *homepage-template, *kuma-template]

networks:
  frontend:
    external: true

All the per-service configuration lives in a sibling .env file:

# General Application Settings
APP_NAME=jellyfin
APP_PORT=8096
IMAGE_TAG=latest

# Domain/Network Settings
BASE_DOMAIN=domain.tld

# Homepage Dashboard Integration
HOMEPAGE_GROUP=Media
HOMEPAGE_NAME=Jellyfin
HOMEPAGE_ICON=jellyfin.png
HOMEPAGE_DESCRIPTION=Media streaming

Adding a new service is now: copy the template compose file, copy a template .env, edit the .env, docker compose up -d. Done. Maybe three minutes if the service has a straightforward image.
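
The copy-and-edit steps can be sketched as a small shell function. This is an illustration, not the workflow's actual tooling: the function name new_service, the template directory layout, and the choice to rewrite only APP_NAME and APP_PORT are all assumptions.

```shell
#!/bin/sh
# Sketch: scaffold a new service directory from a template directory.
# Assumes the template holds docker-compose.yaml and .env side by side
# (hypothetical layout), and that names and ports are plain values.
new_service() {
  template_dir=$1   # e.g. /apps/_template (hypothetical path)
  target_dir=$2     # e.g. /apps/immich
  app_name=$3
  app_port=$4

  # Never clobber a service that already exists.
  if [ -e "$target_dir" ]; then
    echo "refusing to overwrite $target_dir" >&2
    return 1
  fi

  mkdir -p "$target_dir" || return 1
  cp "$template_dir/docker-compose.yaml" "$target_dir/" || return 1

  # Rewrite only the two values that always change. The Homepage fields
  # and image tag are left in place for hand editing.
  sed -e "s/^APP_NAME=.*/APP_NAME=$app_name/" \
      -e "s/^APP_PORT=.*/APP_PORT=$app_port/" \
      "$template_dir/.env" > "$target_dir/.env"
}
```

Usage would look like new_service /apps/_template /apps/immich immich 2283. Running docker compose config in the new directory before docker compose up -d prints the resolved file, with anchors merged and variables substituted, which is a cheap way to catch a typo early.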

The gotcha that cost me some time

There's a subtle Docker Compose limitation you don't find until you trip over it: Compose interpolates environment variables in label values, but not in label keys.

In plain English: you can write traefik.http.routers.jellyfin-https.rule: Host(`${APP_NAME}.${BASE_DOMAIN}`) and the variables on the right side get substituted when Compose parses the file. But if you try to template the left side – something like traefik.http.routers.${APP_NAME}-https.rule: – Compose treats the literal string ${APP_NAME} as part of the label key and passes it through unchanged. Your router ends up registered as ${APP_NAME}-https and Traefik never matches a request to it.

I hit this the first time I tried to make the Traefik anchor fully generic. I wanted one canonical x-traefik-labels block I could paste into every compose file without editing the hostnames. It was frustrating to realize that wasn't possible – Compose's interpolation rules simply don't reach into label keys.
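
Because the failure is silent, a small lint can catch it before deployment. The sketch below is a heuristic, not a YAML parser: it flags any line whose mapping key (the text before the first colon) contains ${. The function name find_templated_keys is made up for this example.

```shell
#!/bin/sh
# Sketch: flag mapping keys that contain ${...}, which Compose leaves
# uninterpolated. Heuristic only: it inspects the text before the first
# colon on each line and does not parse YAML.
find_templated_keys() {
  # $1 = compose file.
  # Prints "lineno:line" for every offending key. Exit status follows
  # grep: 0 if at least one was found, 1 if the file is clean.
  grep -nE '^[[:space:]]*[^:[:space:]]*\$\{' "$1"
}
```

Variables in values, such as homepage.href: https://${APP_NAME}.${BASE_DOMAIN}, are not reported, since the ${ appears only after the colon.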

The workaround is ugly but pragmatic: hardcode the service name inside the anchor keys, and use env vars for everything in the values. So traefik.http.routers.jellyfin-http.rule keeps the literal jellyfin-http on the key side, while Host(`${APP_NAME}.${BASE_DOMAIN}`) on the value side is templated. When I copy the template for a new service, I do a single find-and-replace from jellyfin to newservice in the anchor keys – everything else is driven by the .env.

It's a compromise, not a victory. The fully generic version I wanted would require switching to Docker Compose's include: directive with an external YAML file (Compose 2.20+) or moving off Compose entirely. For now, a 10-second find-and-replace is an acceptable trade for keeping the simplicity of self-contained compose files.
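
For anyone who would rather script the find-and-replace, here is a minimal sed sketch. It rewrites the old name only where it appears as a segment of a dotted key, before the first colon, so values and image references are left alone. The function name rename_label_keys is hypothetical, and both names are assumed to be plain lowercase strings with no regex metacharacters.

```shell
#!/bin/sh
# Sketch: rename the hardcoded service name inside dotted label keys.
# usage: rename_label_keys OLD NEW < docker-compose.yaml > new-compose.yaml
# Assumes OLD and NEW contain only letters, digits, and dashes.
rename_label_keys() {
  old=$1
  new=$2
  # Group 1: indentation plus the key up to the dot before OLD.
  # Group 2: the "-" or "." after OLD and the rest of the key up to
  # and including the first colon. Values are never touched.
  sed -E "s/^([[:space:]]*[^:[:space:]]*\.)${old}([-.][^:]*:)/\1${new}\2/"
}
```

The service key under services: (jellyfin:) and the image: line are deliberately not matched, since neither is a dotted label key. Those two still need a manual edit.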

Automating the last-mile rollout

After I'd standardized the core Traefik and Homepage labels, I added a new labeling system for Uptime Kuma – my monitoring layer – using AutoKuma, which auto-discovers services from Docker labels. I needed to retrofit Kuma labels across every compose file that had Traefik labels already.

Hand-editing 30 files sounded miserable, so I wrote a small bash script that walked /apps/, found every compose file with Traefik labels, read the APP_NAME from the sibling .env, and injected a new x-kuma-labels anchor block plus the merge reference in the service's labels: section. I git-committed every .env and compose file first, ran the script, reviewed the diff, and redeployed everything in a single loop:

for d in /apps/*/; do (cd "$d" && docker compose up -d); done

AutoKuma picked up every new monitor within a minute. Thirty services, monitored, in under ten minutes of wall-clock time. The script is in the lab repo.
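
The script in the repo is the authoritative version. As an illustration only, the discovery half of such a script (find the compose files with Traefik labels, read APP_NAME from the sibling .env) might look like the sketch below. The function name, the docker-compose.yaml filename, and the use of traefik.enable as the marker are assumptions, and the step that injects the anchor is left out.

```shell
#!/bin/sh
# Sketch: list app directories whose compose file carries Traefik labels,
# together with the APP_NAME from the sibling .env. Discovery only; the
# label-injection step is not shown.
list_traefik_apps() {
  apps_root=$1    # e.g. /apps
  for dir in "$apps_root"/*/; do
    compose="${dir}docker-compose.yaml"
    env_file="${dir}.env"

    # Skip directories that lack either file.
    if [ ! -f "$compose" ] || [ ! -f "$env_file" ]; then
      continue
    fi

    # Assumed marker for "this service is exposed through Traefik".
    grep -q 'traefik\.enable' "$compose" || continue

    # If APP_NAME is assigned more than once, the last one wins.
    app_name=$(sed -n 's/^APP_NAME=//p' "$env_file" | tail -n 1)
    if [ -n "$app_name" ]; then
      printf '%s\t%s\n' "$app_name" "$dir"
    fi
  done
}
```

Printing a tab-separated name and path per line keeps the output easy to review by eye and easy to feed into a while read loop for the actual edit.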

The outcome

A few concrete improvements I can point to:

  • Adding a new service: ~3 minutes instead of 15-20. The reduction came entirely from eliminating the "copy from an existing one and mentally diff" step.
  • Changing a shared pattern is trivial. When I eventually needed to update Traefik router label syntax again, I did it in the template, regenerated a few derived pieces, and moved on.
  • Config drift is easy to spot. Because the shared patterns are defined as anchors at the top of each file, every compose file has one canonical block to check, and all of those blocks derive from a single template. Any drift is visible in a git diff against the template.
  • The setup is ready for what comes next. The whole point of doing this now was to make the upcoming k3s migration tractable – Kubernetes manifests map more cleanly from a uniform Compose baseline than from 40 snowflakes. Without standardization first, a Compose-to-k3s migration would have been two projects fighting each other.
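
The "diff against the template" check can be automated along these lines. This is a sketch under assumptions: it compares only the lines above the top-level services: key (the x-* anchor blocks), and it replaces each file's hardcoded service name with a placeholder so the names themselves do not count as drift. The function name anchor_drift is hypothetical.

```shell
#!/bin/sh
# Sketch: compare a service's anchor blocks against the template's.
# usage: anchor_drift TEMPLATE_FILE TEMPLATE_NAME SERVICE_FILE SERVICE_NAME
# Exit status follows diff: 0 when identical, 1 when they differ.
anchor_drift() {
  tpl=$1; tpl_name=$2; svc=$3; svc_name=$4
  a=$(mktemp) || return 2
  b=$(mktemp) || return 2

  # Keep everything before "services:", then normalize the service name.
  awk '/^services:/{exit} {print}' "$tpl" | sed "s/$tpl_name/__APP__/g" > "$a"
  awk '/^services:/{exit} {print}' "$svc" | sed "s/$svc_name/__APP__/g" > "$b"

  diff -u "$a" "$b"
  rc=$?
  rm -f "$a" "$b"
  return $rc
}
```

An empty diff means the service's anchors match the template. Anything printed is a candidate for either fixing the service or promoting the change into the template.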

What I'd do differently

The honest answer, with hindsight: I should have done this at service #5, not service #40.

Standardization has a scaling cost. Standardizing 5 services is an afternoon. Standardizing 40 is a long weekend plus a week of tweaking stragglers. The cost scales with the number of things that already exist and need migrating, while the benefit – predictable, maintainable infrastructure – scales with the number of things you'll have in the future. The earlier you do it, the better the ROI.

I also wish I'd standardized the contents of the .env files earlier. Even after the compose templates were consistent, the variable names themselves still varied from service to service – APP_NAME vs SERVICE_NAME, BASE_DOMAIN vs DOMAIN, etc. Cleaning those up was a second pass that could have been part of the first.
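
A check like the following would have surfaced those naming mismatches in the first pass. It is a sketch: the function name missing_env_keys is made up, and the list of required keys is whatever the template expects.

```shell
#!/bin/sh
# Sketch: report which required variable names a .env file does not define.
# usage: missing_env_keys ENV_FILE KEY...
# Returns 0 when every key is present, 1 otherwise.
missing_env_keys() {
  env_file=$1
  shift
  rc=0
  for key in "$@"; do
    # A key counts as present only as "KEY=" at the start of a line.
    if ! grep -q "^${key}=" "$env_file"; then
      printf '%s: missing %s\n' "$env_file" "$key"
      rc=1
    fi
  done
  return $rc
}
```

Run against every .env with the template's variable names (APP_NAME, APP_PORT, BASE_DOMAIN, and so on), a file that still uses SERVICE_NAME or DOMAIN shows up as missing the canonical key.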

And if I were starting a new lab from scratch today, I'd probably skip this Compose-level standardization entirely and go straight to Kubernetes with FluxCD and Helm charts. The standardization effort this case study describes is, in a sense, a way of retrofitting the benefits of a declarative config system onto a tool (Compose) that wasn't designed for it. Kubernetes gives you those benefits natively. But that's only the right answer if you already know Kubernetes – and for the two and a half years I've been running this lab, the right answer has been "make Compose work better," not "switch tools."

Tech stack

Docker Compose, Traefik v3, Homepage, Uptime Kuma + AutoKuma, YAML anchors, Bash (for the automation script), Cloudflare (DNS-01 certresolver), Git (for tracking all of it). Full repo: github.com/hexiejexie/homelab.

The takeaway for client work

The reason I'm writing this up is that the underlying principles – templatize what's shared, parameterize what varies, automate the tedious rollout, and do it early – apply to production infrastructure at every scale. The same pattern works for fleets of network devices, CI/CD pipelines across repos, monitoring configs across services, Kubernetes manifests across namespaces, and Ansible playbooks across hosts. The tool changes; the discipline doesn't.

If your team is sitting on a growing pile of different configuration files and the thought of touching any of them makes someone wince, that's the signal it's time to standardize. I can help you do it incrementally, without downtime, without forcing a tooling rewrite. Get in touch if this sounds like your situation.
