Just say no to :latest
2022-Mar-02 • by David Norton
Don't specify latest in your Dockerfile! Or anywhere else! Do you want to live in a van down by the river?
FROM alpine:latest
Using latest breaks one of the core requirements of continuous delivery: reproducible, idempotent builds. At best, this causes problems when you try to build your project; at worst, it causes a production failure.
Perhaps even worse than specifying latest in a Dockerfile is specifying latest in a Kubernetes Pod manifest. At least if you use latest in your Dockerfile to create a versioned image, you can roll back to your previous versioned image if something goes wrong. If your deployment manifest specifies a latest image, then the image can change any time a new pod rolls out, and you are at the mercy of the maintainers not to break compatibility. This could happen on a weekend or in the middle of the night, when a node goes bad!
# BAD:
image: "nginx:latest"
# GOOD:
image: "nginx:1.21.6"
(This brings up an interesting side point: Docker Hub and most other registries allow mutable tags by default, so nginx:1.21.6 might not be the same image today as it was yesterday. In reality, you probably need a mechanism to enforce tag immutability, e.g., your own registry mirror, or referring to images by SHA-256 digest.)
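The distinction between latest, a version tag, and a digest can be checked mechanically. The following is a minimal Python sketch, not part of any Docker or Kubernetes tooling: the function name classify_image_ref is hypothetical, and the parsing is deliberately simplified (it assumes well-formed references and does not validate the digest itself).

```python
def classify_image_ref(ref: str) -> str:
    """Classify a container image reference by how it is pinned.

    Returns one of: "digest", "latest", "untagged", "tag".
    """
    # A digest reference looks like name@sha256:<hex> and is immutable.
    if "@" in ref:
        return "digest"
    # Only the last path segment can carry a tag; earlier colons belong
    # to a registry host:port such as localhost:5000/nginx.
    last_segment = ref.rsplit("/", 1)[-1]
    if ":" not in last_segment:
        # A missing tag is resolved as latest, so it is just as risky.
        return "untagged"
    tag = last_segment.split(":", 1)[1]
    return "latest" if tag == "latest" else "tag"


if __name__ == "__main__":
    for ref in ["nginx:latest", "nginx", "nginx:1.21.6"]:
        print(ref, "->", classify_image_ref(ref))
```

A check like this could fail a build on "latest" or "untagged" and merely warn on "tag", since a version tag is better than latest but still mutable.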
Latest dependencies exist in most ecosystems
You can have unversioned dependencies in your installation script:
# BAD:
pip install awscli
# GOOD:
pip install awscli==1.22.60
Or in your package.json:
# BAD if you don't have a lock file
"dependencies": {
  "baz": ">1.0.2"
}
Or your Terraform provider:
# BAD if you don't have a lock file
terraform {
  required_providers {
    mycloud = {
      source  = "hashicorp/aws"
      version = ">= 1.0"
    }
  }
}
Or your Terraform module:
# BAD:
module "gitlabrunner" {
  source = "npalm/gitlab-runner/aws"
}
# GOOD:
module "gitlabrunner" {
  source  = "npalm/gitlab-runner/aws"
  version = "1.2.3"
}
There are any number of other ways this can play out. Whenever you pull in external code or binaries, consider how that dependency is versioned and how your build process pulls it in.
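For the pip case above, one way to act on this is to scan a requirements file for entries that are not pinned to an exact version. This is a rough sketch under stated assumptions: the function name unpinned_requirements is hypothetical, the pattern only recognises simple name==version lines, and anything else (ranges, wildcards, bare names, URLs) is reported as unpinned.

```python
import re

# Matches "name==version" with optional extras and an optional
# environment marker, e.g. "awscli==1.22.60". Wildcards such as
# "2.*" are rejected because they do not pin an exact version.
PINNED = re.compile(
    r"^[A-Za-z0-9][A-Za-z0-9._-]*(\[[^\]]+\])?\s*==\s*[^\s*;,]+\s*(;.*)?$"
)


def unpinned_requirements(text: str) -> list[str]:
    """Return the requirement lines that are not pinned with ==."""
    unpinned = []
    for raw in text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = raw.split("#", 1)[0].strip()
        # Skip blank lines and pip options such as "-r other.txt".
        if not line or line.startswith("-"):
            continue
        if not PINNED.match(line):
            unpinned.append(line)
    return unpinned


if __name__ == "__main__":
    sample = "awscli\nawscli==1.22.60\nrequests>=2.0\n"
    print(unpinned_requirements(sample))
```

Run in CI, a non-empty result would be a reason to fail the build or at least flag the change for review.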
Lock files
Thankfully, many frameworks provide a mechanism that allows easy updates while keeping source-controlled versions and hashsums: lock files. These lock files are intended to be generated with a specific command and committed to source control. They are then used in CI to pull the exact same dependencies at build time. For example, terraform init -upgrade will pull in the latest dependencies allowed by the version constraints and update the lock file, and a later terraform init will pull in those exact same versions.
I think this provides the best of both worlds: the latest and greatest through more permissive provider version constraints, plus the predictability of fixed dependency versions.
Take advantage of these wherever you can, but remember two things:
- Commit the lock files to source control!
  - If you need to take advantage of a new feature, bug fix, or security fix, update the provider version constraint, run terraform init -upgrade, and commit the updated lock file.
- Do not update the lock files during CI (e.g. run terraform init, not terraform init -upgrade).
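The "do not update lock files during CI" rule can be enforced with a guard that fingerprints the lock file before the build and compares afterwards. The sketch below is illustrative only: the helper names file_digest and lock_file_changed are hypothetical, and it assumes the lock file sits at a known path in the checkout.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def lock_file_changed(path: Path, expected_digest: str) -> bool:
    """Return True if the file no longer matches the recorded digest."""
    return file_digest(path) != expected_digest


if __name__ == "__main__":
    lock = Path(".terraform.lock.hcl")
    if lock.exists():
        before = file_digest(lock)
        # ... the CI step that runs terraform init would go here ...
        if lock_file_changed(lock, before):
            raise SystemExit("lock file was modified during CI")
```

When the lock file is committed, running git diff --exit-code on it after the build gives a similar result without any extra code.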
Examples of lock files:
- Terraform: .terraform.lock.hcl
- Python: Pipfile.lock using Pipenv
- Node/Yarn: yarn.lock
- Go: hashsummed and versioned by default with go.sum and go.mod
Pulling dependencies at runtime
As much as you can, avoid pulling dependencies at runtime. Examples include an EC2 user-data script that installs Docker, or an npm install that runs at startup on a virtual machine. Bake your dependencies into your deployable artifact, and version your artifacts, so that you always have a deployable system and can catch dependency problems at build time rather than at startup.
One way to prevent this is with network policies in your compute environment that prevent access to code distribution mechanisms (or you could even lock down all outbound access except for allowed connections).
Scanning for vulnerabilities
GitHub and GitLab both have features that will scan your repositories and suggest updates based on security vulnerability databases. You can also take advantage of other commercial services such as Twistlock, or open source solutions such as Grype. These can integrate into your build process, and/or be run on a schedule to catch new reports as they occur.
Updating dependencies
You should update your dependencies, but do it with a discrete commit to source control so that you can track the changes, get a versioned artifact, and catch any issues before they become problems in production.
Conclusion
This post was originally titled ":latest literally kills puppies". My editor/wife thought that was a little extreme. I took her advice, but added this note indicating my personal feelings on the subject. You can draw your own conclusions.