> the Dockerfile has continued because of its flexibility
I wish we had standardized on something other than shell commands, though. Puppet or terraform or something more declarative would have been such a better alternative to “everyone cargo cults ‘RUN apt-get upgrade’ onto the top of their dockerfiles”.
Like, the layer/stage/caching behavior is fine. I just wish the actual execution parts had been standardized using something at a higher level of abstraction than shell.
> Puppet or terraform or something more declarative would have been such a better alternative
Until you need to do something that isn't covered with its DSL, and you extend it with an external command execution declaration... At which point people will just write bash scripts anyway and use your declarative language as a glorified exec.
If you have 90-95% of everyone's needs (installing packages, compiling, putting files in place) covered by your DSL, and it has strong consistency and declarativeness, it's not that big of a problem if you need an escape hatch from time to time. Terraform, Puppet, Ansible, and SaltStack show this pretty well: the vast majority of what's written in them that isn't shelling out to bash is better and more maintainable than the pure-bash equivalent would be.
The problem is, ironically, that each DSL has its own execution platform, and is not designed for testability. Bash scripts may be hard to maintain, but at least you can write tests for them.
In Azure YAML I had an odd bug because I used succeeded() instead of not(failed()) as a condition. I had no way of testing the pipeline without executing it. And each DSL has its own special set of sharp edges.
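For anyone curious, a sketch of why that distinction bites (step contents are hypothetical): in Azure Pipelines conditions, `succeeded()` is false when the run was canceled, while `not(failed())` is still true, so the two guards diverge exactly on canceled runs.

```yaml
# Hypothetical Azure Pipelines snippet.
# succeeded()   -> true only if previous steps all succeeded
# not(failed()) -> also true when the run was *canceled*
steps:
  - script: echo "runs only on full success"
    condition: succeeded()
  - script: echo "also runs when the pipeline was canceled"
    condition: not(failed())
```

And the only way to observe the difference is to cancel a real run, which is exactly the testability problem.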
However, Dockerfiles are so popular because they run shell commands and permit 'socially' extending someone else's shell commands; tacking commands onto the end of someone else's shell script is a natural process. /bin/sh is unreasonably effective at doing anything you need to a filesystem, and if the shell exposes a feature, it has probably been used in a Dockerfile somewhere.
Every other solution, especially the declarative ones, tends to come up short when _layering_ images quickly and easily. That said, I agree they're good if you control the entire declarative spec.
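A sketch of that "social extension" pattern (image and file names are made up): you inherit someone else's shell-built image and just keep appending commands.

```dockerfile
# Hypothetical example: extend someone else's image by tacking on commands.
FROM someuser/their-app:1.2.3

# Their layers already did the heavy lifting; we just keep going
# with whatever /bin/sh can do to the filesystem.
RUN sed -i 's/^LogLevel .*/LogLevel debug/' /etc/their-app/config \
 && mkdir -p /var/lib/their-app/cache
COPY our-extra-config.conf /etc/their-app/conf.d/
```

A declarative tool would need to understand the upstream image's model to do the same thing; shell just needs the filesystem.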
I'd say LLB is the "standard", and Dockerfile is just one of the human-friendly frontends; you can always make one yourself or use an alternative. For example, Dagger uses BuildKit directly for building its containers instead of going through a Dockerfile.
Give https://github.com/project-dalec/dalec a look.
It's more declarative, with explicit abstractions for packages, caching, language-level integrations, hermetic builds, source packages, system packages, and minimal containers.
It's a BuildKit frontend, so you still use "docker build".
Bash is pretty darn abstracted from the OS, though. Puppet vs Bash is more about abstraction relative to the goal.
If your dockerfile says “ensure package X is installed at version Y”, that's a lot clearer (and also easier to make performant/cached and deterministic) than “apt-get update; apt-get install $transitive-dep-at-specific-version; apt-get install $the-thing-you-need-at-specific-version”. I'm not thrilled at how distro-locked the shell version makes you, or at how easy it is for accidental transitive changes to occur.
But neither of those approaches is at a particularly low abstraction level relative to the OS itself; files and system calls are more or less hidden away in both package-manager-via-bash and puppet/terraform/whatever.
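Concretely, doing the pinning through the shell looks something like this (package names and versions are placeholders); apt-get will happily resolve any *unpinned* transitive deps to whatever is current at build time:

```dockerfile
# Hypothetical sketch: per-package version pinning via apt.
# Every dep you don't pin explicitly can drift between rebuilds.
RUN apt-get update \
 && apt-get install -y --no-install-recommends \
      libtransitive-dep1=1.4.2-1 \
      the-thing-you-need=2.0.5-3
```

A declarative "package X at version Y" primitive could own that transitive resolution and cache it properly, instead of leaving it to whatever the mirror serves that day.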
Dockerfile has the flexibility to do what you want though, no? Use a base image with terraform or puppet or opentofu or whatever pre-installed, then your Dockerfile can just run the right command to apply some declarative config file from the build context.
And if you want something weird that's not supported by your particular tool of choice, you have the escape hatch of running arbitrary commands in the Dockerfile.
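A minimal sketch of that pattern (the base image name and playbook path are assumptions, not a recommendation):

```dockerfile
# Hypothetical: drive a declarative tool from inside a Dockerfile build.
FROM alpine/ansible
COPY playbook.yml /build/playbook.yml
# The declarative tool does the bulk of the work in one step...
RUN ansible-playbook --connection=local -i localhost, /build/playbook.yml
# ...and arbitrary shell remains available as the escape hatch.
RUN echo "one-off tweak the tool doesn't cover" >> /etc/motd
```
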
The loose integration between the declarative tools and the container build system drags down performance and creates a lot of footguns around image size and inert transitive deps of the declarative build system left lying around, I've found.
Why would terraform leave transitive deps around? To my knowledge, Docker doesn't record a log of the IO syscalls performed by a RUN directive; the layer just reflects the net changes it makes. It uses overlayfs, doesn't it? If you create a temporary file and then delete it within the same layer, there's no trace that the temporary file ever existed in overlayfs, correct?
I'd get your worry if we were talking about splitting up a terraform config and running it across multiple RUN directives, but we're not, are we?
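For concreteness, a sketch of the two cases (file paths are hypothetical):

```dockerfile
# A file created and deleted within the same RUN leaves no trace in
# the committed layer: only the net diff is recorded.
RUN touch /tmp/scratch.bin && rm /tmp/scratch.bin   # diff: effectively empty

# Across two RUNs, the file lands in the first layer and the second
# only adds a whiteout entry on top, so the image still carries it:
RUN dd if=/dev/zero of=/tmp/big.bin bs=1M count=100
RUN rm /tmp/big.bin
```
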
Random examples off the top of my head: Puppet has a ton of transitive Ruby libraries and config files/caches that it leaves around; Terraform leaves some very big provider caches on the system; plan or output files, if generated and not cleaned up, can contain secrets; even the “control group” of the status quo with RUN instructions often results in package manager indexes and caches being left in images.
Those are all technically user error (hence why I called them footguns rather than defects), but they add up and are easy mistakes to make.
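The standard mitigation for the "control group" case is to clean up in the same RUN that created the cache (a sketch; exact cache paths vary by distro):

```dockerfile
# Cleaning the apt index in the same RUN keeps it out of the layer.
RUN apt-get update \
 && apt-get install -y --no-install-recommends curl \
 && rm -rf /var/lib/apt/lists/*

# This variant does NOT help: the index is already baked into the
# first layer, and the later rm only adds a whiteout on top.
# RUN apt-get update && apt-get install -y curl
# RUN rm -rf /var/lib/apt/lists/*
```

The Puppet/Terraform equivalents (Ruby gems, provider caches, plan/output files) need the same in-layer cleanup, which is exactly the kind of thing that's easy to forget.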
Oof, not terraform please. If you use for_each and friends, dependency calculations break, because dependency resolution happens before the dynamic rules are expanded.
I'd get much better results if I used something else to do the for_each and gave terraform only static rules.
Do you mean that if you use a dynamic output in a for_each, Terraform can error? Or are you referring to “dynamic” blocks and their interactions with iterators?
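If it's the former, that's the plan-time limitation: for_each keys must be known at plan time, so keying one on an attribute that isn't known until apply fails the plan. A sketch, assuming the hashicorp/random and hashicorp/local providers:

```hcl
# Hypothetical: random_id.hex is unknown until apply, so the plan
# fails with "Invalid for_each argument" (values derived from
# resource attributes that cannot be determined until apply).
resource "random_id" "suffix" {
  count       = 3
  byte_length = 4
}

resource "local_file" "named" {
  for_each = toset(random_id.suffix[*].hex)
  filename = "${path.module}/${each.key}.txt"
  content  = "generated"
}
```

The usual workarounds are static keys or `-target`-ing the upstream resource first, both of which support the "give terraform only static rules" point.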