Ansible is YAML, but it's definitely imperative YAML - each YAML file is a list ...

roman_sf · on May 30, 2020

I agree. It is impressive how much it can orchestrate. It is also very useless in the real cloud because developers there are dealing with higher-level abstractions to solve problems for the business.

The most simplistic task - execute some code in response to an event in a bucket - makes Kubernetes with all its sophisticated convergence capabilities completely useless. And even if somebody figures this out and puts an open-source project on GitHub to do this on Kubernetes - it's just going to break at the slightest load.

Not to mention all the work to run Kubernetes at any acceptable level of security, or keep the cost down, do all the patching, scaling, logging, upgrades... Oh, the configuration management itself for Kubernetes? Ah sorry, I forgot, there are 17 great open-source projects for that :)

demosito666 · on May 30, 2020

> The most simplistic task - execute some code in response to an event in a bucket - makes kubernetes with all its sophisticated convergence capabilities completely useless.

That's because you're not thinking web^Wcloud scale. To execute some code in response to an event you need:

- several workers that will poll the source bucket for changes (of course you could've used an existing notification mechanism like AWS EventBridge, but that will couple your k8s to vendor-specific infra, so it kinda diminishes the point of k8s)

- a distributed message bus with a persistence layer. Kafka will work nicely because they say so on Medium, even though it's not designed for this use case

- a bunch of stateless consumers for the events

- don't forget that you'll need to write processing code with concurrency in mind because you're actually executing it in a truly distributed system at this point and you've made a poor choice for your messaging system
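
For a rough feel of the moving parts in that list, here is a minimal in-process sketch, assuming a thread-safe queue stands in for the message bus and plain sets stand in for bucket listings. All names (`poll_bucket`, `consume`, `run_pipeline`) are hypothetical, and a real deployment would need durable storage and a real broker. The part worth noticing is the duplicate check: consumers have to be idempotent because the same event can be delivered more than once.

```python
import queue
import threading


def poll_bucket(listing, seen, bus):
    """Publish one event per object key that earlier polls have not seen."""
    for key in sorted(listing):
        if key not in seen:
            seen.add(key)
            bus.put({"key": key})


def consume(bus, processed, lock, results):
    """Worker loop: handle each key once, even if it is delivered twice."""
    while True:
        event = bus.get()
        if event is None:  # sentinel: shut this worker down
            return
        with lock:
            if event["key"] in processed:
                continue  # duplicate delivery, already handled
            processed.add(event["key"])
            # Stand-in for the real processing step.
            results.append(event["key"].upper())


def run_pipeline(polls, redelivered=(), workers=3):
    """Run pollers and consumers over a list of bucket listings."""
    bus = queue.Queue()  # stand-in for the message bus
    seen, processed, results = set(), set(), []
    lock = threading.Lock()
    threads = [
        threading.Thread(target=consume, args=(bus, processed, lock, results))
        for _ in range(workers)
    ]
    for t in threads:
        t.start()
    for listing in polls:
        poll_bucket(listing, seen, bus)
    for key in redelivered:  # simulate at-least-once delivery
        bus.put({"key": key})
    for _ in threads:  # one shutdown sentinel per worker
        bus.put(None)
    for t in threads:
        t.join()
    return sorted(results)
```

Calling `run_pipeline([{"a.txt"}, {"a.txt", "b.txt"}], redelivered=["a.txt"])` processes each key exactly once despite the repeated poll and the redelivered event.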

roman_sf · on May 30, 2020

Wait, I can do all this with S3 and Lambda at any scale - for pennies :) It will probably take a few hours to set everything up with tools like stackery.io
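
As a hedged illustration of the S3 plus Lambda route, a Python handler for bucket notifications could look roughly like the sketch below. It assumes the standard S3 notification payload (a `Records` list with `s3.bucket.name` and `s3.object.key`); `process_object` is a hypothetical placeholder for the business logic, and wiring the bucket to the function is left to whatever deployment tool you use.

```python
from urllib.parse import unquote_plus


def process_object(bucket, key):
    """Hypothetical business logic; here it only builds a label."""
    return f"processed s3://{bucket}/{key}"


def handler(event, context):
    """Entry point with the (event, context) signature Lambda calls."""
    results = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 notifications.
        key = unquote_plus(record["s3"]["object"]["key"])
        results.append(process_object(bucket, key))
    return results
```

The handler can be exercised locally by passing a hand-built event dict and `None` as the context.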

So once again, what do developers need Kubernetes for? If the simplest problem becomes an unholy mess :)

fastball · on May 29, 2020

How does K8s know what order to do things in if there aren't steps? Because on a system, certain things obviously need to happen before other things.

_skel · on May 29, 2020

You can save your Kubernetes manifests in any order. Stuff that depends on other stuff just won't come up until the other stuff exists.

For example, I can declare a Pod that mounts a Secret. If the Secret does not exist, the Pod won't start -- but once I create the Secret the pod will start without requiring further manual intervention.

What Kubernetes really is, under the hood, is a bunch of controllers that are constantly comparing the desired state of the world with the actual state, and taking action if the actual state does not match.

The configuration model exposed to users is declarative. The eventual consistency model means you don't need to tell it what order things need to be done.
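
A toy model of that reconcile loop, to show why declaration order does not matter. This is only a sketch: `desired` maps hypothetical object names to the names they depend on, and `running` is the set of objects that currently exist. Real controllers watch the API server rather than looping over a dict.

```python
def reconcile(desired, running):
    """One pass: start every declared object whose dependencies exist."""
    started = []
    for name, spec in desired.items():
        if name in running:
            continue
        if all(dep in running for dep in spec["needs"]):
            running.add(name)
            started.append(name)
    return started


def converge(desired, running, max_passes=10):
    """Repeat reconcile passes until a pass changes nothing."""
    for _ in range(max_passes):
        if not reconcile(desired, running):
            break
    return running


# The Pod is declared first and its Secret does not exist yet.
desired = {"pod": {"needs": ["secret"]}}
running = converge(desired, set())  # Pod stays pending

# Declaring the Secret later is enough; nobody restarts the Pod by hand.
desired["secret"] = {"needs": []}
running = converge(desired, running)  # both objects now exist
```

The Pod entry comes before the Secret in the dict, and the loop still converges: the first pass creates the Secret, the next one starts the Pod.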

jfkebwjsbx · on May 30, 2020

That is Puppet, too. But Puppet was easy. K8s isn’t.

geofft · on May 29, 2020

A combination of things, mostly related to Kubernetes' scope and use case being different from Ansible/CFEngine/etc. Kubernetes actually runs your environment. Ansible/CFEngine/etc. set up an environment that runs somewhere else.

This is basically the benefit of "containerization" - it's not the containers themselves, it's the constraints they place on the problem space.

Kubernetes gives you limited tools for doing things to container images beyond running a single command - you can run initContainers and health checks, but the model is generally that you start a container from an image, run a command, and exit the container when the command exits. If you want the service to respawn, the whole container respawns. If you want to upgrade it, you delete the container and make a new one, you don't upgrade it in place.

If you want to, say, run a three-node database cluster, an Ansible playbook is likely to go to each machine, configure some apt sources, install a package, copy some auth keys around, create some firewall rules, start up the first database in initialization mode if it's a new deployment, connect the rest of the databases, etc. You can't take this approach in Kubernetes. Your software comes in via a Docker image, which is generated from an imperative Dockerfile (or whatever tool you like), but that happens ahead of time, outside of your running infrastructure. You can't (or shouldn't, at least) download and install software when the container starts up.

You also can't control the order when the containers start up - each DB process must be capable of syncing up with whichever DB instances happen to be running when it starts up. You can have a "controller" (https://kubernetes.io/docs/concepts/architecture/controller/) if you want loops, but a controller isn't really set up to be fully imperative, either. It gets to say, I want to go from here to point B, but it doesn't get much control of the steps to get there. And it has to be able to account for things like one database server disappearing at a random time. It can tell Kubernetes how point B looks different from point A, but that's it.

And since Kubernetes only runs containers, and containers abstract over machines (physical or virtual), it gets to insist that every time it runs some command, it runs in a fresh container. You don't have to have any logic for, how do I handle running the database if a previous version of the database was installed. It's not - you build a new fresh Docker image, and you run the database command in a container from that image. If the command exits, the container goes away, and Kubernetes starts a new container with another attempt to run that command. It can do that because it's not managing systems you provide it, it's managing containers that it creates. If you need to incrementally migrate your data from DB version 1 to 1.1, you can start up some fresh containers running version 1.1, wait for the data to sync, and then shut down version 1 - no in-place upgrades like you'd be tempted to do on full machines.

And yeah, for databases, you need to keep track of persistent storage, but that's explicitly specified in your config. You don't have any problems with configuration drift (a serious problem with large-scale Ansible/CFEngine/etc.) because there's nothing that's unexpectedly stateful. Everything is fully determined by what's specified in the latest version of your manifest because there's no other input to the system beyond that.

Again, the tradeoff is that this places quite a few constraints on your system design. They're all constraints that are long-term better if you're running at a large enough scale, but it's not clear the benefits are worth it for very small projects. I prefer running three-node database clusters on stateful machines, for instance - but the stateless web applications on top can certainly live in Kubernetes, there's no sense caring about "oh we used to run a2enmod but our current playbook doesn't run a2dismod so half our machines have this module by mistake" or whatever.
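
The a2enmod situation can be modelled in a few lines. This is a sketch with made-up module names: `apply_playbook` mutates whatever state a machine already has, while `build_from_manifest` derives the state from the manifest alone, the way a fresh container image does.

```python
def apply_playbook(machine_modules, playbook):
    """Imperative: mutate whatever state the machine already has."""
    for action, module in playbook:
        if action == "enable":
            machine_modules.add(module)
        elif action == "disable":
            machine_modules.discard(module)
    return machine_modules


def build_from_manifest(manifest):
    """Declarative rebuild: state comes from the manifest and nothing else."""
    return set(manifest["modules"])


old_playbook = [("enable", "rewrite"), ("enable", "ssl")]
new_playbook = [("enable", "ssl")]  # "rewrite" dropped, never disabled

old_machine = apply_playbook(set(), old_playbook)
old_machine = apply_playbook(old_machine, new_playbook)
new_machine = apply_playbook(set(), new_playbook)

# Same playbook, different machines: that difference is the drift.
drift = old_machine - new_machine

# Rebuilding from the manifest cannot drift; history is not an input.
fresh = build_from_manifest({"modules": ["ssl"]})
```

Here `drift` ends up holding the leftover `rewrite` module on the older machine, while `fresh` matches the newly provisioned one.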

closeparen · on May 30, 2020

It is common to have significant logic and complexity in the configuration management manifests, but I'd argue that it's possible to move most of that to packaging and have your configuration management just be "package state latest, service state restarted."