Despite its reputation, Kubernetes is actually quite easy to master for simple use cases. And affordable enough for more complex ones.
The fundamental abstractions are as simple as they can be, representing concepts you'd already be familiar with in a datacenter environment. A cluster has nodes (machines), and you can run multiple pods (the smallest deployable unit on the cluster) on each node. A pod runs various types of workloads such as web services, daemons, jobs and recurring jobs, which are made available to the cluster as Docker/container images. You can attach various types of storage to pods, front your services with load balancers, etc.
All of the nouns in the previous paragraph are available as building blocks in Kubernetes. You build your complex system declaratively from these simpler parts.
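For example, a small web service needs just two of those nouns - a Deployment (which keeps the pods running) and a Service (which fronts them with a load balancer). A minimal sketch, with hypothetical names and image:

    # Hypothetical minimal example: two replicas of a web container,
    # exposed through a cloud load balancer.
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: hello-web
    spec:
      replicas: 2
      selector:
        matchLabels:
          app: hello-web
      template:
        metadata:
          labels:
            app: hello-web
        spec:
          containers:
            - name: web
              image: nginx:1.25        # any container image
              ports:
                - containerPort: 80
    ---
    apiVersion: v1
    kind: Service
    metadata:
      name: hello-web
    spec:
      type: LoadBalancer               # fronts the pods with a load balancer
      selector:
        app: hello-web
      ports:
        - port: 80
          targetPort: 80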
When I look at Nomad's documentation, I see complexity. It mixes these basic blocks with those from HashiCorp's product suite.
I couldn't disagree with this more. I manage and run Kubernetes clusters as part of my job, and I can tell you that installing, configuring and running these clusters is no small feat. I don't know much about Nomad, but I would strongly caution most users against thinking K8s is simple by any standard. Multiple cloud providers now provide ways to run your code, or your containers, directly. Unless you have a reason to manage that complexity, save yourself some heartburn and make your developers happy: they just need to worry about running their code, not about what happens when a seemingly innocuous change causes a cascading failure.
This has a great collection of war stories - https://k8s.af/
Honestly, if you use a managed Kubernetes provider it's pretty simple once you nail the concepts. You'll get hit with cloud provider issues every now and then, but it's really not terrible.
I'd manage services in Kubernetes before I start deploying VMs again, that's for sure.
Sure, if you pay someone else to keep k8s alive it's not so bad, but that's expensive. You generally need a full-time team of people to keep a k8s deployment alive if you are running it yourself.
I keep Nomad alive part-time. It's operationally simple and easy to wrap one's head around and understand how it works.
A lot of the cloud providers don't bill directly for Kubernetes management; instead you just pay for the node resources.
Either way, as another comment points out, Rancher and many other solutions make creating your own Kubernetes cluster really boring.
We run a few Kubernetes clusters on premise, and for the longest time it was just one person running them. We even have other teams in QA/Engineering running their own with it.
RKE2 is the closest thing we've found to a turnkey, on-prem, production-ready Kubernetes distribution. Though we were last looking about a year ago, so I concede there might be better or comparable options now.
Almost everything is pretty simple "once you nail the concepts". It's getting to the point where you have the concepts nailed that is the measuring stick for complexity.
> Despite its reputation, Kubernetes is actually quite easy to master for simple use cases.
This has so many assumptions rolled up in it. When we move from niche assumptions to the masses of app operators and cluster operators (two different roles) it tends to fall apart.
Which ultimately leads to things like...
> This has some great account of war stories - https://k8s.af/
We shouldn't assume other people are like us. It's worth understanding who folks are. Who is that ops person who wants to run Kubernetes at that insurance company or bank? Who is the person setting up apps to run for that auto company? What do they know, need, and expect?
When I answer these questions I find someone far different from myself and often someone who finds Kubernetes to be complicated.
I also maintained several kubernetes clusters and never found it very difficult. We never had any major outages, and the only downtime was trying new features.
What's the scale of your clusters? If you're running anywhere between 5 and 20 nodes with 10 pods on each node, I think that's a very small deployment and you'll be able to breeze by with most of the defaults. You still need to configure the following though - logs, certificates, authentication, authorization, OS upgrades, Kubernetes upgrades, etc.
I'm not sure all of this is worth it if you're running a small footprint. You're better off using the more managed solutions available today.
I dunno man, k8s is pretty simple, but only if you make it so. I have built (early on) complicated setups with all kinds of functionality, but the complexity was rarely used. I now manage clusters, and deployments on those clusters, with code - Terraform, to be exact (Tcl expect for instances when Terraform can't hang yet). I built a GUI interface for Terraform to manage deployments of Kubernetes on various clouds and bare metal, and to manage the deployments to Kubernetes.
Maybe one day this will get too complex and we'll do something else but so far so good.
From a system architecture perspective, kubernetes is very complex since it handles a multitude of complexities, but that's why it _can_ be very simple from a user's perspective. Most of the complexity can be ignored (until you want it).
It's the same reason I still like to use postgres when I can versus NoSQL until I know I need the one feature I may not be able to achieve with postgres: automatic sharding for massive global scale. The rest of the features postgres (and friends) give easily (ACID etc) are very tricky to get right in a distributed system.
It's also the same reason bash is great for tasks and small automated programs, but kind of terrible for more complex needs. The primitives are super simple, easy to grok, and crazy productive at that scale, but other languages give tools to more easily handle the complexities that start to emerge when things get more complicated.
> From a system architecture perspective, kubernetes is very complex since it handles a multitude of complexities, but that's why it _can_ be very simple from a user's perspective.
I’ve been doing software for 20 years in one form or another. One of the things I’ve learned is that the simpler and more polished something seems to the user, the more complexity there almost always is under the covers to make it that way.
Making something that handles the 80% is easy. Every step closer to 100% becomes non-linear in a hurry. All that polish and ease of use took months of “fit & finish” timeboxes. It took tons of input from UX and product. It involved making very hard decisions so you, the user, don’t have to.
A good example is TurboTax online. People love to hate on their business practices (for good cause) but their UX handles like 98% of your common tax scenarios in an incredibly easy to use, highly polished way. Robinhood does a pretty good job too, in my opinion—there is a lot of polish in that app that abstracts away some pretty complex stuff.
It doesn't take long working with Nomad to hit the cases where you need to augment it. Now I know some people enjoy being able to plug and play the various layers that you get in the complicated, kitchen-sink Kubernetes.
We already had something that just ran containers, and that was Mesos. Their opinion was that all the stuff like service discovery could be implemented and handled by other services like Marathon. But it did not work well. Most people that want to deploy containers in an orchestrated manner want service discovery.
At least these parts are being provided by the same company (Hashicorp) so it probably won't suffer the same lack of coordination between separate projects that Mesos did.
The benefit of the kitchen-sink, opinionated framework that K8s provides is that your deployments and descriptors are not very difficult to understand and can be shared widely. I do not think the comparison of massively sharded NoSQL to Postgres holds, because most people will not need massive sharding, but almost everyone is going to need the service discovery and other things, like secrets management, that K8s provides.
One of the things that I don't like about Nomad is HCL. It is a language that is mainly limited to HashiCorp tools, and there's no wider adoption outside of them, at least not to my knowledge.
From the documentation:
> Nomad HCL is parsed in the command line and sent to Nomad in JSON format via the HTTP API.
So why not just JSON? Or why JSON at all and not MsgPack? Or why not straight-up HCL, given that it's introduced over and over as being machine readable and human friendly at the same time?
I've only used Terraform, but I absolutely love HCL as a configuration language. I know I'm in the minority about this, but it's so much less fiddly and easier to read than json or yaml. I do wish there were more things that used it.
JSON is fine for serialization, but I hate typing it out. There are too many quotes and commas - all keys and values have to be quoted. The last item in a list or object can't have a trailing comma, which makes re-ordering items a pain. Comments aren't supported (IMO the biggest issue).
YAML is too whitespace dependent. This is fiddly and makes copy-pasting a pain. I'm also not sure how it affects serialization, like if you want to send yaml over the network, do you also have to send all of the whitespace? That sounds like a pain. OTOH I like that quotes are usually optional and that all valid json is also valid YAML. Comments are a godsend.
HCL has the same basic structure of objects and lists, but it's much more user friendly. Keys don't need to be quoted. Commas aren't usually needed unless you compress items onto the same line. It supports functions, list/object comprehensions, interpolations, and variable references which all lead to more powerful and DRY configuration. Granted I'm not sure if these are specific to TF's HCL implementation, but I hope not.
For serialization, HCL doesn't have any advantage over JSON. Sure, it's machine-readable, but it's probably much harder to write code that works on HCL specifically than to convert it to JSON and use one of the zillions of JSON libraries out there.
JSON was designed for machine readability, HCL was designed for human readability.
HCL requires a lot more code to parse and many more resources to keep in memory vs JSON. I think it completely makes sense to do it this way. K8s is the same. On the server it does everything in JSON. Your YAML gets converted prior to sending to K8s.
To my understanding, you can write most (all?) of the job/config files in JSON if you wish. At my company, we have a ton of HCL files because in the beginning it was easier to hand-write them that way, but we're now getting to the point where we're going to be templating them and going to switch to JSON. In other words, I believe HCL is optional.
The important difference with k8s, from my experience, is that from the very early days it modeled a common IPC for any future interfaces, even if TPRs/CRDs took some time to hash out. This means that any extra service can simply be added and then used with the same general approach as all other resources.
This means you get to build upon what you already have, instead of doing everything from scratch again because your new infrastructure-layer service needs to integrate with 5 different APIs that have slightly different structure.
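As a sketch of what that common approach looks like: once you register a CRD for, say, a "Database" resource, instances of it are declared and managed exactly like the built-in resources. The group, names and fields below are made up for illustration:

    # Hypothetical CRD: teaches the API server about a new "Database" resource.
    apiVersion: apiextensions.k8s.io/v1
    kind: CustomResourceDefinition
    metadata:
      name: databases.example.com
    spec:
      group: example.com
      scope: Namespaced
      names:
        plural: databases
        singular: database
        kind: Database
      versions:
        - name: v1
          served: true
          storage: true
          schema:
            openAPIV3Schema:
              type: object
              properties:
                spec:
                  type: object
                  properties:
                    engine:
                      type: string
                    storageGB:
                      type: integer
    ---
    # An instance of the new resource - same declarative shape as a Deployment or Service.
    apiVersion: example.com/v1
    kind: Database
    metadata:
      name: orders-db
    spec:
      engine: postgres
      storageGB: 50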
Probably less calculated and more "that's what's available to offer stably right now that we can feasibly deliver, so that's what we'll do." Distributed RDBMS were not exactly cheap or common in open source a couple decades back. I don't think there was much of a choice to make.
I mean it is a trade off though. You cannot beat the speed of light. The further apart your database servers are, the more lag you get between them.
If you want a transactional, consistent datastore you are gonna have to put a lock on something while writes happen. And if you want consistency it means those locks need to be on all systems in the cluster. And the entire cluster needs to hold that lock until the transaction ends. If your DBs are 100ms apart, that is a pretty large, non-negotiable overhead on all transactions.
If you toss out being fully consistent as a requirement, things get much easier in replication-land. In that case you just fucking write locally and let that change propagate out. The complexity then becomes sorting out what happens when writes to the same record on different nodes conflict… but that is a solvable problem. There will be trade offs in the solution, but it isn’t going against the laws of physics.
> If you want a transactional, consistent datastore you are gonna have to put a lock on something while writes happen. And if you want consistency it means those locks need to be on all systems in the cluster.
FWIW, it's not as bad as that sounds. There are traditional locks, and there is optimistic locking. If there are two conflicting transactions, a traditional lock detects this before it happens (by insisting a lock is obtained before any updates are done), and if there is any chance of conflict the updates are run serially (meaning one is stopped while the other runs).
Optimistic locks let updates run without locking or blocking at all, but then at the end they check whether the data they depended on (i.e., data that would have been locked by the traditional mechanism) has changed. If it has, they throw it all away. (Well, perhaps not quite - they may apply one of the conflicting updates to ensure forward progress is made.) The upside is that if there are no conflicting updates, everything runs at full speed, because there is no expensive communication about who holds what lock. The downside is a lot of work may be thrown away by what amounts to speculative execution.
Most monolithic databases use traditional locking. Two CPUs in the same data centre (or more likely on the same board) can rapidly decide who owns what lock, but cycles and I/O on a high-end server are precious. Distributed ACID databases like Spanner, CockroachDB and YugabyteDB favour optimistic locking, because sending messages halfway across the planet to decide who owns what lock before allowing things to proceed takes a lot of time, whereas the CPU cycles and I/Os on the low-end replicated hardware are cheap.
While optimistic locks allow an almost unlimited number of non-conflicting updates to happen concurrently, their clients still have to pay a time penalty. The decision about whether there was a conflicting update still has to be made, it still requires packets to cross the planet, and while all this happens the client can't be sure its data has been committed. But unlike the traditional model, they are never blocked by what any other client is doing - provided it doesn't conflict.
Yes, but my point was there wasn't really a choice to make at that time, therefore no trade off.
Even if I won $100 in the lotto today and had the money in hand, I wouldn't describe my choice of which house I bought years ago as a calculated trade off between what I bought and some $10 million mansion. That wasn't a feasible choice at that time. Neither was making a distributed RDBMS as an open source project decades ago, IMO.
MySQL replication isn't really what I would consider a distributed RDBMS in the sense we're talking about, but it is in some senses. The main distinction being that you can't actually use it as a regular SQL interface. You have to have a primary/secondary and a secondary can't accept writes (if you did dual primary you had to be very careful about updates), etc. Mainly that you had to put rules and procedures in place for how it was used in your particular environment to allow for sharding or multiple masters, etc, because the underlying SQL system wasn't deterministic otherwise (also, the only replication available then was statement based replication, IIRC).
More closely matching would be MySQL's NDB clustered storage engine, which was released in late 2004.[1] Given that Postgres and MySQL both started about 1996, that's quite a while after the initial releases.
I spent a while in the early to mid 2000s researching and implementing dual master/slave or DRBD-backed MySQL HA systems as a consultant, and the options available were very limited from what I remember. There's also probably a lot better tooling these days for developers to make use of separated read/write environments, whereas it seemed fairly limited back then.
By the time you need massive sharding with Postgres, you've got a lot of options. I'd assume by the time you get there, you can probably budget for them pretty easily as well.
Citus is specifically designed for it.
Timescale appears to have the use case covered too.
Ultimately though, sharding is one of the easier ways to scale a database out. NoSQL just does it by eliminating joins. You can do that pretty easily with almost any solution, PG or otherwise.
The comparison is a bit misleading. As someone that has used both nomad and k8s at scale --
- Nomad is a scheduler. Clean and focused. It is very fast. I was an early user and encountered a number of bumps, but that's software. The people at Hashicorp are super sharp and lovely.
- K8s is a lot more. It includes a scheduler, but in the simplest sense, it is a complete control-plane based on the control-loop pattern. You have an API, a scheduler, a db, various controllers, etc. Forget for a moment that most people use it to orchestrate containers -- it's really designed to orchestrate anything. Its API is extensible, you can add and compose controllers -- there are many possibilities once you wrap your head around it.
This is all opinionated and includes a lot of capability. It's just very different.
You can stitch together nomad, consul, vault, and various glue to create a container orchestration system... but when you start wanting to manage the control-planes as though they are the "kind" (the container for example) with meta-control-planes, and you start wanting to orchestrate network, storage, and other dependencies... all while doing this in a multi-tenant environment, then things get interesting.
K8s itself is divided into multiple parts, which you can customize to your own liking, and you can swap parts out if you'd like as long as the APIs are similar.
kube-apiserver and kube-controller-manager are two parts for which I’m not aware of any other implementations, but a) they are kind of the heart of k8s and b) they can be easily extended with CRDs/operators.
This is so far from the truth, I'm having a hard time imagining why you even think this. Kubernetes is practically the distributed embodiment of the Unix philosophy. You have a core set of interfaces and components that need to offer a particular API, and other than that, whether it is one program or many, written by one developer or hundreds, by a private company or via volunteers contributing to open source projects, is totally up to how you want to do it. You're free to use the original reference implementation that used to be owned by Google a decade ago before they open sourced it and donated it to the CNCF, but you certainly don't have to. Others have mentioned k3s, which is the busybox to the reference Kubernetes' GNU coreutils: all of Kubernetes, plus ingress and network overlay, in a single binary, with an embedded sqlite db as the backing store instead of etcd. But k3s is still "Kubernetes." Kubernetes is a standard, much like POSIX. It's maybe unfortunate that the original reference implementation is also named "kubernetes," because a lot of people seem to think that one is the only one you can use, and it has historically been complex to set up, but the reason for the complexity is that it doesn't make any choices for you.
Imagine if you wanted to use a Unix operating system, but instead of choosing a Linux distro, you just read the POSIX standard and went out and found every required utility, plus a kernel, and had to figure out on your own how to get those to work together and create a system that can run application-level software. If you just go to kubernetes.io and follow the instructions on how to get up and running with the reference implementation, that is what you're doing. It makes no decisions at all for you. You can run external etcd, or use kubeadm to set it up for you. You can run it HA or on a single node. You can add whatever overlay network you want. You can use whatever container runtime engine you want. You can use whatever ingress controller you want, or none at all, and not have any external networking, just as you can install Linux From Scratch and not even bother to include networking if you want a disconnected system for some reason.
You have pretty much complete user freedom, and that is, in fact, the source and reason for a whole lot of complaints. Application developers and even most system administrators don't want to have to make that many decisions before they can even get to hello world. I believe Kelsey Hightower commented on this a while back, saying something to the effect that Kubernetes is not meant to be a developer platform. It's a framework for creating platforms.
Application developers, startups, and small businesses should almost never be using Kubernetes directly unless they're actually developing a platform product. Whether you use a "distro" like RKE2 or k3s or a managed service from a cloud provider, building out your own cluster using the reference Kubernetes is the modern-day equivalent of deploying a LAMP stack but doing it on top of Linux From Scratch.
> Despite its reputation, Kubernetes is actually quite easy to master for simple use cases. And affordable enough for more complex ones.
Are you referring to actually spinning up and operating your own clusters here or utilizing managed k8s (e.g. GKE/EKS)?
In my understanding, sure - using the managed services and deploying simple use cases on it might not be that big of a deal, but running and maintaining your own k8s cluster is likely far more of a challenge [1] than Nomad as I understand it.
Kubernetes The Hard Way is an educational resource, meant to serve those interested in deep diving into the platform, similar to Linux From Scratch. As it says right on the label:
> Kubernetes The Hard Way is optimized for learning, which means taking the long route to ensure you understand each task required to bootstrap a Kubernetes cluster.
> The results of this tutorial should not be viewed as production ready, and may receive limited support from the community, but don't let that stop you from learning!
If you want to spin up a cluster that you actually want to use, you'll pick one of the many available free or paid distros, and spinning up something like k3s with Rancher, microk8s, or even the pretty vanilla option of kubeadm is pretty simple.
Agreed. Compared with k3s and friends, Nomad is more complicated and requires other components (Consul, some kind of ingress) to match what you get out of the box.
Running a production ready, high availability Kubernetes cluster with proper authn, authz, and resource controls is about the farthest thing from "simple" that I can imagine.
Profoundly disagree. In fact, your statements are not true, since before you can run a pod you will most probably need to create a deployment. And then you'll need a service to expose your workload.
Anything but the most trivial workloads will also lead you into questions of how to mount volumes, configure and use configmaps and secrets, etc.
And that's not even touching the cluster configuration, which you can skip over if you are using a cloud provider that can provision it for you.
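To illustrate the kind of extra wiring I mean: even handing a container a single config file involves another object type plus the volume/volumeMount plumbing between them. A hypothetical sketch:

    # Hypothetical sketch: a config file delivered to a pod via a ConfigMap volume.
    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: app-config
    data:
      app.properties: |
        greeting=hello
        log_level=info
    ---
    apiVersion: v1
    kind: Pod
    metadata:
      name: app
    spec:
      containers:
        - name: app
          image: example/app:1.0      # made-up image
          volumeMounts:
            - name: config            # must match the volume name below
              mountPath: /etc/app
      volumes:
        - name: config
          configMap:
            name: app-config          # must match the ConfigMap above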
They're both complex. But one of them has 10 times the components of the other, and requires you to use them. One of them is very difficult to install - so much so that there are a dozen different projects intended just to get it running - while the other is a single binary. And while one of them is built around containers (and all of the complexity that comes with interacting with them / between them), the other one doesn't have to use containers at all.
> But one of them has 10 times the components of the other
I've said this before. Kubernetes gives you a lot more too. For example, in Nomad you don't have secrets management, so you need to set up Vault. Both Nomad and Vault need Consul for Enterprise setups, and Vault alone needs 2 Consul clusters. So now you have 3 separate Consul clusters, a Vault cluster, and a Nomad cluster. So what did you gain really?
Kubernetes' secrets management is nominal at best. It's basically just another data type that has K8S' standard ACL management around it. With K8S, the cluster admin has access to everything, including secrets objects. It's not encrypted at rest by default, and putting all the eggs in one basket (namely, etcd) means they're mixed in with all other control plane data. Most security practitioners believe secrets should be stored in a separate system, encrypted at rest, with strong auditing, authorization, and authentication mechanisms.
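If you've never looked at one, a Secret manifest makes the point - the values under data are just base64-encoded, not encrypted (hypothetical example):

    # Hypothetical Secret: "data" values are only base64-encoded, not encrypted.
    # echo -n 's3cr3t' | base64  ->  czNjcjN0
    apiVersion: v1
    kind: Secret
    metadata:
      name: db-credentials
    type: Opaque
    data:
      username: YWRtaW4=        # "admin"
      password: czNjcjN0        # "s3cr3t"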
It's "good enough" for most and extension points allow for filling the gaps.
This also dodges the crux of GP's argument -- instead of running 1 cluster with 10 components, you now need a half dozen clusters with 1 component each, but oops they all need to talk to each other with all the same fun TLS/authn/authz setup as k8s components.
I'm a little confused. Why does the problem with K8S secrets necessitate having multiple clusters? One could take advantage of a more secure secrets system instead, such as Hashicorp Vault or AWS Secrets Manager.
The point is that once you're talking about comparable setups, you need all of Vault/Nomad/Consul and the complexity of the setup is much more than just "one binary" as hashi likes to put it.
> So now you have 3 separate Consul clusters, a Vault cluster, and a Nomad cluster. So what did you gain really?
GP's point was already talking about running Vault clusters, not sure you realized we aren't only talking about nomad.
The only thing I was trying to say is that although K8S offers secrets "for free," it's not best practice to consider the control plane to be a secure secrets store.
That's false. Vault has integrated storage and no longer needs Consul.
If you want to have the Enterprise versions (which aren't required), you just need 1 each of Nomad, Consul, Vault. Considering many people use Vault with Kubernetes anyway (due to the joke that is Kubernetes "secrets"), and Consul provides some very nice features and is quite popular itself, that's okay IMHO. Unix philosophy and all.
This is just false. I've run Vault in an Enterprise, and unless something has changed in the last 12 months, Hashicorp's recommendation for Vault has been 1 Consul cluster for Vault's data store, and 1 for its (and other applications') service discovery.
Sure, Kubernetes's secrets are a joke by default, but they're easily substituted by something that one actually considers a secret store.
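For example, one common way to do that substitution - assuming the Vault agent injector is installed in the cluster - is a couple of pod annotations; the role name and secret path here are made up:

    # Hypothetical sketch using Vault's agent injector: the pod never touches
    # Kubernetes Secrets; a sidecar fetches credentials from Vault instead.
    apiVersion: v1
    kind: Pod
    metadata:
      name: app
      annotations:
        vault.hashicorp.com/agent-inject: "true"
        vault.hashicorp.com/role: "app"
        vault.hashicorp.com/agent-inject-secret-db-creds: "secret/data/app/db"
    spec:
      containers:
        - name: app
          image: example/app:1.0   # made-up image; reads /vault/secrets/db-creds at startup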
It's new, but I think it's quickly becoming preferred. I found trying to set up nomad/consul/vault as described in the hashi docs creates some circular dependencies tbh (e.g. the steps to set up Nomad reference a Consul setup, the steps for Vault mention Nomad integration, but there's no clear path outside the dev server examples of getting there without reading ALL the docs/references). There's little good documentation on bootstrapping everything in one shot from scratch the way most Kubernetes bootstrapping tools do.
Setting up an HA Vault/Consul/Nomad setup from scratch isn't crazy, but I'd say it's comparable level to bootstrapping k8s in many ways.
Cool, so that's certainly new. But even then, you're dealing with the Raft protocol. The difference is it's built into Nomad, compared to Kubernetes where it's a separate service. I just don't see Nomad and Co being that much easier to run, if at all.
I think Nomad's biggest selling point is that it can run more than just containers. I'm still not convinced it's much better. At best it's equal.
> you're dealing with the Raft protocol. The difference is it's built into Nomad, compared to Kubernetes where it's a separate service
I don't really follow this. etcd uses raft for consensus, yes, and it's built in. Kubernetes components don't use raft across independent services. Etcd is the only component that requires consensus through raft. In hashi stack, vault and nomad (at least) both require consensus through raft. So the effect is much bigger in that sense.
> I think Nomad's biggest selling point is that it can run more than just containers. I'm still not convinced it's much better. At best it's equal.
Totally agree. The driver model was very forward looking compared to k8s. CRDs help, but it's putting a square peg in a round hole when you want to swap out Pods/containers.
It's not that circular - you start with Consul, add Vault and then Nomad, clustering them through Consul and configuring Nomad to use Vault and Consul for secrets and KV/SD respectively. And of course it can be done incrementally ( you can deploy Nomad without pointing it to Consul or Vault, and just adding that configuration later).
I don't mean a literal circular dependency. I mean the documentation doesn't clearly articulate how to get to having all 3x in a production ready configuration without bouncing around and piecing it together yourself.
So I need vault first. Which, oops, the recommended storage until recently for that was Consul. So you need to decide how you're going to bootstrap.
Vault's integrated Raft storage makes this a lot nicer, because you can start there and bootstrap Consul and Nomad after, and rely on Vault for production secret management, if you desire.
Also, Kubernetes can be just a single binary if you use k0s or k3s. And if you don't want to run it yourself you can use a managed k8s from AWS, Google, Digital Ocean, Oracle...
> Both Nomad and Vault need Consul for Enterprise setups, and Vault alone needs 2 Consul clusters. So now you have 3 separate Consul clusters, a Vault cluster, and a Nomad cluster.
This is incorrect. You don’t need consul for enterprise. Vault doesn’t need two consul clusters (it doesn’t need consul at all, if you don’t want it)
IIUC, despite K8s having been started at Google by Go enthusiasts who had good knowledge of Borg, the goal was never to write a Borg clone, much less a replacement for Borg.
And after so many years of independent development, I see no reason to believe that K8s resembles Borg any more than superficially.
This seems to be very much assumed by kubernetes authors. Current borg users please correct me if I'm wrong.
I believe that the one that requires containers is Kubernetes. Nomad doesn't require containers, it has a number of execution backends, some of which are container engines, some of which aren't.
Nomad is the single binary one, however this is a little disingenuous as Nomad alone has far fewer features than Kubernetes. You would need to install Nomad+Consul+Vault to match the featureset of Kubernetes, at which point there is less of a difference. Notwithstanding that, Kubernetes is very much harder to install on bare metal than Nomad, and realistically almost everyone without a dedicated operations team using Kubernetes does so via a managed Kubernetes service from a cloud provider.