Saturday 8 April 2023, 15:30
Over 400 years ago, Shakespeare's Hamlet pondered:
To use Kubernetes, or not to use Kubernetes, that is the question:
Whether 'tis nobler in the mind to suffer
The slings and arrows of outrageous fortune,
Or to take Arms against a Sea of troubles,
And by opposing end them: to die, to sleep
Ok, maybe he didn't say that, but you get the point. Kubernetes has been rapidly and steadily gaining traction ever since its first release over 8 years ago, and by now every dev who hasn't been living under a rock has heard of it. It's the second most beloved and wanted tool of 2022, and is quickly becoming (if not already) The Next Big Thing. Yet, it's also notorious for being one of those complicated, overkill tools that people jump onto just for bragging rights, despite not needing them.
A couple of months ago, I finally started using Kubernetes myself at work - being the first one in the company to try moving parts of our platform onto it. Here are the gains, pains, and learnings I gathered from the process.
At the point of deciding to start using Kube, we were a small engineering team (fewer than 10 engineers) with 1 dedicated SRE and about half a dozen microservices running on ECS (Fargate) in AWS. I won't debate the merits of going the microservices route for a team that small here, but the gist of it was that we knew that number was going to rise pretty quickly over the next few months as we reworked parts of our internal architecture. We identified a couple of reasons why we didn't think we were going to be happy with ECS long-term:
In addition, our company was also about to start running a production site and warehouse, which needed some custom software. This meant running a local, physical server there for resilience purposes. Whilst we could have easily gone the route of non-container-based solutions (like running applications directly on the bare metal, or packaging a VM image), we wanted a deployment process that was as isomorphic to our cloud one as possible. This left two choices - using bare Docker, or Kubernetes. Bare Docker would have likely involved plenty of custom scripts, and with the industry moving away from bare Docker to Kubernetes anyway (plus the possibility of running multiple physical machine nodes in the future), it was clear Kubernetes was going to be the choice for the on-prem server.
At that point, every argument was pointing towards Kubernetes for both cloud and on-prem. Hence, our Kubernetes journey started...
Most importantly, there's loads of documentation out there for Kubernetes. If you've ever worked with a vendor-specific cloud service and thought "these docs are terrible", then Kubernetes is, generally speaking, the opposite experience. The official documentation is thorough, and the countless third-party tutorials, explanations, and videos out there on the web are a big upgrade over the docs of an AWS Elastic Pick-a-Service (or any other cloud service, really). And whilst I was worried that Kube's rapidly evolving nature would mean any source of info older than a year would be outdated or misleading, this wasn't actually the case - I found plenty of stuff from 2018 that was still useful, which is more than I can say about a lot of other tech that I work with.
One thing I would say is that the official tutorial (which uses minikube) didn't feel that useful. I'm not sure what could've been done better, but typing in the provided commands to spin up a minimum viable cluster wasn't a particularly helpful learning experience. Then again, it's probably impossible to write a tutorial that captures the real experience of deploying multiple deployments, stateful sets, services and load balancers in a real multi-node cloud - such a tutorial would scare off newbies.
One of my peeves about ECS (like with most AWS services) was that you only had two ways to use it (ignoring IaC), which were either the AWS CLI or the web console. The web console in my opinion was pretty poorly designed, which was extra frustrating considering they could have copied one of the dozens of well-designed container deployment visualizers out there. With Kubernetes, you get to use those tools. And your coworkers can also use those tools, or use other tools if they prefer. The sheer range of choices is wonderful - I've personally settled on using k9s, but have also tried Portainer and Lens, as well as the Kubernetes dashboard itself.
In addition to that, there are also great meta-tools like Helm and Kustomize, as well as integrations with the IaC tooling we already use, like Terraform and Terraform CDK. Various Infrastructure Access Platforms are also out there, like Infra or Teleport, for making dev access and authentication easier. If you haven't already, I would check here for a really good list of Kube-related resources and tooling.
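As a taste of the boilerplate reuse those meta-tools enable, here's a minimal Kustomize sketch - the `web-api` app name and the base/overlay layout are hypothetical, not a description of our actual setup:

```yaml
# base/kustomization.yaml - shared manifests every environment starts from
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - deployment.yaml
  - service.yaml

# overlays/production/kustomization.yaml - production-only tweaks layered on top
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - ../../base
replicas:
  - name: web-api     # bump the base Deployment's replica count
    count: 3
images:
  - name: web-api     # pin the image tag per environment
    newTag: v1.2.3
```

Running `kubectl apply -k overlays/production` then renders and applies the merged result, so the environment-specific files stay tiny.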
As I stated before, Kubernetes has a reputation for being complex and difficult. After having used it, I don't think that's quite true. If your microservices stack largely consists of web applications and maybe a few stateful sets like message queues or a database, then I think it's actually pretty easy. For each service, all you need is usually:

- a Deployment (or StatefulSet) describing which container image to run and how many replicas to keep alive
- a Service giving those replicas a single stable address inside the cluster
- optionally, an Ingress or load balancer if it needs to be reachable from the outside world
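To make that concrete, here's roughly what the first two look like for a hypothetical stateless web app (the `web-api` name, image, and ports are made up for illustration):

```yaml
# A minimal Deployment: run 2 replicas of the app's container image
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-api
spec:
  replicas: 2
  selector:
    matchLabels:
      app: web-api
  template:
    metadata:
      labels:
        app: web-api
    spec:
      containers:
        - name: web-api
          image: example.com/web-api:1.0.0
          ports:
            - containerPort: 8080
---
# A Service: a stable cluster-internal name that load-balances across the pods
apiVersion: v1
kind: Service
metadata:
  name: web-api
spec:
  selector:
    app: web-api
  ports:
    - port: 80
      targetPort: 8080
```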
Each of those things is a couple of lines of configuration in IaC, which you can reduce even more by reusing boilerplate between applications. Generally speaking, it's pretty comparable to any other IaC solution for non-Kubernetes deployments. There are also some other notable advantages here:
Where Kubernetes gets difficult is also where any other tooling stack would get difficult. Need to run thousands of physical nodes together, with complex service mesh needs? Or GPU passthrough for intense machine learning tasks? There are plenty of things that are "difficult" with Kubernetes, but they would be equally, if not more, difficult without it. Kubernetes is a container orchestration platform, so it's really good at managing containers - but it won't automagically solve all your other difficult infrastructure problems. If you're looking for something to solve all your problems, then you need Jesus, not Kubernetes.
The CNCF's definition of cloud native is as follows:
Cloud-native technologies empower organizations to build and run scalable applications in modern, dynamic environments such as public, private, and hybrid clouds.
It sounds like one of those bullshit SEO buzzwords, but it essentially boils down to software being designed, first and foremost, to be deployed to a cloud. However, there are some subtleties around its meaning that I didn't appreciate prior to working with Kube.
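One concrete way that philosophy shows up in practice: Kubernetes assumes your app is disposable, horizontally scalable, and able to report its own health, and the manifests have first-class fields for all of it. Here's an illustrative fragment of a pod spec (the endpoint path, port, and thresholds are arbitrary examples, not recommendations):

```yaml
# Probes and resource requests: the "cloud-native contract" in manifest form.
# Kubernetes restarts containers that fail liveness checks, only routes
# traffic to pods that pass readiness checks, and schedules pods onto nodes
# based on the resources they declare.
spec:
  containers:
    - name: web-api
      image: example.com/web-api:1.0.0
      resources:
        requests:
          cpu: 100m         # the scheduler uses this to place the pod
          memory: 128Mi
      readinessProbe:
        httpGet:
          path: /healthz    # illustrative health endpoint
          port: 8080
        periodSeconds: 10
      livenessProbe:
        httpGet:
          path: /healthz
          port: 8080
        initialDelaySeconds: 15
        periodSeconds: 20
```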
Amazon's managed Kubernetes service, EKS, is what we ended up using, as we were already on AWS. As it turns out, EKS is generally considered the worst managed Kubernetes offering out of all the major players. I've experienced two major flaws with it so far:
Aside from those, I haven't faced anything too critical, but the memes on the Kubernetes subreddit have made me nervous about EKS in general...
This one is an honourable mention because it's not something I've had to deal with (yet). But apparently, upgrading Kubernetes versions can be a complete nightmare. Reddit recently had an outage because of one, and whilst I appreciate that most people reading this are probably not working at that scale, it's a scary thought nonetheless.
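To give a flavour of why upgrades bite: APIs get deprecated and then removed between Kubernetes versions, so manifests that applied fine yesterday can be rejected after an upgrade. A well-known example is Ingress, whose `networking.k8s.io/v1beta1` form was removed in Kubernetes 1.22 (the `web-api` names below are hypothetical):

```yaml
# Before: an Ingress written against the old API - removed in Kubernetes 1.22
apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: web-api
spec:
  rules:
    - host: api.example.com
      http:
        paths:
          - path: /
            backend:
              serviceName: web-api   # old flat backend format
              servicePort: 80
---
# After: the same Ingress rewritten for networking.k8s.io/v1
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: web-api
spec:
  rules:
    - host: api.example.com
      http:
        paths:
          - path: /
            pathType: Prefix         # now a required field
            backend:
              service:
                name: web-api
                port:
                  number: 80
```

Multiply that by every API your manifests, Helm charts, and controllers touch, and it's easy to see how a version bump turns into a project.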
So, should you use Kubernetes? Honestly, that's up to you. If your infrastructure is already at a multi-service level and growing rapidly, and you're happy to spend some time digging into it, then there's little reason not to at least try building out an MVP/experimental cluster. There's a reason Kube has become so popular these last few years - it does make lots of things easier in the long term.
That said, don't feel like you need to switch over just for the sake of it. The best piece of technology is the one that works, and if whatever you're using now works, is there truly a need to spend lots of time and effort switching to something else? Plus, it's not a decision you only get to make once. You can always try Kubernetes again at a later time, and I suspect it's highly likely to get even easier in the future, as it becomes an even more mainstream technology.
If you enjoy the above article, please do leave a comment! It lets me know that people out there appreciate my content, and inspires me to write more. Of course, if you really, really enjoy it and want to go the extra mile to support me, then consider sponsoring me on GitHub or buying me a coffee!