Original author from 5 years ago. Surprised to see this here 5 years later.
Docker really used to crash a lot back in the day, mostly due to buggy storage drivers. If you were on Debian or CentOS it's very likely that you experienced crashes (though a lot of developers didn't care or didn't understand why the system went unresponsive).
There was notably a new version of Debian (with a newer kernel) published the year after my experience. It's a lot more stable now.
My experience is that by 2018-2019, Docker had mostly vanished as a buzzword, people were only talking about Kubernetes and looking for kubernetes experience.
edit: at that time Docker didn't have a way to clear images/containers; it was added after the article and follow-up articles. I will never know if it was a coincidence but I like to think there is a link. I think writing the article was worth it if only for this reason.
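For reference, the cleanup commands that exist in Docker nowadays (none of these were available back then):

    docker container prune    # remove stopped containers
    docker image prune        # remove dangling images
    docker system prune -a    # remove all unused containers, networks and images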
> Docker had mostly vanished as a buzzword, people were only talking about Kubernetes and looking for kubernetes experience.
I don't know if we are doing it wrong at my job but... I feel like Kubernetes is just a way to orchestrate Docker images running as pods (amongst many other things I'm sure).
I know Kubernetes doesn't require Docker images but... how many shops are using k8s without Dockerfile build + tag + publish of images for containers?
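(For concreteness, the flow I have in mind is roughly this; the registry and names are made up:)

    docker build -t registry.example.com/myapp:1.4.2 .
    docker push registry.example.com/myapp:1.4.2
    # then point the cluster at the new tag
    kubectl set image deployment/myapp myapp=registry.example.com/myapp:1.4.2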
containerd is the Docker runtime, so it's still used to run/manage containers on the kubelet. CRI-O is another option: https://kubernetes.io/docs/setup/production-environment/cont... . I think the distinction in the docs there is that with just containerd, you don't get all the docker tooling.
My understanding is that docker was the most common, but I'm not actually sure if that's still the case. CRI is the interface k8s uses to manage the container engine, and I think CRI-O is the most lightweight, but also gives you less/worse tooling for debugging if you need to drop into a kubelet.
The images are just OCI-spec now, but if memory serves, Docker used a different image format in the past, and it was another containerization company (CoreOS/rkt) working on an open standard which eventually became OCI.
Yes, you use OCI images. The OCI format is a specification for container images based on the Docker Image Manifest Version 2, Schema 2 format.
Now, there are multiple ways to build these images, some of which are better than using Dockerfiles, though mostly people still use Dockerfiles.
Hey, always wondered: how come you tried to store the DB data on Docker's system disk instead of an external volume? That would bypass all the driver stability problems, make the data recoverable trivially, decrease CPU usage, etc...
I know that there are posts from 2016 explaining how to do so, but the article had no mention of this?
That seems to be a common misconception. I don't know why anybody thought I wasn't aware of external volumes, I was.
It made little difference, the entire ecosystem was highly unstable. Containers could still fail, the docker daemon could hang, and the host could kernel panic any minute. For databases that meant downtime and potential data corruption.
Besides, there is a major use case to run temporary databases for CI testing. I remember a lot of issues when running performance tests or seeding the database with initial data, basically anything that is performance intensive. I think the unstable filesystem played a role but it was far from the only root cause.
Honestly it's been 6 years now and I've never seen a company running any critical databases inside docker (but I've seen it for testing). I've known a lot of companies that said they did or they had plans to move existing databases, but that wasn't real. At the end of the day there was no sysadmin/DBA/devops who would do it; they understood that it would come back to bite them (many had enough trouble just running ephemeral web services). Maybe it's the hardest part to grasp, but databases are really a different mindset from web development: you cannot risk losing customer data; that would be an extinction-level event for your job and for the company.
We run almost all of our DBs - pg, cockroachdb, clickhouse and etcd/kafka/redis (if you can consider those a database) inside docker/crio inside k8s. In production under high load. Works really well. We’ve had more crashes of the db itself than anything container/node related
If you mean using Docker containers, well those are pretty stable. There are hundreds of companies running ClickHouse on Kubernetes, which deploys using Docker containers. Some of them run on very large K8s clusters. We've seen very few problems.
The one issue I've seen specific to Docker is that you can run into configuration errors that keep them from coming up. That happens occasionally and it can be tricky to debug.
Disclaimer: My company Altinity wrote the ClickHouse Kubernetes Operator, which is also a Docker container.
The article is 5 years old, your operator is 3 years old. 5+ years ago docker was different. I had seen stability issues too, but they got fewer and fewer over time until everything was running without any problems. Back then, sometimes even a reboot was needed because the kernel was misbehaving.
> Honestly it's been 6 years now and I've never seen a company running any critical databases inside docker (but I've seen it for testing)
Oh. Maybe I misread--the above point seemed to be referring to the present, which is what I was addressing. 6 years ago is a different matter entirely. Heck, operators didn't exist then either. [1]
Even in 2016, I had been running production services in Docker successfully. It's interesting to me that they see the problem "Docker isn't designed to store data" without also seeing the solution "the docker copy-on-write filesystem isn't designed to be written to in production, but volume mounts are". I hadn't seen docker crashing hosts (still haven't), but I'm guessing that was caused by using the storage drivers.
The complaints about their development practices are valid (and haven't really improved), but even then the technology worked well so long as you understood its limitations.
Our big project has moved from physical servers to Openshift. It's taken a lot of work, much more than expected. The best thing is that developers like it on their resume, which is a bigger benefit than you'd think, as we've kept some good people on the team. For users I see zero benefit. The CI pipeline is just more complicated and probably slower.
Cost-wise it was cheaper for a while, but now Red Hat is bumping up licensing costs, so I think it's now about the same cost.
Overall it seems like a waste of time, but has been interesting.
My gut feel is that Docker is part of a trend of decreasing software quality.
When someone writes "fixed dependencies" I read "developers can more easily add more bloat before the house of cards tumbles". That happens for example when the "fixed dependencies" are upgraded.
I am miserable having to touch all this junk. I feel a project is right when I can just git clone it (a few megabytes of data at most) and am left with a self contained repo that was written with minimal dependencies (optimally stored in-tree), and that can be easily built in seconds with a simple shell script on any reasonably modern system.
The bare bones way takes a good amount of initial work, but mostly it's a learning experience. Once one understands a few principles of writing portable software, I'm sure it saves a huge amount of time compared to adding all these shells of junk.
--
Oh yeah, I have zero experience integrating with Kubernetes or whatever. I've been a small-time user of Jenkins and CircleCI (involuntarily), and when I don't have to set it up and it actually works, it's alright and can help where the developer maybe lacks a bit of discipline (build all targets, run all the tests).
But, I doubt these technologies are a replacement for an ergonomic build environment (with a simple python build script or even a crude Makefile). Is incremental building a thing on any of these CI pipelines? Because one thing I want is building really really fast, and it's already way too much overhead if I have to go through a git commit to check this stuff. Don't even think about requiring a full rebuild or Docker image build just to get some quick feedback on a code change.
> The bare bones way takes a good amount of initial work, but mostly it's a learning experience.
If my prod environment disappears over night I want to be able to restore everything as fast as possible.
Otherwise my boss will be very unhappy and I will be promoted to customer :D.
> I feel a project is right when I can just git clone it (a few megabytes of data at most)
I don't know your experience, but especially small projects often have some difficulties installing all dependencies correctly. The fun starts when you don't run a widely supported distro like Ubuntu.
If I want to run an old version of the production software I pull the image version X and execute it.
I know that everything in the package has the right version and works like it used to.
Tests have been executed and the version in the image has proven to meet the standards back then.
> But, I doubt these technologies are a replacement for an ergonomic build environment
If you build software for larger customer bases you will most likely encounter some kind of clustering.
Most large companies have already adopted this approach. For personal and small projects it certainly is overkill.
> get some quick feedback on a code change
...wait, do you deploy to prod without tests and no direct way to roll back the changes?
I think half the point is that it's best not to have all the unusual packages. Stick with the regular ones and you should have a much easier life; you don't even need complex tools to manage the dependencies.
Docker pushes developers to think about deployment, not just getting it running on their laptop. And responding to your question: yes - the Docker build process is inherently incremental.
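To illustrate what "incremental" means here, assuming a Dockerfile whose early layers install dependencies and whose last layers copy in the application code:

    docker build -t myapp:dev .   # first build: every layer is executed
    # change only application code, then rebuild:
    docker build -t myapp:dev .   # base image and dependency layers come from
                                  # the build cache; only the final layers rebuild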
Moving from classic servers to containers you get:
- Builds with fixed dependencies that never change. Rollback is easy -> what about VMs?
- Easy deployment of a prod environment on a local machine -> yep, that's a nice touch, the only valid point for me!
- Fast deployment -> lol no, I'm faster with VMs.
- Easy automation (use version X with config Y) -> valid for VMs and baremetal too
With Kubernetes (or other derivates like Openshift) you get:
- Auto scaling -> you can get it with VMs too
- Fail over -> you can get it with VMs too
- Better resource usage if multiple environments are executed -> you can get it with VMs too
- Abstraction of infrastructure -> Should I really write it?
- Zero downtime deployment (biggest point for my company, we deploy >3 times per week) -> We do on some specific DC (government style) and we release 10-15 times a day with bare metal servers and ansible
There are applications that do not need Kubernetes or even containers, but is this list really nothing oO? -> None of the arguments convinced me
I can imagine that if you use Kubernetes just like a classic cluster it could seem like an unnecessary added complexity, but you gain a lot of things. -> yes, extra cost and extra skills needed
Each of those benefits are things I had before using containers or kubernetes, and were simpler.
> Builds with fixed dependencies that never change. Rollback is easy
Any good build system already did this, such as Bazel, or a Gemfile.lock. We'd just snapshot AMIs to keep OS dependencies fixed... which is what Docker images effectively do. If you re-docker-build the same Dockerfile, it's not like you get the same result of "apt-get install libxml" the next time either.
> Easy deployment of a prod environment on a local machine
How containers are deployed varies wildly between prod and the local machine. All the things that were hard before are still hard. Things like secrets and external dependencies still usually vary.
If prod is a kubernetes environment, getting a suitable k8s environment setup locally sucks, especially since it will probably have a different ingress controller, load balancer setup, storage classes available, resource requests, etc.
If prod is kubernetes and local is docker-compose, that honestly seems like just as much work to create a second way to run the stack as just using a bash script + "npm start" or "bundle exec rails server" or whatever.
Either way, it's not really a prod environment. It's hard to run identical-to-prod environments locally, and those problems are related to secrets and clouds and such, not due to the lack of containers, in my experience.
> Fast deployment
In my experience, containers haven't sped up deployment. Let's say you use ubuntu for your host and container's OS. Before containers, this meant you had to download one version of libssl ever, and that was it. If there was an update to libz, that didn't require a new download of libssl. After containers, if you build your container for app1 last week, and your container for app2 today, the "FROM ubuntu" likely resolves to a different image. Both your apps now have different "ubuntu" layers, which probably have the same version of libssl, but deduplication of downloads only happens if the whole layer is identical.
In essence, we went from downloading 1 copy of libssl (for the host OS only) to 3 copies (host OS + 2 containers w/ different ubuntu bases), and there's no deduplication.
That by itself seems like it has to be slower since there's an inherent increase in network bandwidth that has to happen. Even if you have a shared base image, you're at least doubling the downloads of libssl since before you could use the host's copy only.
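(One way to see this is to compare the layer digests of the two images; if the base layers were pulled or built at different times they won't match and nothing is shared. app1/app2 are placeholder image names:)

    docker image inspect --format '{{json .RootFS.Layers}}' app1:latest
    docker image inspect --format '{{json .RootFS.Layers}}' app2:latest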
All the items you listed under k8s are things I had before it, excluding "Abstraction of infrastructure". Frankly, if you have a well-made load balancer, it's hard not to have zero-downtime deployments and auto-scaling.
> We'd just snapshot AMIs to keep OS dependencies fixed
This is a good solution, but I would not call it easier.
Using a docker container feels like installing an app on my smartphone.
I choose the version and it will always work like I built it on date X, without an additional system.
Works for every programming language with every dependency out of the box.
Python, Java, JavaScript, Go, OCaml, C, ...
> How containers are deployed varies wildly between prod and the local machine
I just brought a product of my company to Kubernetes.
Run helm upgrade --install . -f dev-values.yaml for dev
Run helm upgrade --install . -f prod-values.yaml for prod (of course you need the secrets there. Jenkins has them).
My laptop does run an environment with all components of the prod env.
Something like email and sap services are of course mocked, but everything else?
All on my machine. Why not?
I can spin up a new test environment for customers with new settings on the same day.
> Both your apps now have different "ubuntu" layers
We use a base image that doesn't change that often. Even if it does: no problem, the registry is connected via 1000 MBit/s and zero-downtime deployment does its magic, so I don't even notice if it takes one or two minutes.
Another thing: my node (or VM) libs and the libs of my software should not be connected in any way (at least for me). I want to patch my nodes and my software independently. Different software should also not be bound to libs of another software.
> All the items you listed under k8s are things I had before it
- How do you easily scale up? Including starting new machines and spinning down machines that are no longer needed
- How are multiple software parts executed on one host?
- How do you do fail over?
I know that everything can be done without Kubernetes. With enough time and money one can create large systems that do this.
I spun up a new Kubernetes cluster and ported our product (already containerized) to the cluster in about three months.
Really: I also love the classic dev ops and have a proxmox server at home, but Kubernetes just solves many problems at once in a short time.
> How do you easily scale up? Including starting new machines and spinning down machines that are no longer needed
AWS autoscaling groups + cloudwatch for adding and removing machines + checking them into load balancers is something that has worked for longer than K8s has been a thing.
> How are multiple software parts executed on one host?
systemd units, or for more resource hungry things, multiple autoscaling groups.
The overhead of running the kubelet on each host + etcd cluster + apiserver means that I still end up with fewer hosts if I just run each component on every single host vs scaling different deployments independently.
It is true that kubernetes might be more resource efficient in some combination of nodes and software, but at under 10 servers, I've always found the overhead of the etcd cluster + apiserver + kubelet to dwarf any savings from not just running 10 copies of my software.
> How do you do fail over?
The AWS-managed load balancer can fail over based on health checks failing, metrics, or I can add/remove servers from it manually. You can also do DNS health checks, or add a layer of haproxy/nginx/whatever if you want.
It's not like k8s has some magic ability to fail over under the hood. It's just using k8s service objects (probably LoadBalancer type), which does the same thing.
Correct. We use pretty much the same setup on GCP. All scaling is automatic. When we deploy new code we just run a jenkins job that creates an image from custom debian packages. Push that to GCP and it rolls it out automatically to all our DCs.
... not to be disrespectful but this seems to contain quite a high amount of vendor specific implementations. May very well be because I don't know the solution in detail, just seems like it to me.
In the end you are bound to AWS and have to reinvent many things when moving to another cloud provider.
How long did it take to set this up?
> AWS autoscaling groups + cloudwatch for adding and removing machines
Does this also work for multiple applications on one host?
> means that I still end up with fewer hosts if I just run each component
I don't know what your environments look like, but we used one VM for every environment. One environment needs about 20 GB max so we have to use 32 GB RAM VMs and waste quite some resources.
With k8s I have two beefy 64 GB nodes that host 6 environments.
This also speeds up the execution as 70% of time the environments have a low load and the beefy nodes have more CPU cores available than smaller VMs.
Most of the time we can also throw some Jenkins and other test jobs on the cluster for free (low priority deployments).
> Overhead
Yes there is definitely some overhead. 2 GB RAM for the master and 700 MB RAM on every node.
We chose larger nodes (8 CPUs and 64 GB RAM) so the overhead is not that great in comparison.
We do gain savings from k8s (about 15 to 30 %).
> k8s magic
You are right, there is no magic sauce. k8s just packs a nice package of things I otherwise need to do externally.
Sure, the external logic worked well in the past, but now I get many features for a rather low price without committing my company to a provider too hard.
> seems to contain quite a high amount of vendor specific implementations
The "vendor specific implementation" is the loadbalancer and the autoscaling group. If you want to switch to baremetal servers, or some other cloud, getting a loadbalancer setup with similar properties is easy. Every major offering (GCE, Azure, HAProxy on baremetal, etc) will handle healthchecks and dynamically adding/removing servers.
Autoscaling servers based on metrics is admittedly more provider specific, but kubernetes doesn't solve that either. If I have k8s on 5 bare metal hosts, there's no cloud-agnostic way for k8s to magically launch a 6th host. You need to solve the same problem in either case. The k8s cloud autoscaler exists, and so do similar features for most clouds.
Kubernetes feels like even more lock-in than I have. If you were unlucky enough to run your services on docker swarm, mesos, triton, or various other container-running abstractions, then you'll have had to deal with a painful migration to kubernetes (since each of those technologies clearly 'lost'). Moving off kubernetes to another abstraction that runs containers (like mesos or such) is much more painful than moving from an AWS load balancer to a GCE load balancer or a bare-metal L5 load balancer.
In that sense, I feel like I have less lock-in than the typical K8s setup.
> [the rest of your comment]
It sounds like in your situation, k8s is working decently well for you. Congrats.
My point is not that k8s is always bad or using plain VMs + a load balancer is inherently superior, but that neither is clearly better, and both have benefits in different situations. In yours, k8s seems fine. In some other situations, k8s is a net negative.
Then I can understand you well. Kubernetes then just provides zero-downtime and additional complexity. When you already have something like a deployment window (like 2 am to 3 am) then ZDT also does not matter.
OpenShift is about 10x more complex than basic Docker / containers, and probably 2-4x more complex than plain old Kubernetes.
I've seen more success from organizations running smaller K3s or K8s clusters (if they need the orchestration) or just running small apps via Docker/Docker Compose separately, using a CI system (even as simple as GitHub Actions) to manage deployments.
Yeah, it's also a problem that our org has infrastructure teams that manage the openshift clusters, and they are under-resourced so they don't help or often can't figure out how to fix problems. Linux sysadmins know what they're doing, as the core infrastructure has been mostly the same for the last few decades.
Completely different experience at the current organisation I'm at. Openshift has provided a stepping stone off of on-prem VMs and into the cloud. They just migrated the on-prem cluster to an AWS cluster. Next step is to move off Openshift and into cloud native where possible. Really it has provided a path to discontinue legacy services and build out a more modern system/service fairly quickly.
I wasn't around when it was introduced so can't talk to the initial complexity/pain but the current team responsible for it is surprisingly small.
The responsibility for everyday use has been delegated to all teams which gives them a capability to build and deploy frequently. That's a real enabler.
Most of the things that make this work are not really technology based, good collaboration, sharing knowledge and practices and being able to ask and get help quickly but I do think Openshift itself isn't bad at all. It does appear to provide a very decent build, deploy and run platform.
It was year 2016, kubernetes did not have jobs, cronjobs, statefulsets. Pods would get stuck in terminating state or container creating state. Networking in kubernetes was wonky. AWS did not have support for EKS. It used be painful.
It is year 2021, 1000s of new startups around kubernetes, more features, more resource types. Pods still get stuck in terminating state or container creating state. It's still pretty painful.
I haven't seen these failure modes in 2021. We do managed clusters at work and have created around 100,000 of them, and basically all the pods we intend to start start, and all the pods we intend to kill die -- even with autoscaling that provisions new instance types. Our biggest failure mode is TLS certificates failing to provision through Let's Encrypt, but that has nothing to do with Kubernetes (that is a layer above; what we run in Kubernetes).
EKS continues to be painful. It has gotten better over the years, but it is a chore compared to GKE. I like to imagine that Jeff Bezos walked into someone's office, said "you're all fired if I don't have Kubersomethings in two weeks", and that's what they launched.
This year, I have seen those issues popping up in statefulsets a lot. I realised somebody in the team was force-deleting them. It is actually well documented.
I have seen a few scenarios where patching a statefulset actually screws up the volume mount. It is sometimes not evident where the error is, for instance whether it is in the CSI driver or in the scheduler, unless you deep-dive into the issue.
The biggest pain point is having to manually use cloudformation to create node pools. This is especially irritating when you just need to roll the Linux version on nodes -- takes half a day to do right. In GKE, it's just a button in the UI (or better, an easy-to-use API), and you can schedule maintenance windows for security updates (which are typically zero downtime anyway, assuming you have the right PodDisruptionBudgets). I think AWS fixed that. I remember when I used it, they said they had some new tool that would handle that, but you had to re-create the cluster from scratch. This was a couple years ago, and is probably decent nowadays.
There are other warts, like certain storage classes being unavailable by default (gp3), the whole ENI thing for Pod IPs, the supported version being way out of date, etc. EKS has always felt like "minimum viable product" to me -- they really want you to use their proprietary stuff like ECS/Fargate, CloudFormation, etc. If you're already on AWS and want Kubernetes, it's just what you need. If you could pick any cloud provider for mainly Kubernetes, it wouldn't be my first choice.
Having used EKS, GKE, and DOKS, I definitely prefer GKE. GKE is very feature-rich, and the API for managing clusters works well. The nodes are also cheaper than AWS. (I use DOKS for my personal stuff and I haven't had any problems, and it is free, but it's missing features like regional clusters that you probably want for things you make money off of.)
For what it's worth, there's an off-the-shelf terraform module for EKS that is far simpler to use than AWS' cloudformation tooling, which does allow you to pass in a custom AMI and multiple nodegroup configurations as input parameters.
EKS has actually come a long way in the last couple of years. They now have 'eksctl' that helps a lot with provisioning/de-provisioning the node pools. They're also pretty up to date with supporting new versions of k8s (1.21.2 now, I think). I definitely know what you mean about the 'minimum viable product' and their overall views on k8s though, as our account rep definitely tried to steer us away from EKS when we ran into problems with CronJob. They also don't seem to have any particular relationship with k8s development, so we ended up not able to use CronJob for more than six months while our issue (with k8s repo) just sat waiting for someone to look at it.
You have probably thought about this already but I must admit I’m curious: If you’re on AWS, can you not use Certificate Manager instead of Let’s Encrypt?
I'm actually not on AWS, just used EKS extensively at my last job (and we still manually test our software against it).
AWS burned me hard with forgetting to auto-renew certs at my last job. It just stopped working, the deadline passed, and only a support ticket and manual hacking on their side could make it work. cert-manager has been significantly more reliable and at least transparent. The mistake we make right now is asking for certificates on demand in the critical path of running our app -- but since we control the domain name, we could easily have a pool of domain names and certificates ready to go. Our mistake is having not done that yet.
Certificate Manager pushes you (shoves you, really) in the direction of using AWS managed services. They make certificate installation/rotation really easy for their own services and unnecessarily difficult for any that you implement yourself.
(This may have changed in the last year or two, but it was certainly this way when I tried it.)
In my experience, it's difficult-to-impossible to use AWS' certificate management and LB termination in conjunction with Envoy-based networking like Istio or Ambassador.
Yeah, the AWS LB has more issues than that, too. I'm pretty sure it's just nginx under the hood but they won't tweak the simplest parameters for you, even if you make a colossal stink, even if your company spends seven figures a year. I wonder if it isn't a decade-old duct-tape-and-baling-wire solution that shares the same config across literally every customer or something. Rolling our own was almost a relief; the pile of awkward workarounds had grown pretty high by the point we bit the bullet.
I have used Kubernetes extensively over the past couple years and have never seen pods stuck in terminating or creating state that didn't have to do with errors in container creation (your Dockerfile/bootstrapping is messed up) or issues with healthchecks.
We spawn thousands of pods per day for jobs and never get those stuck, and it was not the case in 2018 either. Not sure what it is you are doing that causes this.
Around 2015, I was at Spotify, and we were using a container orchestrator built in-house named Helios. They didn't build it because Kubernetes wasn't invented there; they built it because Kubernetes didn't exist yet.
To be fair, multi-node networking of any sort is on a different level than single-host docker networking.
If you ever tried to use Docker Swarm to network multiple nodes, god help you.
Also worth noting that almost all users of K8s don't actually need to operate a cluster, the hosted offerings handle all of that for you. You just need to understand the Service object, and maybe Ingress if you're trying to do some more advanced cert management or API gateway stuff.
It's a common meme around here to point in horror to the complexity that is abstracted away under the K8s cluster API, and claim that k8s is really hard to use. I think that's mostly misguided, the hosted offerings like GKE really do a good job of hiding away all that complexity from you.
Honestly I think that it's defensible to say that the k8s networking model is in most cases _simpler_ than what you'd end up configuring in AWS / GCP to route traffic from the internet to multiple VM nodes.
> Honestly I think that it's defensible to say that the k8s networking model is in most cases _simpler_ than what you'd end up configuring in AWS / GCP to route traffic from the internet to multiple VM nodes.
How is routing from the internet to multiple servers a problem?
Usually, you have one of these setups:
- you run a loadbalancer that distributes traffic across your nodes. (This loadbalancer could even be distributed thanks to BGP).
- you either run your own firewall or have a managed one, in which you either announce your IP prefix yourself, or they are announced for you by your uplink provider.
- you run an anycast setup (for, for example, globally distributed DNS). and announce multiples of the same prefix across the globe. Routing in the DFZ does the rest for you.
Stretched L2 across the globe/internet is also possible (although not very performant) either by doing IPsec tunneling, or by buying/setting up L2VPN services (either MPLS or VXLAN based).
I didn't say it was a problem. My claim was just that it's easier in GKE than in GCE/EC2.
I only mentioned multi-node because exposing a single VM to the internet is trivial -- just give it a public IP -- and thus is not an apples-to-apples comparison with the multi-node load balancing that you get from the entry-level k8s configuration of Service > Pod < Deployment.
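(The entry-level configuration I mean is roughly this; the names are placeholders:)

    kubectl create deployment web --image=nginx
    kubectl scale deployment web --replicas=3
    kubectl expose deployment web --type=LoadBalancer --port=80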
the largest issue with kubernetes networking seems to be the lack of integration with modern datacenter networking technology.
Things like VXLAN-EVPN are supported on paper, but are nowhere near mature compared to offerings from normal networking vendors.
Heck, even the BGP support inside kubernetes is lacking. Which is a great shame because it creates a barrier between pods and the physical world. (Getting a VXLAN VTEP mapped to a kubernetes node is a major PITA for instance).
Most major cloud providers seem to have fixed this by building even more overlay networks (with the included inefficiencies).
You can make it as complicated as you want it to be; part of setting up a cluster is picking the networking system ("CNI"). Cloud providers often have their own IPAM (i.e. on Amazon, you get this: https://docs.aws.amazon.com/eks/latest/userguide/pod-network...; each Pod gets an IP from your VPC, resulting in weird limits like 17 pods per instance because that's how many IP addresses you can have for that particular instance type throughout EC2).
Yes, Kubernetes (actually its add-ons) provide a virtual network that unifies communication within the cluster so you don't need to care on which computer your service runs.
I once gave a lunch talk at Docker, inc, which included several slides of suggestions of features to add to Docker. Prominently featured was the request for a native command to clean old images. An engineer in attendance remarked incredulously that he could not believe that users would not know how to pipe docker ls into xargs docker rmi.
I hate this mentality because.. what if there is no xargs?
Don't just assume the standard scenario that everyone has this on their machine. Especially if your program is a statically linked "portable" go binary.
I dabble in embedded development, mainly as a hobby and I mainly deal with e-book reader devices.
Let me tell you this, if a device doesn't need a certain binary or tool, it ideally shouldn't be there. For security reasons etc.
And with embedded devices with a specific purpose it's highly likely that there will be very little there.
But anyway, I was mainly saying that in principle this should be considered.
Back in 2016 during the original discussion of this article, amount said it very well in [0]:
"If you hit this many problems with any given tech, I would suggest you should be looking for outside help from someone that has experience in the area."
- Yes, "clean old images" was not implemented back then. His hack is not that bad, and one can filter out in-use images if they want to pretty easily. Anyway, docker does have "docker image prune" now.
- Storage driver history discussion is entirely incorrect. No, docker did not invent overlayfs nor overlay2. There was a whole big drama of aufs not being mainlined, but it was mostly in the context of live CDs, not docker.
But the big missing thing is: you should not store important data in docker images; Docker is designed to work with transient containers. If you have a database, or a high-performance data store, you use volumes, and those _bypass_ docker storage drivers completely (see the sketch at the end of this comment).
- The database story is completely crazy... judging by their comments, they decided to store the database data in the docker container for some reason and got all the expected problems (unable to recover, hard to migrate, etc....). It is not clear why they didn't put the database data on a volume; there is a 2016 StackOverflow question discussing it [0].
Also, "Docker is locking away [...] files through its abstraction [...] It prevents from doing any sort of recovery if something goes wrong." Really? I did recovery with docker, the files are under /var/lib/docker in the directory named with guid, a simple "find" command can locate them.
- By default, Docker uses Linux networking, and yes, the configuration is complex, so it adds overhead. That's why there is the --net=host option (which has been there for a long time), which just bypasses all of that.
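A minimal sketch of those last two points together (a named volume for the data, which bypasses the storage driver, plus host networking to skip the NAT layer); the image and paths are only an example:

    docker volume create pgdata
    docker run -d --name pg --net=host \
        -v pgdata:/var/lib/postgresql/data \
        postgres:9.6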
I know this article is from 2016... but my feelings about it (the article) are unchanged. Some people do not like new things, and they will blog about it in some form or fashion. Maybe their reasoning is valid, maybe it's not - it doesn't matter. Meanwhile... businesses have, and continue, to pay top $$$ for people that will help them do these things. If you want to collect this $$$, get on board.
In a few years, the things businesses want to pay $$$ will change. New blog articles about "this new stuff is bad!" will appear, and new job postings paying above-market $$$ will appear also. You can either rail on about the bad(or good) changes, and how it's just everything-old-is-new-again....or you can get with the program, and get paid. In another few years, rinse and repeat.
It's all anecdotal. For example I know many folks who make $$$ doing the boring old thing because it is reliable, it gets results quickly with low risk, the engineers know the tech inside and out, and not many other folks want to work with "boring tech".
We swapped most of our linux and vmware platforms to Triton and SmartOS and have been loving it ever since. Obviously there's still a need to run linux in bhyve due to some specific software (e.g. docker), but generally services are on either lx-zones or smartmachines. It just works.
In 2016 I started at a company that had no build procedures and deployed to a variety of linux versions, developed on windows. It was a nightmare for administration, no automation, no monitoring.
I implemented containers and most of the process was getting the developers on board. Having technical sessions with them to understand what they needed and ease them into the plan so they felt enfranchised.
Doing this vastly increased productivity, devs could take off the shelf compose files that were written for common projects (it was a GIS shop) and meant they could concentrate on delivering code. It helped no end.
Sure there are issues with docker (albeit a lot fewer as time progressed), but for what it gained in productivity and developers' sanity, it was very welcome.
> In 2016 I started at a company that had no build procedures and deployed to a variety of linux versions, developed on windows. It was a nightmare for administration, no automation, no monitoring.
It's no surprise that implementing pretty much any process would be a massive improvement from that.
The more interesting question is whether containers offer any advantages for people who already have a (probably significantly more lightweight) build and deployment process.
And at the heart of this there will be a database or some other data store that will be very much a pet. A very precious, important and fickle pet. Even if it's "serverless" it will have its own needs and wants.
This whole "cattle not pets" charade gets on my nerves. Yeah, it's easy to "scale" that which is stateless. Not so fast with stateful. I don't care that I can spin up gazillions of web server instances. My data store is still one and very much stateful.
I think stateless is good but don't forget that the state is just in another place, another stateful "gear" in your machine. And that, as you said, has to be treated like the pet it is.
Persistent identity is fine, it just doesn't belong on individual servers. If the data matters, then it's better for it to be reproducibly managed, durably persisted, etc.
For many database needs, there are great options that natively support a highly available cluster.
For those needs that don't have a good clustering option, there are great network storage systems that are easy to deploy and use.
You don't need to treat hardware as a pet to have a database with persistent identity.
This is a blog post from 2016. However, if we switch to more recent times, my experience with AWS ECS and Fargate has been fairly boring. There was a learning curve to get it to work with CloudFormation, VPCs, IAM and load balancers.
> Docker is meant to be stateless. Containers have no permanent disk storage, whatever happens is ephemeral and is gone when the container stops.
It's interesting that this misconception made it into a clearly knowledgeable article. Containers have state on the writeable layer that is persisted between container stops and starts.
I’m not sure it’s a misconception. That’s how they’re intended to be used. Cattle not pets. If you don’t get used to not treating them as throw away you can end up accidentally relying on some state. As you say, the top layer is read/write but that doesn’t mean you should be relying on what you write there. Quite the opposite - that state should be somewhere else unless you can afford to lose it.
I usually start mine with --rm so they're removed on shutdown.
I’ve seen people apply security updates via ‘apt update; apt upgrade’ within a running container. Guess what happens when that container is eventually destroyed?
But I think most people and tools consider "containers" to be volatile storage, like RAM. Non-volatile storage would be volumes.
Honestly I think there is a lot to be said for making the writable layer of a container read-only. It makes sure that things like logging, if you care about them, go somewhere safe, or if you don't, get turned off explicitly. And also prevents gotchas like "oops wrote important data to /var/lib/notavolume when I meant to write to /var/lib/therightvolume" that show up at the worst times.
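(Something along these lines, with explicit scratch space and a named volume for anything that must survive; the image and volume names are placeholders:)

    docker run --read-only \
        --tmpfs /tmp \
        -v appdata:/var/lib/therightvolume \
        myimage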
The article seems to mention problems with AUFS, overlay and possibly overlay2 as well.
However one of the things that i haven't quite understood, is why people use Docker volumes that much in the first place, or even think that they need to use additional volume plugins in most deployments?
If it's a relatively simple deployment, that has some persistent data and it's clear on which nodes the containers could be scheduled (either by label or by hostname), what would prevent someone from just using bind mounts ( https://docs.docker.com/storage/bind-mounts/ )?
And if you need to store it on a separate machine, why not just use NFS on the host OS to mount the directory which you will bind mount? Or, alternatively, why not just use GlusterFS or Ceph for that sort of stuff, instead of making Docker attempt to manage it?
For example, Docker Swarm fails to launch containers if the bind mount path doesn't exist, but that bit can also be addressed by creating the necessary directory structure with something like Ansible - and then you're not only able to not worry about volumes and the risk of them ever becoming corrupt, but you also have the ability to inspect the contents of the container storage on the actual host. Say, if there are some configuration files that need altering (seeing as not all of the containerized software out there follows 12 Factor principles with environment configuration either), or you just want to do some backups for the data that you've stored in a granular fashion.
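(Concretely, the kind of setup I mean; paths and hostname are made up:)

    # on the host: create the directory (Swarm won't create it for you), or
    # mount an NFS export there
    mkdir -p /srv/appdata
    mount -t nfs storage.example.com:/exports/appdata /srv/appdata
    # bind-mount it into the container
    docker run -d -v /srv/appdata:/var/lib/app myimage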
We had a fun issue with Docker yesterday: suddenly, services in our Swarm did not start, apparently because a config could not be mounted. They were running fine for over two years and nobody had touched anything in the Swarm config.
Turned out, AWS had decided to upgrade Docker on the server and that version (20.x) is not able to launch services in the Swarm. We have downgraded to 18 now, which works, but is not a long-term solution.
Podman and Kubernetes are like a match made in heaven. Docker was a good first try for most people, but there is so much better technology that exists now.
I felt the same way when deploying to production with docker manually back then.
But honestly, after we moved to k8s, everything just works. Although I realise that GKE is actually moving to containerd for newer clusters? Not sure what drove the decision, but last time I had to restart a container manually due to my stupid mistake, the API didn't seem to be much different.
Guy who claims to run systems in the HFT space, responsible for millions of trades with high values, can't be bothered to actually pay for support, relies on community and blames everyone but himself for being left alone with his mess. Not sorry.