Junior dev at my workplace keeps telling me how efficient docker is.

He decided to solve his latest task with containers in swarm mode.

As expected, things went sideways, and I had the joy of cleaning up behind him.

A couple of days later, I noticed that I was running low on disk space - odd.

Turns out docker was eating up some 60 GB with a bunch of dangling images - efficient is a funny term for this.

  • 7
    It's "efficient" in terms of dev time. Just being able to grab and spin up an image at will, and expose it on arbitrary ports, is a lifesaver when you deal with a bunch of backend services you need to test. Likewise it's efficient knowing that you can spin up an image that's essentially identical on whatever underlying platform it's running on.

    That's where the efficiency lies though, not in terms of compute resource. I regularly spin up multiple containers that need at least 10GB of RAM each to run sensibly, never mind disk space. It'd be way slimmer if I just installed things locally, but a way bigger hit on my time too, and way more faff. I'd prefer to run a machine with a minimum 64GB RAM and be done with it.
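    That dev-time win looks something like this (image name, port, and password are just example placeholders; sketched in dry-run style so nothing actually starts unless you opt in):

```shell
#!/bin/sh
# Spin up a throwaway Postgres on an arbitrary host port, then tear it down.
# DRY_RUN defaults to echo so the commands are only printed; set DRY_RUN=
# (empty) to execute them for real on a box with docker installed.
DRY_RUN=${DRY_RUN:-echo}
$DRY_RUN docker run -d --rm --name scratch-db \
    -p 15432:5432 -e POSTGRES_PASSWORD=dev postgres:16
# ...point your integration tests at localhost:15432...
$DRY_RUN docker stop scratch-db   # --rm removes the container on stop
```

No local install, no version conflicts with whatever else is on the machine - that's the whole pitch.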
  • 3
    That seems like a big deviation from standard practice for a junior to make without consulting a senior, or at least someone who knows how the company usually solves things.

    If I just started spinning up docker images on my company servers for random solutions, they would surely take my access away. These things need to be planned out, with time allocated for testing and maintenance...
  • 2
    I feel like there's some process issue here more than some noob doing something wrong.

    Granted if there are no such processes for doing these things and it is a shoot from the hip type environment ... yeah well that's gonna happen.

    Reminds me of the post about how a noob wrote some shitty authentication service.

    The problem isn't that some noob wrote a shitty auth service, it is that ... nobody helped them / it was allowed to happen.

    noob is gonna noob and that's straight up ok as long as they learn. Any other impact is some sort of structural issue in the company.
  • 2
    @N00bPancakes I'm totally with you. If I'm honest, structural issues at my employer are worth a bunch of rants, don't get me started in a comment.
  • 2
    @Hazarth We host "Christmas Projects" every year for interns and junior devs. Generally, we've had good experiences with these projects, but from time to time I come across very ... interesting ... solutions. This was one of them.
  • 2
    I wrote an entire SaaS backend for a decently sized and moderately popular hosting platform. It used docker to schedule customer workloads.

    Docker is some of the worst mainstream garbage software I've ever had the misfortune of working with. Anyone who says it's efficient is wildly misinformed and needs to be put in their place. They are objectively and stupidly wrong.

    For the internal kernel cgroup locks alone, each container takes at least 300ms to spin up. If a page load triggers a scale-up that spins up a new container (not how I would do it these days, but I digress), the user waits at least that 300ms on top of everything else when you do on-demand scaling.

    Not only that, but Docker's HTTP API gets bogged down if you issue more than about 10 or 15 commands in a short period of time, and will PERMANENTLY deadlock in trivial cases. It required cordoning the node and basically reprovisioning services CONSTANTLY.

    Fuck Docker. Fuck Kubernetes. Terrible software.
  • 1
    > Turns out docker was eating up some 60 GB with a bunch dangling images - efficient is a funny term for this.

    Yes, and the file formats for the image/layer bookkeeping are a JOKE. There are THREE (!!!) different database libraries they use, in seemingly random spots, to keep track of everything. Further, there is redundant information everywhere, meaning if you change one, the others are immediately corrupted and docker refuses to work at all.

    The built-in tools for docker are decent nowadays (for managing images) but we still had to resort to fucking weird hacks back then to clean up dangling images and containers. It was such a drag.
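    For reference, the modern built-in subcommands cover the cleanup (standard Docker CLI; sketched as a dry run so nothing gets deleted by accident):

```shell
#!/bin/sh
# Reclaim disk from dangling images and dead containers. DRY_RUN defaults
# to echo so this only prints the commands; set DRY_RUN= to execute.
DRY_RUN=${DRY_RUN:-echo}
$DRY_RUN docker system df        # show what images/containers/volumes cost
$DRY_RUN docker image prune -f   # drop dangling (untagged) images only
$DRY_RUN docker system prune -f  # also drop stopped containers, unused networks
```

Adding `-a` to `docker system prune` removes every image not referenced by a container, which is where the real gigabytes usually hide.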

    And don't get me started on the internal registry protocol. If you host one in-DC, then naturally the acquisition speeds are going to be fast (1gbps at the least). Therefore, compression makes no sense and only slows you down.

    It's hardcoded in, though. You can't disable it.
  • 3
    And Kubernetes, as if it's some second coming of Christ. It's not. Systemd did it first and did it much, much better (yes, systemd supports containers, too!).

    Kubernetes uses a slow-poll of 5s, meaning that it will look for new jobs every 5 seconds. That means if you submit a job directly after its cycle, the best case is that you wait 5 seconds for literally anything to happen.

    It also doesn't handle concurrency well at all, meaning that if the system is overloaded due to rebalancing storms (which K8s seems to love doing...) then the kube master will deadlock and you have a datacenter that can't scale up, down, or even be managed remotely.

    This is especially deadly when it comes to attacks or usage spikes, because usually this causes a scaling wave (if you use autoscalers) and thus will often overload your master to the point you can't control it anymore.

    And forget about coming back up from a complete failure. That shit is only doable by masochists.
  • 0
    Docker just needs some experience. I fucked up my system so many times just by having huge log files. I suck
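    The huge-log-file part at least has a built-in fix: the json-file log driver takes rotation options (these flag names are standard `docker run` options; the image is just an example, and this is echoed as a dry run):

```shell
#!/bin/sh
# Cap a container's logs at 3 rotated files of 10 MB each. DRY_RUN
# defaults to echo; set DRY_RUN= to actually start the container.
DRY_RUN=${DRY_RUN:-echo}
$DRY_RUN docker run -d \
    --log-driver json-file \
    --log-opt max-size=10m \
    --log-opt max-file=3 \
    nginx:alpine
```

The same keys can go under `log-opts` in `/etc/docker/daemon.json` to apply daemon-wide, so you never hit the runaway-log problem again.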
  • 0
    Didn't know about systemd container orchestration. Will look into that.
    Container virt is the most efficient virt in terms of overhead. The cost is in disk space, which is why Alpine is so popular - its images are about 140MiB. Like most systems, cleanup and general maintenance is required.

    Bare metal will always be faster but can leave you in dependency hell. This is where container virt shines.
  • 0
    @hjk101 What do you mean "container virt"? There is no virtualization happening in a container; that's why they are _distinctly_ different from VMs (which *are* virtual).

    There is no hypervisor involved in a container. Just locked-down and filtered syscalls, which add no overhead beyond boot-up time.
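    You can see the mechanism on any Linux box without docker involved - every process already lives inside a set of namespaces, and a container just gets fresh ones:

```shell
#!/bin/sh
# Every Linux process has a set of namespaces (mnt, pid, net, uts, ipc,
# user, ...), visible under /proc. A container is the same mechanism with
# new namespaces, plus cgroup limits and a seccomp syscall filter on top.
ls /proc/self/ns 2>/dev/null || echo "no /proc here (not Linux)"
```

Which is the whole point of the argument: there's no extra machine being emulated, just kernel bookkeeping around ordinary processes.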
  • 0
    @junon yes, there is most definitely virtualization happening. A virtual machine is another level, but it's not the only virt tech out there; OpenVZ was one of the first.
  • 0
    @hjk101 Containers 100% do not have virtualization unless you're running them under Hyper-V or the like.
  • 0
    @junon don't just ignore everything I say. Especially if I give you the documentation that proves you wrong. Your definition of virtualization is wrong/limited and containers do have overhead.

    Have another one:
    Docker is a set of platform as a service (PaaS) products that use OS-level VIRTUALIZATION to deliver software in packages called containers.
  • 0
    @hjk101 OS Virtualization != Virtualization. They are new namespaces of resources in the kernel, nothing more.

    I didn't ignore you, but I have a lot of domain knowledge about containers.
  • 0
    @junon sure, you're all-knowing...
    In the world of the almighty, virtualization is exactly the same as hardware emulation.

    Just don't be disappointed when you run into a senior in real life who actually worked on KVM, application virtualization, virtual networks, and yes, LXC (which is mainly cgroup stuff, but docker is a lot more, especially on networking and storage).
  • 0
    @hjk101 Nice elitist attitude. Not that you deserve a response, but regardless of your dick swinging contest, you're still not specifying properly. Cgroups are not virtualization.

    I know what LXC is. I know what docker does. Containers are not virtualized, they are just glorified cgroup jails. If they've changed the definition of "container" then great, but it was never "containers are lightweight VMs". They are jailed namespaces for processes to run inside. If you virtualize a container, great. But a container itself is not virtualized.

    If you want to argue about semantics more, then be my guest. Your shit attitude removed any will to respond further though.
  • 0
    The overlay FS is not virtualization. The SDNs are not virtualization. Emulation, sure. Not virtualization.

    You can run a container within some sort of hypervisor if you'd like. That doesn't change much about the container, and everything about the processes inside the container.

    The fact you can virtualize a container doesn't make containers virtualized.