Creating Kubernetes distributions
Comparing Linux and Kubernetes is often a matter of apples and oranges. There are, however, some similarities, and there is an effort within the Kubernetes community to make Kubernetes more like a Linux distribution. The idea was outlined in a session about Kubernetes release engineering at KubeCon + CloudNativeCon North America 2019. "You might have heard that Kubernetes is the Linux of the cloud and that's like super easy to say, but what does it mean? Cloud is pretty fuzzy on its own," Tim Pepper, the Kubernetes release special interest group (SIG Release) co-chair, said. He proceeded to provide some clarity on how the two projects are similar.
Pepper explained that Kubernetes is a large open-source project with lots of development work around a relatively monolithic core. The core of Kubernetes doesn't work entirely on its own and relies on other components around it to enable a workload to run, in a model that isn't all that dissimilar to a Linux distribution. Likewise, Pepper noted that Linux also has a monolithic core, which is the kernel itself. Alongside the Linux kernel is a whole host of other components that are chosen to work together to form a Linux distribution. Much like a Linux distribution, a Kubernetes distribution is a package of core components, configuration, networking, and storage on which application workloads can be deployed.
Linux has community distributions, such as Debian, where there is a group of people that help to build the distribution, as well as a community of users that can install and run the distribution on their own. Pepper argued that there really isn't a community Kubernetes distribution like Debian, one that uses open-source tools to build a full Kubernetes platform that can then be used by anyone to run their workloads. With Linux, community-led distributions have become the foundation for user adoption and participation, whereas with Kubernetes today, distributions are almost all commercially driven.
Why distributions matter
The real value that comes from Kubernetes and from Linux, in Pepper's view, is not from the core, but rather from the user applications that a full distribution enables. Distributions are purpose-built, opinionated assemblies of configurations and tools. Distributions also serve to align different versions of tooling and subprojects into a working release that is easier for users to install and maintain. "One of the things in open source that is really amazing is you have this multiplier effect and distributions are a key part of that," Pepper said.
A Kubernetes distribution is a bit different from a Linux distribution in several respects. With Kubernetes, the Cloud Native Computing Foundation (CNCF) has developed a Kubernetes conformance program to certify that a given platform is in fact Kubernetes. Pepper noted that Linux makes use of a reciprocal open-source license, which means that any code that is forked and distributed needs to be shared. Kubernetes uses a permissive license (Apache version 2.0), which Pepper warned comes with the risk of divergent forking. "So where Linux didn't necessarily have conformance testing, we need something like that in Kubernetes to make sure that Kubernetes as a word means something, and that we can understand what that means," he said.
Linux has a large stable of community distributions, such as Debian, Arch, and Fedora, as well as commercial enterprise distributions. "Where are our Kubernetes community distributions?" Pepper asked. "Of the hundred conformant offerings, most of them are commercial." The full list of conformant Kubernetes offerings is maintained and regularly updated by the CNCF.
Building a community Kubernetes distribution
Pepper outlined several potential reasons why there isn't a community Kubernetes distribution, including the fact that there are some missing technical components. He started by attempting to define what the base of a community distribution could include. There are the raw Go language binaries and some other code artifacts from the Kubernetes release, but those are only parts of a distribution. There are also several tools needed, including kubeadm, which helps to bootstrap a basic Kubernetes cluster, kops for managing Kubernetes operations, and kubespray, which is used to deploy a production-ready Kubernetes cluster. Pepper emphasized that the existing open-source tools are intended to help build a cluster and not a distribution.
The Kubernetes community is currently lacking build tools for distributions as well as more robust dependency management, he said. "One of the really useful benefits you see from distros is that they kind of grok all of the dependencies and give you that coherent opinionated set of things that are going to work together," Pepper said. "Where is our Kubernetes equivalent of koji or Launchpad?" He also wondered why there was no Kubernetes version of Ubuntu's personal package archives (PPAs).
Release engineering
While Kubernetes is currently missing pieces for enabling a true community distribution, work is ongoing in multiple Kubernetes Special Interest Groups (SIGs), including SIG Release and SIG Testing, that could point the way forward to a future community distribution.
Stephen Augustus, another SIG Release co-chair, explained that a release-managers group that deals with the build process as well as patch and branch management has started to take shape. The idea behind the group is to codify the process by which Kubernetes releases are produced. "There are scripts that you can check out that have copyright dates of 2016 and they are actually the ones that are responsible for releasing Kubernetes," Augustus said. "We want to get to the point where we can start tearing down some of the technical debt that we've built up in the project over time."
Among the Kubernetes release scripts that date back to 2016 is anago, which is an 1,800-line bash script for releasing Kubernetes. Anago imports three separate libraries, each with another 500 lines of shell code. "It's time to not do that anymore," Augustus said.
The group is starting to rewrite some of the release scripts. One of the first targets is branchff, which is a utility that fast-forwards a branch to master. Another tool that is being rewritten is push-build, which is responsible for pushing all of the Kubernetes builds up to Google Cloud.
As part of the overall effort to improve release engineering, there is also the new Kubernetes release toolbox project known as "krel" that Augustus noted is just getting started. The goal is to take all of the various release shell scripts and move them into the toolbox as a set of commands. Another new effort that is getting underway is the kubepkg tool that will enable developers to create deb and RPM packages based on Kubernetes project binaries. "We want there to be a dead simple way to produce debs and RPMs for Kubernetes."
Augustus commented that many companies have built their own tools for Kubernetes releases because there have not been any great tools in the upstream project, but that's now changing. "We're trying to kind of flip that story, change the narrative, and build tools that are actually useful for not just the community, but for vendors, and for hobbyists to consume as well."
Whether or not a real Kubernetes community distribution will emerge remains to be seen. What is clear is that, as Augustus said, there is a need to remove the technical debt for release engineering, updating complex shell scripts with more modern tools that can help both the project and the broader community to build Kubernetes distributions.
Index entries for this article | |
---|---|
GuestArticles | Kerner, Sean |
Conference | KubeCon NA/2019 |
Creating Kubernetes distributions
Posted Dec 5, 2019 15:05 UTC (Thu) by rwmj (subscriber, #5474) [Link]
Creating Kubernetes distributions
Posted Dec 5, 2019 15:13 UTC (Thu) by sml (guest, #75391) [Link]
Creating Kubernetes distributions
Posted Dec 5, 2019 17:54 UTC (Thu) by SEJeff (guest, #51588) [Link]
Creating Kubernetes distributions
Posted Dec 15, 2019 10:31 UTC (Sun) by ofr (guest, #107486) [Link]
Creating Kubernetes distributions
Posted Dec 15, 2019 11:13 UTC (Sun) by rahulsundaram (subscriber, #21946) [Link]
Can you provide a reference to this?
Creating Kubernetes distributions
Posted Dec 16, 2019 20:09 UTC (Mon) by ofr (guest, #107486) [Link]
Reference for what exactly? It's obvious that it's not 100% compatible because many software packages for Kubernetes don't run unmodified on OpenShift. For the claim about the conformance test see https://github.com/openshift/origin/blob/master/test/extended/conformance-k8s.sh
Creating Kubernetes distributions
Posted Dec 16, 2019 20:14 UTC (Mon) by rahulsundaram (subscriber, #21946) [Link]
Creating Kubernetes distributions
Posted Dec 5, 2019 19:14 UTC (Thu) by marcH (subscriber, #57642) [Link]
Was any alternative suggested? Didn't find any in the Powerpoint.
A few thousand lines of shell script is too high but not crazy high IMHO.
Unix shell scripting shows its age but I haven't seen anything coming close for smaller programs (say a few hundred lines) interacting with files and gluing other programs together. Nothing as concise, dynamic and high-level. For instance using functions as parameters is trivial - not too bad!
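The remark about functions as parameters can be illustrated with a minimal sketch in POSIX sh. The names for_each, shout, and count_chars are hypothetical and chosen only for illustration; the point is that a function name is just a word that can be passed along and invoked.

```shell
#!/bin/sh
# Apply the command or function named by $1 to each remaining argument.
# (for_each, shout and count_chars are hypothetical names for this sketch.)
for_each() {
    action=$1
    shift
    for item in "$@"; do
        "$action" "$item"
    done
}

# Two small callbacks that can be handed to for_each.
shout() { printf '%s!\n' "$1"; }
count_chars() { printf '%s\n' "${#1}"; }

for_each shout build test
for_each count_chars build test
```

Because the callback is looked up by name at call time, the same for_each works equally well with an external program such as echo in place of a shell function.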
One major drawback is incompatibility with Windows but hey, who implements release management and QA on that? ;-) Most Windows people I run into look like they haven't even heard of PowerShell yet. Click, click, click... or straight to WSL.
Python's subprocess module has come a long way but still requires boilerplate and relatively complex error handling code; people seem to get that wrong every time (error handling code being of course never tested).
Maybe Perl would have been a good candidate if it hadn't committed suicide by optimizing itself for "write-only" usage?
Is there anything else?
Creating Kubernetes distributions
Posted Dec 6, 2019 23:27 UTC (Fri) by IanKelling (subscriber, #89418) [Link]
Creating Kubernetes distributions
Posted Dec 7, 2019 18:48 UTC (Sat) by epa (subscriber, #39769) [Link]
"Python's subprocess module has come a long way but still requires boilerplate and relatively complex error handling code"

Can you give an example of how to do error checking and handling correctly in a shell script? It seems to require at least as much boilerplate as Python or Perl if you want to write | pipelines or do control flow while at the same time checking the exit status of each subprocess and perhaps checking whether anything was written on standard error too.
Creating Kubernetes distributions
Posted Dec 7, 2019 21:50 UTC (Sat) by Jandar (subscriber, #85683) [Link]
If you use #!/bin/bash "set -o pipefail" gives you checking of exit status of every part of a pipe.
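As a rough illustration of the pipefail behavior described above, the following sketch assumes bash (PIPESTATUS is a bash feature, not POSIX sh). The function names are hypothetical. Each function runs in a subshell so that the option setting does not leak into the rest of the script.

```shell
#!/bin/bash
# Default behavior: a pipeline's status is that of its last command (cat),
# so the failure of "false" goes unnoticed.
without_pipefail() {
    ( set +o pipefail; false | cat; echo "status=$?" )
}

# With pipefail: the pipeline's status is that of the rightmost command
# that failed, so the failure of "false" is reported.
with_pipefail() {
    ( set -o pipefail; false | cat; echo "status=$?" )
}

# PIPESTATUS holds one exit status per pipeline stage; it must be read
# immediately after the pipeline, before any other command overwrites it.
per_stage_status() {
    false | true | cat
    echo "${PIPESTATUS[*]}"
}

without_pipefail
with_pipefail
per_stage_status
```

Note that pipefail only reports that some stage failed; finding out which one still requires looking at PIPESTATUS.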
Python as a shell replacement
Posted Dec 7, 2019 23:22 UTC (Sat) by marcH (subscriber, #57642) [Link]
BTW C has similar error handling behaviors, most likely not a coincidence.
I repeat: any shell program longer than a few thousand lines or with some serious data structures is probably a mistake. This being out of the way, let me try to rephrase and clarify what I meant earlier:
1. Python-as-a-shell adds extra code and significant overhead; you can't start prototyping by just throwing your .bash_history into a file anymore.
2. Python has a built-in and pretty good exception system that you generally don't even have to think about.
So why didn't the migration overhead I paid in 1. magically give me 2. for free? Why do I have to think so much about error handling when I use the subprocess module? In _short_ shell scripts good error handling is the only thing I was missing! So where did my migration money go?
The Python people are very smart, so I guess there must be good technical reasons for that, yet these excuses still don't make Python a desirable replacement for short shell scripts (unless you absolutely need to support Windows). Actually, I'm worried these justifications may not be Python specific and may preclude _any_ general purpose language as a shell replacement...
Insightful interview with Steve Bourne: https://www.arnnet.com.au/article/279011/a-z_programming_...
Python as a shell replacement
Posted Dec 13, 2019 17:23 UTC (Fri) by BenHutchings (subscriber, #37955) [Link]
I also use "set -e" by habit, but it doesn't do exactly what you probably want. When you check the result of a command, that completely suppresses its effect inside the command. For example:

set -e
f() {
    false
    echo "continued"
}
f || echo "failed"

prints:
continued
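One way to get the failure reported in the example above is to have each command in the function check its own status explicitly rather than relying on "set -e". This is a sketch of a common workaround, not something proposed in the comment itself:

```shell
#!/bin/sh
set -e

# Same shape as the example above, but the failing command propagates its
# own status with "|| return", so the result no longer depends on set -e.
f() {
    false || return 1
    echo "continued"
}

f || echo "failed"
```

Here the function returns as soon as false fails, so the script prints "failed" instead of "continued". The cost is exactly the boilerplate under discussion: every command that matters needs its own check.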
Python as a shell replacement
Posted Dec 13, 2019 22:59 UTC (Fri) by marcH (subscriber, #57642) [Link]
Some influential and vocal experts seem to have decided that, short of catching "all errors", catching "no error" is better than "many errors". I've read all their essays and I still couldn't make sense of their logic: https://mywiki.wooledge.org/BashFAQ/105
"Works for us".
PS: besides 105 and a couple others, https://mywiki.wooledge.org/BashFAQ is the best.
Python as a shell replacement
Posted Dec 14, 2019 0:08 UTC (Sat) by karkhaz (subscriber, #99844) [Link]
I occasionally use a combined shell script/makefile if I care about catching errors on each command:
#!/bin/sh
# vim:set syntax=make:set ft=make:
MAKEFILE_START_LINE=$(\
grep -nre makefile_starts_here "$0" \
| tail -n 1 \
| awk -F: '{print $1}')
TMP=$(mktemp)
tail -n+${MAKEFILE_START_LINE} "$0" > "${TMP}"
make -f "${TMP}"
SUCCESS=$?
rm -f "$TMP"
exit "$SUCCESS"
makefile_starts_here:
	command-1
	command-2
	command-3
This prints out everything below and including "makefile_starts_here" to a Makefile and then runs make on it, executing the commands one at a time. This is especially nice if I want built-in parallelism etc.; it's actually even better than just using the shell (just make sure to print out "MAKEFLAGS=-j" at the top of the file).
Typhoon is a free Kubernetes distribution
Posted Dec 15, 2019 10:29 UTC (Sun) by ofr (guest, #107486) [Link]
"Typhoon distributes upstream Kubernetes, architectural conventions, and cluster addons, much like a GNU/Linux distribution provides the Linux kernel and userspace components."
So, there you go.