Version: 0.0.2

Kubernetes

A multi-node cluster orchestration technology that emphasizes high availability and performance. It took me a long time to actually take the first step in learning how Kubernetes works. It's for the most part similar to Docker Compose in that both use YAML to declare containers and services, but it takes container management to the next level. In contrast to Docker, Kubernetes requires you to understand almost every aspect of what you're creating. The Kubernetes distribution I started with was k3s.

Why I left

This wasn't something I ever expected to do once I learned Kubernetes, but there were issues that bothered me a lot while running it. I had to hack around CoreDNS due to issues with its DNS resolvers. Running a multi-node setup meant constantly checking up on two computers rather than one. The Local Path Provisioner also broke constantly, leaving folders behind and sometimes not creating them at all if its runtime class was not synced correctly by ArgoCD. Ultimately the extra overhead and high availability weren't needed anymore in my mind, and going back to Docker would save me a lot of headache and server maintenance. Learning it all was incredible and I am so happy I took the time; I learned so much from the experience. Going back to Docker Compose with all of this newfound knowledge, I was able to build an equally self-sufficient setup without all of the issues I ran into with Swarm and Kubernetes.

Apps

Deployments

The most used type of resource in Kubernetes. A Deployment defines a ReplicaSet, meaning a pod (the closest analogue to a service in the Docker world) that will be replicated across nodes based on selectors. This was the first resource I looked into, as it was the closest thing I understood to be like services in Docker. There are various parts to a Deployment, but essentially it's the definition of a Pod wrapped in a ReplicaSet. It is possible to create a single Pod on its own, but it loses the ability to be highly available: there is no automatic recovery if the pod goes down.
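A minimal Deployment sketch looks something like this (the names, image, and replica count are hypothetical):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 2                 # ReplicaSet keeps two pods running
  selector:
    matchLabels:
      app: web                # must match the pod template's labels
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: web
          image: nginx:1.27
          ports:
            - containerPort: 80
              name: http      # named port, referable from a Service
```

If a pod dies, the ReplicaSet controller recreates it automatically, which is the high-availability behavior a bare Pod lacks.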

DaemonSets

Like Deployments, a DaemonSet is a high-level abstraction over a Pod, but with a different purpose: it is replicated on every node matched by its selector. The expectation is that if the node selector matches N nodes, there will be N replicas. The only real use I found for this was adding log aggregation for all deployments running on the cluster.
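A hypothetical DaemonSet for a log agent might look like this (image and selector are illustrative):

```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: log-agent
spec:
  selector:
    matchLabels:
      app: log-agent
  template:
    metadata:
      labels:
        app: log-agent
    spec:
      nodeSelector:
        kubernetes.io/os: linux   # one replica per matching node
      containers:
        - name: agent
          image: fluent/fluent-bit:3.0
```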

CronJobs

This was very cool to learn in Kubernetes, as it allowed me to get really creative with running automated tasks. This abstraction deploys a pod to be executed at a specific time of day, and the pod is automatically cleaned up after its execution. It also allows for retries in the case of failures. One really cool automation was to create a cron job for each deployment running in the cluster to automatically create restic backups of its persistent volumes.
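A sketch of such a backup CronJob, assuming a hypothetical PVC named `appdata-pvc` and a restic wrapper image:

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: nightly-backup
spec:
  schedule: "0 3 * * *"          # every day at 03:00
  jobTemplate:
    spec:
      backoffLimit: 2            # retry up to twice on failure
      template:
        spec:
          restartPolicy: OnFailure
          containers:
            - name: backup
              image: restic/restic:latest
              args: ["backup", "/data"]
              volumeMounts:
                - name: appdata
                  mountPath: /data
          volumes:
            - name: appdata
              persistentVolumeClaim:
                claimName: appdata-pvc
```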

Resources

Persistent Volumes

This took some time to wrap my head around, as it's a move away from the simple file and folder mounts of Docker. A persistent volume is a file or folder declared as a mountable volume for the cluster to use. To define a persistent volume you need to specify the type of filesystem, and there are many [different types](https://kubernetes.io/docs/concepts/storage/persistent-volumes/) in Kubernetes; I've only really used local, nfs, and hostPath when creating a set or job. You can also define read/write access modes to limit how many sets can mount the volume and whether it can be modified. You must also define a node selector to indicate which nodes the volume can be created on. Persistent volumes additionally let the cluster automatically clean up volumes once they are no longer in use, which is controlled by the [reclaim policy](https://kubernetes.io/docs/concepts/storage/persistent-volumes/#reclaiming).
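A hypothetical hostPath PV with a node selector and a `Retain` reclaim policy might look like this (path, size, and node name are placeholders):

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: appdata-pv
spec:
  capacity:
    storage: 5Gi
  accessModes:
    - ReadWriteOnce             # only one node may mount read-write
  persistentVolumeReclaimPolicy: Retain
  hostPath:
    path: /data/appdata
  nodeAffinity:
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - node-1        # pins the volume to one node
```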

note

Using nfs requires installing a driver on the cluster.

warning

I recommend not using very specific node selectors, as they became an issue when migrating persistent volumes from one node to another. Node selectors are immutable and therefore not easy to change once declared.

note

When defining a reclaim policy I have always used Retain, as I did not want to lose any data after the volume no longer has claims.

Persistent Volume Claims

Now here is where the confusion starts with persistent volumes. To use a PV within a set or job, a persistent volume claim must be declared to claim the volume. The claim's purpose is to request resources that are less than or equal to what the persistent volume provides; this covers storage requests as well as read/write access modes. With local and hostPath volumes there is no enforcement of storage size (this might specifically be a k3s thing). Claims are also an easy way for the cluster to know when a persistent volume is ready to be handled by its reclaim policy.
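A sketch of a matching claim, assuming a manually created PV like the hypothetical `appdata-pv` above exists:

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: appdata-pvc
spec:
  accessModes:
    - ReadWriteOnce         # must be satisfiable by the PV
  storageClassName: ""      # bind to a manually created PV, not a provisioner
  resources:
    requests:
      storage: 5Gi          # must be <= the PV's capacity
```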

warning

When using node selectors on persistent volumes, try not to be too specific, due to node selectors being immutable you can't migrate a PV from a specific node to another without needing to delete the PV completely. This was insanely annoying with argocd.

DNS Config

In the majority of cases I don't think I actually needed to learn about DNS config for Pods, but due to some weird network performance issues when first setting up the cluster, I needed to tinker with this setting. It allows setting the primary DNS servers a Pod will use, as well as overriding other DNS options. An option I played around with a lot is ndots, which controls resolution order: if a name contains at least as many dots as the ndots value, it is queried as an absolute name before iterating through the list of search domains. Reducing it to one in Pods improved network performance, since four different search domains are defined by default.
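The override sits in the pod spec; a minimal sketch (pod name and image are placeholders):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: example
spec:
  dnsConfig:
    options:
      - name: ndots
        value: "1"      # names with >= 1 dot resolve absolutely first
  containers:
    - name: app
      image: nginx:1.27
```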

Config Maps

A resource for defining static files or environment variables that can be mounted into a container. I originally only used it when I needed a reusable set of environment variables, but over time I took advantage of its ability to mount files into a pod. It made it really easy to avoid mounting a persistent volume just to modify a single file, and there was no need to SSH into the node containing the persistent volume to make an update.
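A sketch of a ConfigMap mounted as a file (names and contents are hypothetical):

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
data:
  app.conf: |
    log_level = info
---
apiVersion: v1
kind: Pod
metadata:
  name: app
spec:
  containers:
    - name: app
      image: nginx:1.27
      volumeMounts:
        - name: config
          mountPath: /etc/app   # app.conf appears as /etc/app/app.conf
  volumes:
    - name: config
      configMap:
        name: app-config
```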

note

One caveat with ConfigMap volume mounts is that a data change does not trigger a reload of the container, so updating the file required manual intervention.

Secrets

A resource that allows mounting sensitive information into pods in a more controlled manner (base64-encoded by default, and encrypted at rest if configured). At first I never found a reason to use Secrets, as I just put sensitive data directly into environment variables without much concern. But as I became more comfortable with Kubernetes, I got a better understanding of how to set this up to provide better security for my Kubernetes repo. An initial finding was that everything was tedious to set up initially, but I eventually learned about Kubernetes secret operators that auto-generate Kubernetes Secrets from a secrets manager. Using Bitwarden's Secrets Manager Operator I was able to automatically create and define all secrets used in the cluster.
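A hand-written Secret and its consumption as an environment variable look roughly like this (names and the placeholder value are hypothetical):

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: db-credentials
type: Opaque
stringData:
  DB_PASSWORD: change-me      # stored base64-encoded by the API server
---
apiVersion: v1
kind: Pod
metadata:
  name: app
spec:
  containers:
    - name: app
      image: postgres:16
      env:
        - name: POSTGRES_PASSWORD
          valueFrom:
            secretKeyRef:
              name: db-credentials
              key: DB_PASSWORD
```

An operator replaces the hand-written `Secret` object above with one generated from the external secrets manager.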

Role Based Access Control

A collection of resources that defines a Pod's ability to interact with the Kubernetes cluster API. I only used it in a couple of Pods in the cluster, but the concept was pretty easy to understand.
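The usual pair is a Role listing permitted API verbs and a RoleBinding attaching it to a ServiceAccount; a minimal sketch (names are placeholders):

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: pod-reader
rules:
  - apiGroups: [""]             # "" = core API group
    resources: ["pods"]
    verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: read-pods
subjects:
  - kind: ServiceAccount
    name: app                   # the Pod runs under this ServiceAccount
roleRef:
  kind: Role
  name: pod-reader
  apiGroup: rbac.authorization.k8s.io
```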

Services

A resource that defines which ports are open for a particular pod to allow inter-cluster communication. By default, unlike Docker, apps are not able to communicate with one another; they can only communicate with containers within the same pod. To allow connections across the cluster you need to define a ClusterIP Service that opens ports to the rest of the cluster. These ports are defined in the container definition and can be referenced either by number or by a string representing the name of the port. For a Service to bind to a Pod, its selectors must match the labels of the app. There are also LoadBalancer Services, which allow binding to the host port of the node.
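A ClusterIP Service for the hypothetical `web` Deployment above could look like this:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: web
spec:
  type: ClusterIP       # reachable only inside the cluster
  selector:
    app: web            # must match the pod's labels
  ports:
    - port: 80          # port exposed on the Service's cluster IP
      targetPort: http  # container port, by number or by name
```

Other pods can then reach it at `web` (or `web.<namespace>.svc.cluster.local`).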

Operators

When searching for how to appropriately keep sensitive information from living directly in manifest files, I came across operators and secrets managers. Operators are essentially pods with permissions to auto-generate Kubernetes resources based on a request in a manifest from a deployed pod.

Security Context

I never saw the need to utilize this other than granting elevated access for the VPN or media pods that need access to network rules or the iGPU on a node. But recently I looked more into avoiding running pods with root access, and a security context is how you can enforce this. In addition, it is good to ensure that all files created in volumes get the correct permissions. I settled on a default pod-level setting of user ID 1000 and group ID 100, though certain containers like postgres or any linuxserver/hotio image need to run as root.

For postgres I wanted to avoid this, since it forced all files to be created as root, whereas the other containers would fix their permissions based on the PGID and PUID environment variables. To fix that, I created an init container to chown the files created by postgres before the main container executed, which allowed it to run as the specified user.
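A sketch of that pattern (UID/GID, data path, and volume name are assumptions matching the setup described above):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: postgres
spec:
  securityContext:
    runAsUser: 1000           # default user for all containers in the pod
    runAsGroup: 100
    fsGroup: 100              # volume files get this group
  initContainers:
    - name: fix-perms
      image: busybox:1.36
      command: ["chown", "-R", "1000:100", "/var/lib/postgresql/data"]
      securityContext:
        runAsUser: 0          # only the init container runs as root
      volumeMounts:
        - name: pgdata
          mountPath: /var/lib/postgresql/data
  containers:
    - name: postgres
      image: postgres:16
      volumeMounts:
        - name: pgdata
          mountPath: /var/lib/postgresql/data
  volumes:
    - name: pgdata
      persistentVolumeClaim:
        claimName: pgdata-pvc
```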

Noteworthy Pods

CoreDNS

A DNS service deployed automatically with k3s that handles all lookups originating from pods. Originally I had no need to learn about this, but due to the network performance issues with pods I spent some time understanding how its config file works. The logic checks the name being looked up against the cluster domain and matches it to services; if there is no match, the lookup is forwarded to the node CoreDNS is running on. That latter part I found was a major issue in the performance investigation: after removing the passthrough to the node and replacing it with Google DNS, I no longer found errors or timeouts in lookups. Currently I have an ApplicationSet that overrides the default CoreDNS config, as there's no way to change its ConfigMap and keep the change persistent.
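In Corefile terms the change amounts to swapping the upstream in the `forward` plugin; a rough sketch (the rest of the k3s default Corefile is omitted):

```yaml
# Corefile fragment (CoreDNS config, stored in the coredns ConfigMap)
# .:53 {
#     kubernetes cluster.local in-addr.arpa ip6.arpa {
#         pods insecure
#     }
#     forward . 8.8.8.8 8.8.4.4   # was: forward . /etc/resolv.conf
#     cache 30
# }
```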

Local Path Provisioner

This service lets Kubernetes automatically create directories for persistent volume claims on the node where they are requested; this is called dynamic provisioning. It comes built in with k3s, but for the longest time I avoided using it due to its delete behavior, where it would clean up and delete the directory on the node. Instead I manually created Persistent Volumes and Persistent Volume Claims to override this. After getting frustrated with needing to create new folders for every new deployment, I decided to look into how I could utilize it.

The Local Path Provisioner's default configuration can be overridden similarly to CoreDNS: you edit a ConfigMap that the deployment listens to. In that ConfigMap you can define the default directory, or node path, used by the provisioner, along with its setup and teardown instructions, which are stored as bash scripts. I updated it to use the standard /kubernetes-appdata folder and updated the setup script to set the correct user and group permissions. In addition, I disabled the teardown script so it would not actually delete the folder.

I then created a new StorageClass that automatically creates the folder based on an annotation key given to the PVC, and by using the new StorageClass with a PVC I was able to automatically set up all the data needed for new deployments. One caveat was that I needed to retain the old configuration for DaemonSets that need persistent volumes, because the provisioner cannot handle ReadWriteMany access modes.
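A sketch of such a StorageClass pointing at the local-path provisioner (the class name is a placeholder; the annotation-driven folder naming described above lives in the provisioner's ConfigMap scripts, not here):

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: local-path-appdata
provisioner: rancher.io/local-path
reclaimPolicy: Retain               # matches the disabled teardown script
volumeBindingMode: WaitForFirstConsumer
```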

K8s Declaratives

Manifests

Manifests are the default way of defining kubernetes resources to be deployed into a cluster. The file format used is yaml and the structure must adhere to the resource definition.

Commands:

  • Apply manifest to cluster:
    kubectl apply -f (folder|file)
  • Remove resources in manifest from cluster:
    kubectl delete -f (folder|file)

Kustomize

Using plain manifests I found myself duplicating configuration multiple times when writing deployments and other resources, so I went looking for tooling that could reduce the duplication. Kustomize was an interesting tool to use, as it allows pooling multiple templates and modifiers to build the resources for an application. But I found I was spending so much time trying to get these modifiers to work that I was creating even more configuration just to match the expected behavior.
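The entry point is a `kustomization.yaml` listing base resources and patches; a minimal sketch (file names and target are hypothetical):

```yaml
# kustomization.yaml
resources:
  - deployment.yaml
  - service.yaml
patches:
  - path: patch-replicas.yaml   # strategic-merge patch overlaid on the base
    target:
      kind: Deployment
      name: web
```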

Commands:

  • Execute and apply to cluster:
    kubectl apply -k (folder)
  • Remove resources:
    kubectl delete -k (folder)

Helm

The next tool I tried for deduplicating configuration files was Helm, and it was actually very well suited for the job. It did exactly what I was trying to accomplish with Kustomize, but it also gave me the flexibility of naming and defining exactly how a resource should be generated based on data defined in a values file. In addition, I could lint Helm templates to help validate that the configuration I wrote was in fact correct and wouldn't take unexpected paths. One thing I disliked about Helm apps was that they needed to be installed from a repo, and a chart configuration was needed to apply them in a certain way.

Commands:

  • Test helm templates:
    helm lint
  • Generate manifests:
    helm template . --name-template (app_name) -f (values_file)

cdk8s

I found cdk8s while reading about how Helm was not a great tool for auto-generating manifest files from base customizations; there were also frustrations expressed with Helm's many formatting issues, and building like that with a YAML-based templating language was not the best way. With cdk8s you use code to define how manifests are generated, which allows a much cleaner deduplication flow. Compared to Helm it was a much more enjoyable experience and allows better customization without needing to write 'custom' settings everywhere.

It can also import resource definitions from what is installed and available in the cluster and type-check them when synthing, so it comes with built-in linting. You can also define resources using Helm, so I was easily able to use charts and repos without manually installing them into the cluster, letting cdk8s handle all of it when synthing and applying.

Commands:

  • Generate manifests:
    cdk8s synth
note

When using Renovate with cdk8s I needed to set up custom regex matchers to upgrade the image versions defined in Python files. In addition, I could specify post-upgrade commands to run when executing Renovate upgrades, which removed my need to manually intervene on PRs, since it would update the code and regenerate the manifests afterward.