The K3S Cluster

I was discussing IaC at home on Reddit today, and I noted that I have no use for Ansible anymore.

Ansible went from something that was essential to have installed to something that was just a painful nicety to set up, almost overnight. Running everything in K3S is just so much easier, and the way that my cluster currently runs is so much nicer than anything I’ve had before.

That said, I don’t have a build template because I just tackled fires as they came up. The discussion made me realize that I don’t really know some of the configurations I’ve made, so I’m going to recreate the cluster and document the setup for future reference.


Installing Debian

I prefer Debian over all distros. I won’t get into my reasoning.

Do a minimal installation and make sure SSH is enabled. If installing from DVD, make sure network repos are configured post-installation.

The non-root user is only used at home for the UID. Add your public key to both root and the non-root user. It’s a home setup, big deal.

I have a template with all of this configured on PVE, purely to save time when building test VMs.

Installing K3S

Install K3S

apt -y install curl
curl -sfL https://get.k3s.io | sh - 
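K3S ships its own kubectl, so you can sanity-check that the node registered:

k3s kubectl get nodes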

Install Longhorn

Used for sharing a volume between servers in the cluster

apt -y install open-iscsi
kubectl apply -f https://raw.githubusercontent.com/longhorn/longhorn/v1.6.0/deploy/longhorn.yaml

If longhorn-manager-* gets stuck, check out the logs of longhorn-driver-deployer-* for more information. This is how I found out about the open-iscsi requirement.

This will take over a minute to show running across the board.
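You can watch them come up with:

kubectl -n longhorn-system get pods --watch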

Install FluxCD

Used for pulling Kubes manifests from Gitlab and applying them to the cluster.

Install FluxCD with the following one-liner:

curl -s https://fluxcd.io/install.sh | bash
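Quick sanity check that the CLI landed:

flux --version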

Go to Gitlab and generate a PAT. This needs read/write API access (the api scope). If you have Enterprise, I think you can use a GAT (group access token).

I have a group called ‘hxme’ for my home IaC. The following command will use hxme/kubes as the owner, because that’s just what Flux needs. The “repository” is appended to the owner, and the “path” is used inside that repository.

When the bootstrap command below is run, Flux will look at the repository ‘https://gitlab.com/hxme/kubes/deploy’, and then at the ‘clusters/hxme’ path inside it.

Flux will use your ~/.kube/config to connect to your Kubes cluster, so you need to copy it over. By not specifying your PAT on the command line, Flux will prompt you for it, which is slightly more secure.

mkdir -p ~/.kube
cp /etc/rancher/k3s/k3s.yaml ~/.kube/config
mkdir -p ~/scripts/flux && cd ~/scripts/flux
cat > 1-bootstrap.sh << 'EOF'
flux bootstrap gitlab \
--owner=hxme/kubes \
--repository=deploy \
--branch=main \
--path=clusters/hxme \
--token-auth=false \
--read-write-key=true
EOF
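Run the script, paste your PAT when prompted, and then confirm the controllers are healthy:

bash 1-bootstrap.sh
flux check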

Cluster Setup So Far

So I wanted to take a break to show you what I’m seeing so far. Here are my pods:

NAMESPACE           NAME                                                READY   STATUS                       RESTARTS      AGE
db                  db-84bd986b7c-7jthz                                 0/1     CreateContainerConfigError   0             109s
flux-system         helm-controller-5f7457c9dd-4p577                    1/1     Running                      0             2m38s
flux-system         kustomize-controller-5f58d55f76-jdp8w               1/1     Running                      0             2m38s
flux-system         notification-controller-685bdc466d-8shvn            1/1     Running                      0             2m37s
flux-system         source-controller-86b8b57796-9hnsh                  1/1     Running                      0             2m37s
gitlab-agent-hxme   gitlab-agent-v2-56cb9fdccf-dv825                    0/1     ContainerCreating            0             104s
gitlab-agent-hxme   gitlab-agent-v2-56cb9fdccf-vn5w2                    0/1     ContainerCreating            0             105s
kube-system         coredns-6799fbcd5-5pqc9                             1/1     Running                      0             20m
kube-system         helm-install-traefik-bfmd6                          0/1     Completed                    1             20m
kube-system         helm-install-traefik-crd-x452v                      0/1     Completed                    0             20m
kube-system         local-path-provisioner-6f5d79df6-szfbb              1/1     Running                      0             20m
kube-system         metrics-server-54fd9b65b-grlsd                      1/1     Running                      0             20m
kube-system         svclb-grafana-dd2fa981-fkzvz                        1/1     Running                      0             105s
kube-system         svclb-loki-ec79cab4-6x6sq                           1/1     Running                      0             105s
kube-system         svclb-node-exporter-9cf68493-x9qxv                  1/1     Running                      0             104s
kube-system         svclb-prometheus-b3e49542-fhqkl                     1/1     Running                      0             103s
kube-system         svclb-traefik-16fff6ad-76lnv                        2/2     Running                      0             20m
kube-system         traefik-7d5f6474df-5s48n                            1/1     Running                      0             20m
longhorn-system     csi-attacher-57689cc84b-jvvwx                       1/1     Running                      0             14m
longhorn-system     csi-attacher-57689cc84b-kphrz                       1/1     Running                      0             14m
longhorn-system     csi-attacher-57689cc84b-l9j6g                       1/1     Running                      0             14m
longhorn-system     csi-provisioner-6c78dcb664-4444l                    1/1     Running                      0             14m
longhorn-system     csi-provisioner-6c78dcb664-fxbzg                    1/1     Running                      0             14m
longhorn-system     csi-provisioner-6c78dcb664-xrk7p                    1/1     Running                      0             14m
longhorn-system     csi-resizer-7466f7b45f-86vbr                        1/1     Running                      0             14m
longhorn-system     csi-resizer-7466f7b45f-ctz52                        1/1     Running                      0             14m
longhorn-system     csi-resizer-7466f7b45f-ktqcz                        1/1     Running                      0             14m
longhorn-system     csi-snapshotter-58bf69fbd5-kh8tl                    1/1     Running                      0             14m
longhorn-system     csi-snapshotter-58bf69fbd5-wx79h                    1/1     Running                      0             14m
longhorn-system     csi-snapshotter-58bf69fbd5-xqz4s                    1/1     Running                      0             14m
longhorn-system     engine-image-ei-acb7590c-469cn                      1/1     Running                      0             14m
longhorn-system     instance-manager-6e6f0cba472bc7330d347603cdf42eb4   1/1     Running                      0             14m
longhorn-system     longhorn-csi-plugin-rdcx8                           3/3     Running                      0             14m
longhorn-system     longhorn-driver-deployer-576d574c8-rcqnb            1/1     Running                      0             14m
longhorn-system     longhorn-manager-c8rgf                              1/1     Running                      1 (14m ago)   14m
longhorn-system     longhorn-ui-556f7bb76c-6g5v5                        1/1     Running                      0             14m
longhorn-system     longhorn-ui-556f7bb76c-w6xvh                        1/1     Running                      0             14m
monitoring          grafana-755dd5fc65-7ztl8                            0/1     ContainerCreating            0             104s
monitoring          loki-6cf6c946d4-twfhf                               0/1     ContainerCreating            0             104s
monitoring          node-exporter-78679879c5-5m57g                      0/1     Pending                      0             103s
monitoring          prometheus-577b7f74fd-26k47                         0/1     Pending                      0             103s
nextcloud           nextcloud-7567d97d69-sf4zj                          0/1     ContainerCreating            0             108s
nextcloud           nextcloud-cron-28663855-bcbjt                       0/1     ContainerCreating            0             58s

You can see that my monitoring, nextcloud, and db namespaces have already been deployed. This is because I bootstrapped against an existing repository that already has these configured, so Flux is already trying to deploy some things to the cluster.

Because I know what should be getting installed, I can tell that some things are missing. Let’s check:

root@kp1:~# flux get ks
NAME            REVISION                SUSPENDED       READY   MESSAGE
coredns-conf                            False           False   Source artifact not found, retrying in 30s
database        main@sha1:80a14387      False           True    Applied revision: main@sha1:80a14387
dns                                     False           False   Deployment/dns/dns dry-run failed (Invalid): Deployment.apps "dns" is invalid: spec.template.spec.containers[0].name: Invalid value: "node_exporter": a lowercase RFC 1123 label must consist of lower case alphanumeric characters or '-', and must start and end with an alphanumeric character (e.g. 'my-name',  or '123-abc', regex used for validation is '[a-z0-9]([-a-z0-9]*[a-z0-9])?')
dns-conf                                False           False   Source artifact not found, retrying in 30s
flux-system     main@sha1:7d2bb6c4      False           True    Applied revision: main@sha1:7d2bb6c4
monitoring      latest@sha256:ac6a7ed3  False           True    Applied revision: latest@sha256:ac6a7ed3
nextcloud       main@sha1:aed32722      False           True    Applied revision: main@sha1:aed32722

So we can see that we’re missing ‘dns’ and ‘coredns-conf’.

The former is failing due to a misconfiguration, and the latter is missing because Flux cannot see it: Flux does not have access to the remote repository, despite us giving it our PAT.
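You can see the state of each Git source, and which ones are failing to fetch, with:

flux get sources git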

Give Flux access to private repos

There are a few ways to configure access. I don’t know why. I’m going to use SSH keys this time around, because I like the idea of them a little more.

I don’t know what the differences are, but I’m going to do this in a consistent way, and GitRepository is probably the most logical choice at the moment. In the past I was running OCI repositories, but I’m going to phase those out for the sake of consistency.

To pull Git Repos, we’ll need to configure an SSH key.

Generate a key just for Flux and put a passphrase on it:

root@kp1:~# ssh-keygen
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /root/.ssh/id_rsa
Your public key has been saved in /root/.ssh/id_rsa.pub
The key fingerprint is:
SHA256:oYd+mAioUbDkckn3n2RCihXfq39M+/mkL3Stjx1DmsQ root@kp1
The key's randomart image is:
...

Add the public key to your Gitlab account, then use it to authenticate, both to confirm that it works and to capture the host key for known_hosts.

root@kp1:~# ssh git@gitlab.com
Enter passphrase for key '/root/.ssh/id_rsa':
Enter passphrase for key '/root/.ssh/id_rsa':
Enter passphrase for key '/root/.ssh/id_rsa':
PTY allocation request failed on channel 0
Welcome to GitLab, @xxx!
Connection to gitlab.com closed.

I noticed that known_hosts uses a new, hashed format now. This should be fine.

root@kp1:~# cat ~/.ssh/known_hosts
cat: /root/.ssh/known_hosts: No such file or directory
root@kp1:~# ssh git@gitlab.com
The authenticity of host 'gitlab.com (172.65.251.78)' can't be established.
ED25519 key fingerprint is SHA256:eUXGGm1YGsMAS7vkcx6JOJdOGHPem5gQp4taiCfCLB8.
This key is not known by any other names.
Are you sure you want to continue connecting (yes/no/[fingerprint])? yes
Warning: Permanently added 'gitlab.com' (ED25519) to the list of known hosts.
Enter passphrase for key '/root/.ssh/id_rsa':
PTY allocation request failed on channel 0
Welcome to GitLab, @xxxx!
Connection to gitlab.com closed.
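If you’d rather grab the host key non-interactively, ssh-keyscan does the same job, and its unhashed output is also valid for known_hosts:

ssh-keyscan gitlab.com >> ~/.ssh/known_hosts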

Now we can finally configure Flux to access our private repos. Refer to the following yaml:

root@kp1:~# cat > ~/scripts/flux/secret.yaml << EOF
---
apiVersion: v1
kind: Secret
metadata:
  name: ssh-credentials
  namespace: flux-system
type: Opaque
stringData:
  identity: |
    -----BEGIN OPENSSH PRIVATE KEY-----
    ...
    -----END OPENSSH PRIVATE KEY-----
  password: yourpass
  known_hosts: |
    |1|yyyyyyyyyyyyyyyyyyyyyyyyyyy=|xxxxxxxxxxxxxxxxxxxxxxxxxxx= ssh-ed25519 xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/nOeHHE5UOzRdf
EOF
root@kp1:~# kubectl apply -f ~/scripts/flux/secret.yaml
secret/ssh-credentials created

Now you can reference a private repo like this:

---
apiVersion: source.toolkit.fluxcd.io/v1
kind: GitRepository
metadata:
  name: dns
  namespace: flux-system
spec:
  interval: 1m0s
  ref:
    branch: main
  url: ssh://git@gitlab.com/hxme/kubes/dns.git
  secretRef:
    name: ssh-credentials

Confirming Flux is Working

The easiest way is to create a repository at https://gitlab.com/hxme/kubes/dns.

Create a directory in the repo called ‘src’, and put your Kubernetes manifests in there. For example, a simple namespace from one of my repos:

[]$ cat ../dns/src/namespace.yaml
apiVersion: v1
kind: Namespace
metadata:
  name: dns
  labels:
    name: dns

Commit and push it to Gitlab.
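Something like the following, assuming you’re at the root of the dns repo:

git add src/namespace.yaml
git commit -m 'add dns namespace'
git push origin main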

Create a file in your deploy repository (we configured this earlier) at https://gitlab.com/hxme/kubes/deploy/clusters/hxme/dns.yaml with the following content:

---
apiVersion: source.toolkit.fluxcd.io/v1
kind: GitRepository
metadata:
  name: dns
  namespace: flux-system
spec:
  interval: 1m0s
  ref:
    branch: main
  url: ssh://git@gitlab.com/hxme/kubes/dns.git
  secretRef:
    name: ssh-credentials
---
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: dns
  namespace: flux-system
spec:
  interval: 10m0s
  path: ./src
  prune: true
  sourceRef:
    kind: GitRepository
    name: dns
  targetNamespace: dns

You should see it successfully pull. If not, the message will tell you why.

root@kp1:~# flux get all | grep dns
gitrepository/dns               main@sha1:07d760bd      False           True    stored artifact for revision 'main@sha1:07d760bd'
kustomization/dns               main@sha1:07d760bd      False           True    Applied revision: main@sha1:07d760bd
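If you don’t want to wait out the interval after pushing a change, force a reconcile:

flux reconcile kustomization dns --with-source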

Configure Gitlab Runners

With all of this configured, it is time to deploy some Gitlab runners.

The best way to do this is to set up the Helm deployment in Flux, because Flux actually supports that. Fun fact: I got the following configuration from ChatGPT and it works!

Create the file https://gitlab.com/hxme/kubes/deploy/clusters/hxme/gitlab-runners.yaml

---
apiVersion: v1
kind: Namespace
metadata:
  name: gitlab-runners
---
apiVersion: source.toolkit.fluxcd.io/v1beta2
kind: HelmRepository
metadata:
  name: gitlab-runners
  namespace: flux-system
spec:
  url: https://charts.gitlab.io/
  interval: 5m
---
apiVersion: helm.toolkit.fluxcd.io/v2beta1
kind: HelmRelease
metadata:
  name: gitlab-runners
  namespace: gitlab-runners
spec:
  releaseName: gitlab-runners
  chart:
    spec:
      chart: gitlab-runner
      #version: <chart_version>  # Specify the version of the GitLab Runner chart
      sourceRef:
        kind: HelmRepository
        name: gitlab-runners
        namespace: flux-system
  valuesFrom:
    - secretKeyRef:
        name: gitlab-runner-secret-hxme
        key: runnersConfig
  values:
    runnerRegistrationToken: ""
    rbac:
      create: true  # Optionally enable RBAC if needed
  interval: 5m  # Interval at which Flux checks for updates

You’ll notice that we’re referencing a secret here. I’m going to create the secret from the local machine, as I don’t want it committed to git. This isn’t the most secure way to do things, but it’s a little less boneheaded than putting your secrets into git.

SSH into your K3S node and create a new directory for the secret. Replicate the below:

root@kp1:~# mkdir -p ~/scripts/gitlab-runners
root@kp1:~# cd ~/scripts/gitlab-runners
root@kp1:~/scripts/gitlab-runners# ls
create-md5s.sh  gitlab-runner-config.txt  secrets.yaml
root@kp1:~/scripts/gitlab-runners# cat create-md5s.sh
echo -n 'glrt-xxxxxxxxxxxxxxxxxxxx' | base64
cat gitlab-runner-config.txt | base64 -w0
echo
root@kp1:~/scripts/gitlab-runners# cat gitlab-runner-config.txt
    [[runners]]
      [runners.kubernetes]
        image = "debian:12"
        privileged = true

root@kp1:~/scripts/gitlab-runners# cat secrets.yaml
---
apiVersion: v1
kind: Secret
metadata:
  name: gitlab-runner-secret-hxme
  namespace: gitlab-runners
type: Opaque
data:
  runnerRegistrationToken: Zxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx==
  runnersConfig: |
    Zxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx=

Replace glrt-xxx... with your Gitlab runner token.
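Then run the helper and paste its two output lines into the data fields of secrets.yaml (the first line is runnerRegistrationToken, the second is runnersConfig):

bash create-md5s.sh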

Apply secrets.yaml, and you should see your Helm deploy with Flux succeed.
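That is:

kubectl apply -f secrets.yaml
flux get helmreleases -n gitlab-runners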

Give Gitlab Runner access to Private Repositories

Need to pull a private image during a CI/CD job? You need to give your Gitlab Runner access to Gitlab.

You can update our prior script to look like this:

echo -n 'glrt-xxxxxxxxxxxxxxxxxxxx' | base64
cat gitlab-runner-config.txt | base64 -w0
echo

echo ----
echo

cat > docker-conf.json << EOF
{
  "auths": {
    "registry.gitlab.com": {
      "username": "gitlab-user",
      "password": "glpat-xxxxxxxxxxxxxxxxxxxx",
      "email": "x@googlemail.org",  // Optional: Your email associated with Docker Hub
      "auth": "$(echo -n 'gitlab-user:glpat-xxxxxxxxxxxxxxxxxxxx' | base64)"
    }
  }
}
EOF

cat docker-conf.json | base64 -w0 ; echo

Running this script will output a one-liner base64 encoded string for each field.

Add them to your secrets.yaml file and apply it:

---
apiVersion: v1
kind: Secret
metadata:
  name: gitlab-runner-secret-hxme
  namespace: gitlab-runners
type: Opaque
data:
  runnerRegistrationToken: Zxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx=
  runnersConfig: |
    Ixxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx=
  .dockerconfigjson: exxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

You can repeat the above for each runner that you want to run, changing out your auth tokens as appropriate.

That’s It.

Genuinely, you should now have FluxCD building apps onto your K3S node. Your Gitlab runners should… run. And if you need it, Longhorn is available (good for MariaDB).

You can use the following as a template for deploying your Kubes manifests:

---
apiVersion: source.toolkit.fluxcd.io/v1
kind: GitRepository
metadata:
  name: nextcloud
  namespace: flux-system
spec:
  interval: 1m0s
  ref:
    branch: main
  url: ssh://git@gitlab.com/hxme/kubes/nextcloud.git
  secretRef:
    name: ssh-credentials
---
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: nextcloud
  namespace: flux-system
spec:
  interval: 10m0s
  path: ./src
  prune: true
  sourceRef:
    kind: GitRepository
    name: nextcloud
  targetNamespace: nextcloud

And all you have to do is put your manifests into the src/ directory of the aforementioned git repo.

Unless you’re using OCI repos, you don’t even need to build anything in Gitlab CI/CD. Flux will handle it all from here.