## The stack
- Longhorn: Persistent storage for Kubernetes
- Velero: Backup tool for Kubernetes and Longhorn
- Uptime Kuma: Alerting for failed backup jobs
## Longhorn
I’ve been using Longhorn as an easy foray into K8s storage: it provides some redundancy without the complexity of Ceph. I will dive into Ceph eventually, but Longhorn has less strict hardware requirements (e.g. it can use virtual disks on VMs).
## Backups
Many of my workloads have data I care to back up. I tried K10/Kasten but found it excessively complex and finicky. The final straw was that it’s proprietary software, which I can’t rely on (I avoid proprietary software wherever possible).
### Velero
Velero, comparatively, was easy to set up and has been reliable. The lack of a GUI means the CLI tooling is first class; the dedicated CLI tool (thankfully available as a Nix package) is a nice wrapper around `kubectl`.
### Installing Velero
I use Kustomize wherever possible:
```yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - namespace.yaml
  - storageclass.yaml
  - backuppvc.yaml
  - cron-backupcheck.yaml
helmCharts:
  - name: velero
    repo: https://vmware-tanzu.github.io/helm-charts
    releaseName: velero
    namespace: velero
    version: 10.1.1
    valuesFile: helm-values.yaml
```
The `backuppvc` config is used to create only a single replica during backup jobs. Unfortunately, Longhorn currently (as of 2025) duplicates the volume that is being backed up. If the storage class has a default replica count set (as mine does), that’s a lot of extra IO. We can mitigate this via this ConfigMap:
```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: backuppvc
  namespace: velero
data:
  config: |
    {
      "backupPVC": {
        "longhorn": {
          "storageClass": "velero-storageclass-noreplica",
          "readOnly": false
        }
      },
      "loadConcurrency": {
        "globalConfig": 1,
        "perNodeConfig": [
          {
            "nodeSelector": {
              "matchLabels": {
                "kubernetes.io/os": "linux"
              }
            },
            "number": 1
          }
        ]
      }
    }
```
Of course, that storageclass needs to exist as well.
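For reference, a single-replica StorageClass for this purpose could look roughly like the following; only `numberOfReplicas` really matters here, and the other parameters are assumptions mirroring common Longhorn defaults:

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: velero-storageclass-noreplica
provisioner: driver.longhorn.io
allowVolumeExpansion: true
reclaimPolicy: Delete
volumeBindingMode: Immediate
parameters:
  numberOfReplicas: "1"        # the whole point: no replication for throwaway backup volumes
  staleReplicaTimeout: "30"
  fsType: "ext4"
```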
Finally, there are some configuration options I’m setting via Helm (some redacted):
```yaml
deployNodeAgent: true
nodeAgent:
  extraArgs:
    ["--node-agent-configmap=backuppvc"]
snapshotsEnabled: true
backupsEnabled: true
credentials:
  useSecret: true
  existingSecret: minio-creds
configuration:
  features: EnableCSI
  uploaderType: kopia
  defaultItemOperationTimeout: 8h
  defaultSnapshotMoveData: true
  backupStorageLocation:
    - name: default
      provider: aws
      bucket: k8sbackups
      config:
        region: minio
        s3ForcePathStyle: true
        s3Url: http://minioip:9000
  volumeSnapshotLocation:
    - name: default
      provider: csi
      config:
        csiDriverName: driver.longhorn.io
        snapshotClass: longhorn-snapshot
initContainers:
  - name: velero-plugin-for-aws
    image: velero/velero-plugin-for-aws:v1.12.0
    volumeMounts:
      - mountPath: /target
        name: plugins
```
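The `volumeSnapshotLocation` above points at a `longhorn-snapshot` VolumeSnapshotClass, which isn’t shown here. A sketch of what it might look like (the label and the `type` parameter are assumptions on my part):

```yaml
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotClass
metadata:
  name: longhorn-snapshot
  labels:
    # lets Velero's CSI support discover this class even when it isn't named explicitly
    velero.io/csi-volumesnapshot-class: "true"
driver: driver.longhorn.io
deletionPolicy: Delete
parameters:
  type: snap   # Longhorn in-cluster snapshot; the actual data is moved by kopia anyway
```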
MinIO runs as a systemd service on a host that is also used as a Proxmox Backup host.
## Monitoring
Backups are of limited use if they aren’t monitored. The script and associated config are potentially too dirty to post publicly, but in essence they boil down to:
- Create a `ServiceAccount`, a `ClusterRole` (with permissions on the `velero.io` and `longhorn.io` API groups), and a `ClusterRoleBinding` to tie them together (sketched below)
- Create a CronJob using said service account that runs a bash script using the `alpine/k8s:1.33.4` container (which helpfully has `kubectl` and `curl`)
- Create an Uptime Kuma Push monitor
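A sketch of those RBAC pieces, with hypothetical names and a deliberately read-only rule set:

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: backup-check
  namespace: velero
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: backup-check
rules:
  # read-only access to Velero and Longhorn resources for the check script
  - apiGroups: ["velero.io", "longhorn.io"]
    resources: ["*"]
    verbs: ["get", "list"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: backup-check
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: backup-check
subjects:
  - kind: ServiceAccount
    name: backup-check
    namespace: velero
```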
The CronJob runs a script that essentially gets all Velero backups from the last 23 hours with `kubectl`. If all of them succeeded, it sends a GET request to the Uptime Kuma push URL with `curl`. If any failed, it skips the GET, causing the monitor to alert me via email. Uptime Kuma is hosted on PikaPods, so there are no local infrastructure dependencies. PikaPods is run by the same organization as BorgBase, which I’ve been a happy customer of for many years for storing off-site Borg backups (locally encrypted, of course).
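A rough sketch of that CronJob is below. The schedule, the `backup-check` ServiceAccount, and the `uptime-kuma-push` Secret holding the push URL are hypothetical placeholders, and the real script is more thorough:

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: backup-check
  namespace: velero
spec:
  schedule: "0 6 * * *"   # once a day, after the backup window
  jobTemplate:
    spec:
      template:
        spec:
          serviceAccountName: backup-check
          restartPolicy: Never
          containers:
            - name: check
              image: alpine/k8s:1.33.4
              env:
                - name: PUSH_URL   # Uptime Kuma push URL, kept in a Secret
                  valueFrom:
                    secretKeyRef:
                      name: uptime-kuma-push
                      key: url
              command:
                - /bin/bash
                - -c
                - |
                  set -euo pipefail
                  # Backups newer than this UTC timestamp count as "recent" (last 23 hours).
                  cutoff=$(date -u -d "@$(( $(date +%s) - 23*3600 ))" +%Y-%m-%dT%H:%M:%SZ)
                  recent=0; failed=0
                  while read -r phase ts; do
                    [[ "$ts" > "$cutoff" ]] || continue          # RFC 3339 UTC sorts lexicographically
                    recent=$((recent + 1))
                    [[ "$phase" == "Completed" ]] || failed=$((failed + 1))
                  done < <(kubectl get backups.velero.io -n velero \
                    -o jsonpath='{range .items[*]}{.status.phase} {.metadata.creationTimestamp}{"\n"}{end}')
                  # Only ping Uptime Kuma when there were recent backups and none failed;
                  # a missed ping is what triggers the alert email.
                  if [[ "$recent" -gt 0 && "$failed" -eq 0 ]]; then
                    curl -fsS "$PUSH_URL" > /dev/null
                  fi
```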
. If all are successful, it will using curl to GET the Uptime Kuma url. If any failed, it will not send a GET, causing the monitor to alert me via email. Uptime Kuma is hosted on PikaPods so there’s no local infrastructure dependencies. It is the same organization that runs BorgBase which I’ve been a happy customer of for many years to store off site Borg backups (locally encrypted of course).