Rebuilding the homelab: A Rocky start
Published on , 1580 words, 6 minutes to read
This content is exclusive to my patrons. If you are not a patron, please don't be the reason I need to make a process more complicated than the honor system. This will be made public in the future, once the series is finished.
My homelab consists of a few machines running NixOS. I've been put into facts and circumstances beyond my control that make me need to reconsider my life choices. I'm going to be rebuilding my homelab and documenting the process in this series of posts.
Join with me in my tale of woe as I find out how good I really had it with NixOS.
Each of these posts will contain one small part of my journey so that I can keep track of what I've tried and you can follow along at home.
State of the lab
My homelab is made up of a few machines:
- kos-mos, a small server that I use for running some CI things and periphery services. It has 32 GB of RAM and a Core i5-10600.
- ontos, identical to kos-mos but with an RTX 2060 6 GB.
- logos, identical to kos-mos but with an RTX 3060 12 GB.
- pneuma, my main shellbox and development machine. It is a hand-built tower PC with 64 GB of RAM and a Ryzen 9 5900X. It has a GPU because the 5900X doesn't have integrated graphics. It has a bunch of random storage devices in it.
- itsuki, the NAS. It has all of our media and backups on it. It runs Plex and a few other services, mostly managed by docker compose. It has 16 GB of RAM and a Core i5-10600.
- chrysalis, an old Mac Pro from 2013 that I use as my Prometheus server. It has 32 GB of RAM and a Xeon E5-1650. Coincidentally, it also runs the IRC bot in #xeserv on Libera.chat.
Of these machines, kos-mos is the easiest to deal with because it's not running much of anything useful right now. I had to move some workloads off of it for various reasons that are coming back to be useful. ontos has a bunch of models on it that it quantizes from time to time. pneuma is a very hard target to move because I am actively using it at all times.
I am not going to touch the NAS in this first phase. I am afraid of losing data and it stores more data than the rest of the storage devices in the house combined.
A rebuild of the homelab is going to be a fair bit of work. I'm going to have to take this one piece at a time and make sure that I don't lose anything important.
The plan
I'm not sure which technology/distro I'm going to end up with. For various reasons, I'm probably gonna end up with something in the RPM ecosystem. Say what you will about Red Hat, but they're probably not going anywhere any time soon.
Here's the short list of things I'm considering:
- Rocky Linux (or even Oracle Linux) with Ansible
- Something in the Universal Blue ecosystem
- Fedora CoreOS with Kubernetes
Wait, hold up. You're considering Kubernetes for your homelab? I thought you were as staunchly anti-Kubernetes as it got.
I am, but hear me out. Kubernetes gets a lot of things wrong, but it does get one thing so clearly right that it's worth celebration: you don't need to SSH into a machine to look at logs, deploy new versions of things, or see what's running. Everything is done via the API.
Things really must be bad if you're at this point...
Of these things, Rocky Linux and Ansible are probably the easiest things to start with. After running a poll, it looks like the masses want Rocky Linux. Who am I to not give the people what they want?
Rocky Linux
Rocky Linux is a fork of pre-Stream CentOS. It aims to be a 1:1 drop-in replacement for CentOS and RHEL. It's a community-driven project that is sponsored by the Rocky Enterprise Software Foundation.
For various reasons involving my HDMI cable being too short to reach the other machines, I'm gonna start with chrysalis. Rocky Linux has a GUI installer and I can hook it up to the sideways monitor that I have on my desk.
The weird part about chrysalis is that it's a Mac Pro from 2013. Macs of that vintage can boot normal EFI partitions and binaries, but they generally prefer to have your EFI partition be an HFS+ volume. This is normally not a problem because the installer will just make the EFI partition for you. However, the Rocky Linux installer doesn't do that. They ifdeffed out the macefi stuff because Red Hat ifdeffed it out.
I get that they want to be a 1:1 drop-in replacement (which means that any bug RHEL has, they have), but it is massively inconvenient in this case in particular.
I found this out when I tried to install Rocky Linux and it failed at the partitioning step with this error:
As a result, you have to do a very manual install that looks something like this, lifted from the Red Hat bug tracker:
- Boot Centos/RHEL 8 ISO Normally (I used 8.1 of each)
- Do the normal setup of network, packages, etc.
- Enter disk partitioning
- Select your disk
- At the bottom, click the "Full disk summary and boot loader" text
- Click on the disk in the list
- Click "Do not install boot loader"
- Close
- Select "Custom" (I didn't try automatic, but it probably would not create the EFI partition)
- Done in the top left to get to the partitioning screen
- Delete existing partitions if needed
- Click +
- CentOS 8: create /boot/efi mountpoint, 600M, Standard EFI partition
- RHEL 8: create /foo mountpoint, 600M, Standard EFI partition, then edit the partition to be on /boot/efi
- Click + repeatedly to create the rest of the partitions as usual (/boot, / swap, /home, etc.)
- Done
- During the install, there may be an error about the mactel package, just continue
- On reboot, both times I've let it get to the grub prompt, but there's no grub.cfg; not sure if this is required
- Boot off ISO into rescue mode
- Choose 1 to mount the system on /mnt/sysimage
- At the shell, chroot /mnt/sysimage
- Check on the files in /boot to make sure they exist: ls -l /boot/ /boot/efi/EFI/redhat (or centos)
- Run the create the grub.cfg file: grub2-mkconfig -o /boot/efi/EFI/redhat/grub.cfg
- I got a couple reload ioctl errors, but that didn't seem to hurt anything
- exit
- Next reboot should be fine, and as mentioned above it'll reboot after SELinux labelling
Yeah, no. I'm not going to do that. Another solution I found involved you manually booting the kernel from the GRUB rescue shell. I'm not going to do that either.
So, that's a wash. In the process of figuring this out I also found out that when I wiped the drive, I took down my IRC bot. I'm going to have to fix that eventually.
Ansible
As a bonus round, let's see what it would look like to manage things with Ansible on Rocky Linux, had I been able to install Rocky Linux anyways. Ansible is a Red Hat product, so I expect that it would be the easiest thing to use to manage things.
Ansible is a "best hopes" configuration management system. It doesn't really authoritatively control what is going on, it merely suggests what should be going on. As such, you influence what the system does with "tasks" like this:
    - name: Full system update
      dnf:
        name: "*"
        state: latest
This is a task that tells the system to update all of its packages with dnf. However, when I ran the linter on this, I got told I need to instead format things like this:
    - name: Full system update
      ansible.builtin.dnf:
        name: "*"
        state: latest
You need to use the fully qualified module name because you might install other collections in the future that also provide a module named dnf. This kinda makes sense at a glance, I guess, but it's probably overkill for these use cases. However, it makes the lint errors go away and it is fixed mechanically, so I guess that's fine.
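For context, snippets like the ones above are individual tasks; on their own they don't say which machines they run against. A minimal sketch of a complete playbook wrapping the linted task might look like the following. The `homelab` host group, the play name, and the use of `become` are assumptions for illustration and are not taken from the setup described in this post.

```yaml
# Hypothetical playbook: the host group and play name are made up.
- name: Baseline configuration for homelab machines
  hosts: homelab        # assumed inventory group
  become: true          # package management needs root
  tasks:
    - name: Full system update
      ansible.builtin.dnf:
        name: "*"
        state: latest
```

Saved as something like site.yml, a file of this shape would be run with `ansible-playbook -i <inventory> site.yml`, where the inventory file is whatever lists the machines in question.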
What's not fine is how you prevent Ansible from re-running the same shell command every time the playbook runs. You need to make a folder full of empty semaphore files that get touched when the command runs:
    - name: Xe's semaphore flags
      ansible.builtin.shell: mkdir -p /etc/xe/semaphores
      args:
        creates: /etc/xe/semaphores
    - name: Enable CRB repositories
      ansible.builtin.shell: |
        dnf config-manager --set-enabled crb
        touch /etc/xe/semaphores/crb
      args:
        creates: /etc/xe/semaphores/crb
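The `creates` argument is what does the work here: Ansible skips the command when that path already exists. One possible variation, sketched below, asks dnf for the current state instead of consulting a marker file. This is an untested sketch that assumes the repository id appears as `crb` in the output of `dnf repolist --enabled`. The registered variable name `enabled_repos` is made up, and the substring match is deliberately crude.

```yaml
# Sketch only: checks live repo state instead of a semaphore file.
- name: List enabled repositories
  ansible.builtin.command: dnf repolist --enabled
  register: enabled_repos   # hypothetical variable name
  changed_when: false       # read-only query, never reports a change

- name: Enable CRB repository
  ansible.builtin.command: dnf config-manager --set-enabled crb
  # Crude substring check against the repolist output captured above.
  when: "'crb' not in enabled_repos.stdout"
```

Whether two tasks and a registered variable are any nicer than a folder of semaphore files is a matter of taste.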
And then finally you can install a package:
    - name: Install EPEL repo lists
      ansible.builtin.dnf:
        name: "epel-release"
        state: present
This is about the point where I decided:
No. I'm not going to deal with this.
I haven't even created user accounts yet, I'm just trying to install a package repository so that I can install other packages.
So I'm not going with Ansible, even on the machines where installing Rocky Linux works.
Facts and circumstances may have changed since publication. Please contact me before jumping to conclusions if something seems wrong or unclear.
Tags: homelab, RockyLinux