- Build UEFI-specific iPXE chainloadable firmware
- Configure DHCP server to manage both BIOS and UEFI machines
- Add "initrd=initrd.img" to the kernel kickstart parameters
- (Disable STP for the port of the iPXE booting host)
PXE booting is cool. When you manage more than a handful of servers, you really want to automate. PXE booting lets your DHCP server tell your nodes how they should install.
The basic PXE booting stack is PXE (on the booting node), a DHCP server and TFTP (Trivial FTP). DHCP points the booting node to the TFTP server. The TFTP server contains files which tell the machine what to do. Sounds good? Meh.. it's fine. What it lacks is real automation. TFTP serves files. Files are static. This makes dynamic decisions hard.
Enter iPXE. The iPXE boot firmware is a more fully featured PXE client. For me, the coolest feature is the ability to boot over HTTP instead of TFTP. That means you can point it to a script that makes decisions on the fly. Should the machine be reinstalled? If so, with what profile? You can do pretty much anything.
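To make the "decisions on the fly" idea concrete, here is a minimal sketch of the server-side logic such an HTTP endpoint could run. The install table, the base URL and the profile path layout are all made up for illustration; only the iPXE commands in the generated text ("chain", "exit") are real iPXE script commands.

```python
# Hypothetical install table: MAC address -> kickstart profile name.
# In a real setup this would come from a database or inventory system.
REINSTALL = {"52:54:00:12:34:56": "webserver"}

# Assumed base URL of the provisioning server (illustrative only).
BASE_URL = "http://boot.example.com"


def ipxe_script(mac):
    """Return the iPXE script text to serve to the node with this MAC."""
    profile = REINSTALL.get(mac.lower())
    if profile is None:
        # Nothing scheduled for this node: exit iPXE so the firmware
        # moves on to the next boot device (typically the local disk).
        return "#!ipxe\nexit\n"
    # A reinstall is scheduled: hand over to a profile-specific script.
    return "#!ipxe\nchain {}/profiles/{}.ipxe\n".format(BASE_URL, profile)


if __name__ == "__main__":
    print(ipxe_script("52:54:00:12:34:56"))
```

One way to feed the MAC address to such an endpoint is to let iPXE expand a setting in the URL it requests, for example a query string containing ${net0/mac}. How the function is wired into a web server is left open here.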
Loading the iPXE firmware
There are a few ways to use the iPXE firmware. You can flash your network card with the iPXE firmware and use it automatically. I'm not a huge fan of this. Patching third-party stuff onto a network card isn't too fun. And what happens with new machines? You need to flash the card to be able to bootstrap the server, but to flash the card you need to bootstrap the server, but then you need to flash...
The other option is nicer. You can tell the standard PXE bootloader to load the iPXE bootloader over TFTP, and then use the iPXE bootloader to actually boot. This is called chainloading. You need some logic in your DHCP server, but it makes things easier to manage.
We got some new servers that I tried to install. Alas, they just didn't want to do anything with the iPXE bootloader. It took me a while to realize that they use UEFI, the "new" BIOS replacement. After some extensive googling (well, duckduckgoing), I found out that, of course, UEFI needs separately built (per architecture) iPXE bootloaders. It took a while to figure out what to actually build (the documentation is a bit, uhm, lacking). The magic command was
This generates UEFI capable PXE chainloadable goodness (for x86_64).
This again makes the "normal" BIOS chainloadable firmware.
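For reference, iPXE's build system names its targets by platform directory and image name. The sketch below wraps the two targets that match the builds described here: bin-x86_64-efi/ipxe.efi for x86_64 UEFI and bin/undionly.kpxe for legacy BIOS chainloading. The wrapper functions, the platform keys and the source directory path are assumptions for illustration, and the exact invocation used at the time may have differed.

```python
import subprocess

# iPXE make targets, relative to the src/ directory of the iPXE source tree.
TARGETS = {
    "uefi-x86_64": "bin-x86_64-efi/ipxe.efi",  # UEFI chainloadable image
    "bios": "bin/undionly.kpxe",               # legacy BIOS chainloadable image
}


def make_command(platform):
    """Return the make command (as an argument list) for a platform key."""
    return ["make", TARGETS[platform]]


def build(platform, src_dir="ipxe/src"):
    """Run the build; src_dir is an assumed checkout location."""
    subprocess.run(make_command(platform), cwd=src_dir, check=True)


if __name__ == "__main__":
    # Only print the commands; call build() to actually compile.
    for name in TARGETS:
        print(" ".join(make_command(name)))
```

The resulting files then go onto the TFTP server, where the stock PXE firmware can fetch them.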
So UEFI machines should load one file, and BIOS machines should load another. Must the DHCP server be preconfigured with per-node boot decisions? Apparently there is a better way. I found this nice GitHub page with the logic to manage PXE chainloading. That should take care of the logic regardless of whether the booting server uses BIOS or UEFI.
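The decision the DHCP server has to make can be sketched as a small function. This is not the configuration from the page mentioned above, just an illustration of the usual chainloading logic: a client that already runs iPXE identifies itself with the user class "iPXE" (DHCP option 77), and the client architecture (DHCP option 93) tells BIOS and UEFI apart. The file names and the script URL are assumptions.

```python
# Client architecture codes (DHCP option 93) treated as x86_64 UEFI.
# Both 7 and 9 show up in practice for 64-bit x86 UEFI firmware.
UEFI_X86_64_ARCHS = (7, 9)


def boot_filename(arch, user_class,
                  script_url="http://boot.example.com/boot.ipxe"):
    """Pick what the DHCP reply should point the client at.

    arch       -- integer from DHCP option 93 (0 means legacy BIOS)
    user_class -- string from DHCP option 77, or None if absent
    script_url -- assumed location of the iPXE boot script
    """
    if user_class == "iPXE":
        # Second pass: iPXE is already running, so break the loop and
        # hand it the real boot script over HTTP.
        return script_url
    if arch in UEFI_X86_64_ARCHS:
        # First pass on a UEFI machine: chainload the EFI build via TFTP.
        return "ipxe.efi"
    # First pass on anything else: this sketch assumes legacy BIOS.
    return "undionly.kpxe"


if __name__ == "__main__":
    print(boot_filename(0, None))
    print(boot_filename(7, None))
    print(boot_filename(7, "iPXE"))
```

The user class check matters most: without it, iPXE would be told to load iPXE again and the node would chainload forever.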
I happened to hit STP (Spanning Tree Protocol) issues too. I got iPXE loaded on the UEFI machine, but it never got a DHCP reply. Actually, according to tcpdump, it never even sent a request! Can UEFI booting really be this hard? Well, during the boot I entered the iPXE command prompt (Ctrl-B) to debug the issue. I noticed that if I waited a while, DHCP always gave me an IP. After some debugging, it turned out STP was enabled on the switch ports. This made them slow to start forwarding traffic (the DHCP requests), so DHCP timed out. Turning STP off for those ports fixed the issue.
Installing CentOS 7
Ah, so finally, I can point the machine to a URL and kickstart it with CentOS 7. So text is scrolling, things are moving forward and BAM
Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(0,0)
Uhm, okay. But sigh. Some amount of debugging later I found this
So apparently with UEFI-loaded kernels you need to explicitly give the kernel the initrd param too. I'm sure that makes sense in some world, and has a technical reason. Anyway, just load the kernel and initrd with iPXE normally, and append initrd=initrd.img to the kernel params.
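As a sketch of what the resulting boot script could look like, the function below generates an iPXE script for a CentOS 7 kickstart with the extra parameter in place. The mirror and kickstart URLs are placeholders, and the inst.repo and inst.ks installer options are one plausible choice rather than the exact parameters used here. The value after initrd= is assumed to have to match the file name of the initrd that iPXE downloaded.

```python
def centos7_script(mirror, kickstart_url):
    """Build an iPXE script that kickstarts CentOS 7 from a mirror tree."""
    pxeboot = mirror + "/images/pxeboot"
    lines = [
        "#!ipxe",
        # initrd=initrd.img is the parameter UEFI-booted kernels need;
        # the rest are ordinary installer options.
        "kernel {}/vmlinuz initrd=initrd.img inst.repo={} inst.ks={}".format(
            pxeboot, mirror, kickstart_url),
        # Load the initrd as usual; its name is what initrd= refers to.
        "initrd {}/initrd.img".format(pxeboot),
        "boot",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(centos7_script(
        "http://mirror.example.com/centos/7/os/x86_64",  # placeholder
        "http://boot.example.com/ks.cfg",                # placeholder
    ))
```

The same script also works on BIOS machines, so there is no need to keep two variants.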
Aaand done! My first UEFI iPXE chainloaded server kickstarting!