Convert UFS to ZFS

One may convert a machine's root filesystem from UFS to ZFS (without even disrupting `uptime`). While this works fine with on-site hardware, it is particularly helpful with cloud providers whose official FreeBSD images are rooted on UFS. By default the process stages the new pool on a memory disk; if the root filesystem will not fit in memory, or if you want to mirror permanently onto a second disk, change $extradisk to name that disk instead. Note that in Azure, the disk referred to as "local storage" is partitioned and newfs'd automatically at boot and is therefore unsuitable.

Apart from adjusting $extradisk and $newpool, the commands are meant to be copy-paste friendly. Even so, walk through them manually and with care; if you don't understand how to adjust them for your situation, they will probably destroy your data. If you've installed any stateful servers such as databases, stop them before doing this, because tar does not take an atomic copy.

which sudo >/dev/null && exec sudo sh || exec su -l root -c 'exec sh'
newpool="$(hostname -s)"
extradisk="$(mdconfig -a -t swap -s 1T)"
bootdisk=$( (test -e /dev/ada0 && echo ada0) || (test -e /dev/da0 && echo da0) || (test -e /dev/xbd0 && echo xbd0) || (test -e /dev/nvd0 && echo nvd0) || echo oops )
umount /dev/${extradisk}[ps]*
gpart destroy -F ${extradisk}
gpart create -s gpt -n 152 ${extradisk}
gpart add -t freebsd-boot -b 40 -s 1090 -l ${newpool}.boot1 ${extradisk}
tmpsize=$( (gpart show -lp ${bootdisk} | egrep '^=>' | (read junk offset size junk && echo $((size-1048576)) ); gpart show -lp ${extradisk} | egrep '^=>' | (read junk offset size junk && echo $((size/2-1048576)) ) ) | sort -n | head -n 1 )
gpart add -t freebsd-zfs -b 256M -s ${tmpsize} -l ${newpool}.zfs1 ${extradisk}
gpart add -t freebsd-zfs -s ${tmpsize} -l ${newpool}.zfs0 ${extradisk}
gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 ${extradisk}
kldload zfs
sysctl vfs.zfs.min_auto_ashift=12
zpool create -fo altroot=/xmnt -o autoexpand=on -O mountpoint=/ -O canmount=off -O atime=off -O compression=lz4 -O recordsize=1M -O redundant_metadata=most -O com.sun:auto-snapshot=true ${newpool} mirror /dev/gpt/${newpool}.zfs?
zfs create -o recordsize=128K ${newpool}/.
zpool set bootfs=${newpool}/. ${newpool}
zpool offline ${newpool} /dev/gpt/${newpool}.zfs0
gpart delete -i3 ${extradisk}
gpart resize -i2 ${extradisk}
zpool online -e ${newpool} /dev/gpt/${newpool}.zfs1
tar --one-file-system -C / -cpf - . | tar -C /xmnt -xpf -
rm -d /xmnt/boot/zfs/zpool.cache /xmnt/xmnt
egrep -v '^/dev/[^[:space:]]+[[:space:]]+/[[:space:]]' /etc/fstab > /xmnt/etc/fstab
echo 'zfs_load="YES"' >> /xmnt/boot/loader.conf
echo 'zfs_enable="YES"' >> /xmnt/etc/rc.conf
zpool export ${newpool}
rmdir /xmnt
kenv vfs.root.mountfrom=zfs:${newpool}/.
reboot -r
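
Two of the less obvious lines above can be tried in isolation without touching any disk. The sketch below is not part of the procedure; the function names pick_tmpsize and strip_root_entry are made up for illustration. The first repeats the tmpsize arithmetic (the smaller of "boot disk size minus 1048576" and "half the extra disk size minus 1048576", in whatever units gpart show reports), presumably so that the temporary mirror never starts out larger than the boot disk can hold later. The second applies the same pattern as the egrep -v on /etc/fstab, which drops any /dev-backed entry mounted on / and keeps everything else.

```shell
#!/bin/sh
# Illustrative helpers only; the names are hypothetical and nothing here
# touches a disk or a pool.

# pick_tmpsize BOOT_SIZE EXTRA_SIZE
# Print the smaller of (BOOT_SIZE - 1048576) and (EXTRA_SIZE / 2 - 1048576),
# the same arithmetic the tmpsize line performs on gpart show output.
pick_tmpsize() {
    boot_size=$1
    extra_size=$2
    printf '%s\n' "$((boot_size - 1048576))" "$((extra_size / 2 - 1048576))" |
        sort -n | head -n 1
}

# strip_root_entry
# Read fstab text on stdin and drop lines where a /dev/... device is
# mounted on "/", using the same pattern as the egrep -v in the script.
strip_root_entry() {
    grep -Ev '^/dev/[^[:space:]]+[[:space:]]+/[[:space:]]'
}
```

For example, with made-up sizes of 62914560 for the boot disk and 2147483648 for the extra disk, pick_tmpsize picks the boot-disk-derived value, since it is the smaller of the two candidates.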

Up to this point, had something gone awry, you need only reboot to put things back (unless you clobbered the wrong device). The "reboot -r" command merely moves the root filesystem to the new zpool while leaving the real boot disk intact. If you reboot now (by hardware or with the plain "reboot" command), the machine will return to the filesystem on the original boot disk. When you're happy that everything is as it should be, use the following to overwrite the previous boot disk as a mirrored member of the new zpool.

which sudo >/dev/null && exec sudo sh || exec su -l root -c 'exec sh'
newpool=$(hostname -s)
bootdisk=$( (test -e /dev/ada0 && echo ada0) || (test -e /dev/da0 && echo da0) || (test -e /dev/xbd0 && echo xbd0) || (test -e /dev/nvd0 && echo nvd0) || echo oops )
gpart destroy -F ${bootdisk}
gpart create -s gpt -n 152 ${bootdisk}
gpart add -t freebsd-boot -b 40 -s 1090 -l ${newpool}.boot0 ${bootdisk}
gpart add -t freebsd-zfs -b 256M -l ${newpool}.zfs0 ${bootdisk}
gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 ${bootdisk}
zpool replace ${newpool} /dev/gpt/${newpool}.zfs0
zpool detach ${newpool} /dev/gpt/${newpool}.zfs0/old
zpool online -e ${newpool} /dev/gpt/${newpool}.zfs0
zfs set refquota=8G ${newpool}/.
zfs set refreservation=8G ${newpool}/.
while true; do zpool status ${newpool} | egrep ', .+% done' || break; sleep 1; done
# If $extradisk was an md, then run zpool detach ${newpool} /dev/gpt/${newpool}.zfs1; mdconfig -d -u md0
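
The wait loop near the end of the commands above keeps polling zpool status for as long as a line containing ", N% done" is present, which is how an in-progress resilver shows up. A minimal sketch of the same idea as reusable functions follows; the names resilver_running and wait_for_resilver are hypothetical, and the only assumption about zpool status is the ", ...% done" text the original loop already matches.

```shell
#!/bin/sh
# Illustrative rewrite of the wait loop; function names are hypothetical.

# resilver_running
# Succeed if the status text on stdin still contains a ", <n>% done"
# progress fragment (the same pattern the original loop greps for).
resilver_running() {
    grep -Eq ', .+% done'
}

# wait_for_resilver POOL [INTERVAL_SECONDS]
# Poll zpool status until no progress line remains.
wait_for_resilver() {
    pool=$1
    interval=${2:-1}
    while zpool status "$pool" | resilver_running; do
        sleep "$interval"
    done
}
```

Usage would be along the lines of wait_for_resilver "${newpool}" before detaching the memory disk. Unlike the original loop, this version does not echo the progress line on each pass.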

Notes

While it's spiffy that this doesn't reboot, it may be wise to reboot afterward anyway to be sure your VM will in fact reboot correctly. However, this would merely be a test, so if your primary objective is to win silly uptime games, then don't.
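
Whether after "reboot -r" or after a full test reboot, one quick sanity check is to confirm that / really is mounted from ZFS. The sketch below assumes the fstab(5)-style output of FreeBSD's mount -p (device, mount point, type, options); the function name root_fstype is hypothetical.

```shell
#!/bin/sh
# Illustrative check; root_fstype is a hypothetical helper name.

# root_fstype
# Read fstab(5)-style lines on stdin (for example from `mount -p`) and
# print the filesystem type of the first entry mounted on "/".
root_fstype() {
    awk '$2 == "/" { print $3; exit }'
}

# Typical use on the converted machine (not run here):
#   [ "$(mount -p | root_fstype)" = "zfs" ] || echo "root is not on ZFS" >&2
```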

I hear one should set host caching off. I would like to know why.

Regarding su/sudo sh, Amazon's image requires su while Azure's image requires sudo sh. Arranging as shown allows the same "script" to work in either cloud.

One can't simply read fstab to figure out the boot disk. At least on AWS and Azure, it lists references such as /dev/gpt/rootfs instead of actual disks. The Azure VMs I used wanted da0, and the Amazon VMs wanted ada0 (which was remapped from xbd0). Taking the first of ada0, da0, xbd0, or nvd0 that exists is quick and dirty, but easy to adjust. More sophistication is beyond the scope of this page.
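
The "first device that exists" approach can be written as a small function, which also makes the failure case explicit instead of leaving bootdisk set to "oops". This is only a sketch: first_disk is a hypothetical name, and taking the device directory as a parameter is an addition so the function can be tried against a scratch directory.

```shell
#!/bin/sh
# Illustrative helper; first_disk is a hypothetical name.

# first_disk DEVDIR NAME...
# Print the first NAME that exists under DEVDIR; return 1 if none do.
first_disk() {
    devdir=$1
    shift
    for name in "$@"; do
        if [ -e "$devdir/$name" ]; then
            printf '%s\n' "$name"
            return 0
        fi
    done
    return 1
}

# Typical use, with the same candidate order as the commands on this page:
#   bootdisk=$(first_disk /dev ada0 da0 xbd0 nvd0) || echo "no boot disk found" >&2
```

To prefer a different device, reorder or extend the candidate list rather than editing the function.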