The more I read about containers, the clearer it is that btrfs
is the future filesystem for DevOps work. This is because btrfs'
copy-on-write behavior means that provisioning the disk for a new
container is just a 'btrfs subvolume snapshot' operation under the
hood, and only the blocks that I change use additional space. I've
been working more with LXD, which enables this behavior when it
detects that /var/lib/lxd is on a btrfs filesystem.
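As a rough sketch of what that provisioning amounts to, assume a hypothetical btrfs pool mounted at /mnt/pool that holds a base subvolume named base (these paths are made up for illustration and are not LXD's real layout). The script only prints the commands unless RUN=1 is set, since they need root and a real btrfs filesystem.

```shell
#!/bin/sh
# Hypothetical paths for illustration; LXD manages its own layout.
POOL=${POOL:-/mnt/pool}
BASE="$POOL/base"
NEW="$POOL/web01"

# Print each command; execute it only when RUN=1 is set.
run() {
    echo "+ $*"
    if [ "${RUN:-0}" = 1 ]; then "$@"; fi
}

# A snapshot shares every block with its source until one side changes it.
run btrfs subvolume snapshot "$BASE" "$NEW"
# List the subvolumes in the pool, including the new snapshot.
run btrfs subvolume list "$POOL"
```

The snapshot returns almost immediately regardless of how large the base subvolume is, which is what makes it attractive for spinning up containers.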
I've read some articles, but it's time to take the plunge. The
candidate system for this is my work laptop, since I like to run some
local containers. Also, unlike my personal laptop, the disk isn't
encrypted, so there's a layer I don't have to reason about.
The first time I considered this, I planned how to do it the old
painful way: Install a new OS onto btrfs on a free partition, then
over time re-install everything and migrate data from the old
partition as needed. Then when I was satisfied that I had everything
I needed from the old install, I could expand onto the old partition
using btrfs' RAID 0 capability.
The thing is, btrfs provides a new way: the btrfs-convert utility
can turn an ext2/3/4 filesystem into a btrfs filesystem in place.
This works because btrfs keeps very little metadata at fixed
locations, so it can write its own metadata into the free space of
the old filesystem and leave the existing data blocks where they
are. The tool even has a --rollback flag if you want to go back to
ext2/3/4 after the conversion. I'm committed enough to my current
setup that this conversion seems like the best way forward.
Prep work is done
The target machine already has a btrfs-aware kernel, which I know
from some toy filesystems created in loopback files.
GRUB 2 is installed. I expect some futzing will be required to
make GRUB understand btrfs, but I've configured GRUB modules before
(for LVM I think).
I have System Rescue CD on a thumb drive (at all times, on my key
ring!). I also have an external USB drive with plenty of free space.
The plan is to boot the machine with the rescue distro, then back
up the whole disk to the external USB drive using dd. I'll run
btrfs-convert, then adjust my fstab and bootloader as needed.
I need the machine for work tomorrow, so I have to be fully
rolled-forward or fully rolled-back by morning. Let's go!
Insurance
I boot into System Rescue CD and attach the external drive. One
partition has 1.5T free, more than enough for my dd operation.
mount /dev/sdc1 /mnt/gentoo
cd /mnt/gentoo
dd if=/dev/sda of=system76bak.img
In another terminal (Alt-F2) I ran
watch ls -hl /mnt/gentoo/system76bak.img
to watch the copy grow. It's also possible to periodically run
kill -USR1 2431
(the last argument is the PID of the dd process) to make dd print
its progress. The copy took two and a half hours for 250G on eight
cores, but it was worth every minute for the peace of mind.
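A small sketch of the same backup idea on a scratch file, so it can be tried without touching a real disk. Two things here are my assumptions rather than what I ran: dd defaults to 512-byte blocks, so a larger bs usually speeds up a whole-disk copy, and newer GNU dd builds accept status=progress (the script only uses it if dd advertises it). The file names are stand-ins for /dev/sda and the image on the USB drive.

```shell
#!/bin/sh
# Scratch directory standing in for the source disk and the external drive.
work=$(mktemp -d)
src="$work/disk.img"
bak="$work/backup.img"

# Stand-in for /dev/sda: 8 MiB of random data.
dd if=/dev/urandom of="$src" bs=1M count=8 2>/dev/null

# Use status=progress only if this dd build mentions it in its help text.
progress=""
if dd --help 2>&1 | grep -q progress; then progress="status=progress"; fi

# Copy with a 4 MiB block size instead of the 512-byte default.
dd if="$src" of="$bak" bs=4M $progress

# Verify the image byte for byte before trusting it.
if cmp -s "$src" "$bak"; then result="backup verified"; else result="backup differs"; fi
echo "$result"
rm -rf "$work"
```

On the real disk the cmp step would mean reading all 250G again, so it is a trade of time against certainty.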
Conversion
System Rescue CD has btrfs-progs version 3.18.2. This is good
enough for me, since 3.14.2 is the latest in Gentoo's stable branch.
I just didn't want to use a super-old btrfs-convert, and I'm satisfied
I won't.
It was as simple as
btrfs-convert /dev/sda2
I'm a little concerned that the original filesystem was at 93%
capacity. I can clear off a few gigs here and there if I need to, but
a nearly full disk could well be a dealbreaker for an operation like
this. I'm counting on the rollback flag in that case. It would help
if deduplication were part of this process.
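A pre-flight check along these lines would have made the space question explicit before I hit enter. This is only a sketch: the 95% limit is an arbitrary number I picked for illustration, not a documented btrfs-convert requirement, and the path argument should be wherever the ext filesystem is mounted.

```shell
#!/bin/sh
# Print the used-space percentage (without the % sign) of the filesystem holding $1.
usage_pct() {
    df -P "$1" | awk 'NR==2 { sub(/%/, "", $5); print $5 }'
}

# Hypothetical threshold; adjust to taste.
LIMIT=${LIMIT:-95}
pct=$(usage_pct "${1:-/}")
if [ "$pct" -ge "$LIMIT" ]; then
    echo "warning: ${pct}% used, free some space before converting"
else
    echo "ok: ${pct}% used"
fi
```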
Of course I only saw the -p (show progress) flag after hitting
enter. This could take hours, yes? I'll never know now.
Full output:
root@sysresccd /root % btrfs-convert /dev/sda2
creating btrfs metadata.
creating ext2fs image file.
cleaning up system chunk.
conversion complete.
Observations:
- Done in 32 minutes without error!
- The old filesystem has been made available at /ext2_saved/image (178G)
- The new filesystem shows 94% full now (was 93% before the conversion). That's a remarkably efficient operation.
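In hindsight, the conversion step could be wrapped as below. This is a sketch under my own assumptions: running e2fsck first is a precaution I did not take above, -p is the progress flag I noticed too late, and -r is the short form of the rollback flag. Rollback only works while the saved ext image is still intact. The functions refuse to run on anything that is not a block device.

```shell
#!/bin/sh
# Refuse to touch anything that is not a block device.
require_blockdev() {
    [ -b "$1" ] || { echo "not a block device: $1" >&2; return 1; }
}

# Check the ext filesystem first, then convert with a progress indicator.
convert() {
    require_blockdev "$1" || return 1
    e2fsck -f "$1" && btrfs-convert -p "$1"
}

# Undo the conversion; only possible while the saved ext image still exists.
rollback() {
    require_blockdev "$1" || return 1
    btrfs-convert -r "$1"
}

# Example (unmounted filesystem, run as root): convert /dev/sda2
```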
Make it bootable
First I'll fix /etc/fstab, the easy part, right? blkid
now gives me a UUID="..." and a UUID_SUB="...". I suspect either
would work for mounting, but let's confirm this on the web. Both the
Arch and Gentoo wikis use UUID (UUID_SUB identifies a single device
within the filesystem, not a subvolume), and both remind me that the
last field should be 0 to disable fsck on boot.
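For illustration, the resulting fstab line can be generated like this. The all-zero UUID is a placeholder, and "defaults" stands in for whatever mount options you actually want; only the UUID= form and the trailing 0 come from the wikis mentioned above.

```shell
#!/bin/sh
# Build an fstab line for a btrfs root filesystem from its UUID.
# Fields: device, mount point, type, options, dump, fsck pass (0 = never fsck).
fstab_line() {
    printf 'UUID=%s\t/\tbtrfs\tdefaults\t0\t0\n' "$1"
}

# Placeholder UUID; on the real system it would come from:
#   blkid -s UUID -o value /dev/sda2
fstab_line "00000000-0000-0000-0000-000000000000"
```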
Now let's update GRUB. We'll do it from within the installed system, so chroot in the normal way.
mount --rbind /dev /mnt/gentoo/dev
mount --rbind /proc /mnt/gentoo/proc
chroot /mnt/gentoo /bin/bash
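For completeness, here is a fuller sketch of the chroot sequence as one function. It assumes the converted root is /dev/sda2 and that nothing else (such as the backup drive) is still mounted on /mnt/gentoo. The /sys bind mount is an addition of mine in the style of the Gentoo handbook; I have not verified whether it changes any of the GRUB messages described below.

```shell
#!/bin/sh
# Mount the converted root and enter it. Arguments: device, optional mount point.
enter_chroot() {
    dev=$1
    target=${2:-/mnt/gentoo}
    [ -b "$dev" ] || { echo "not a block device: $dev" >&2; return 1; }
    mount "$dev" "$target" || return 1
    # Expose the kernel filesystems so tools inside the chroot can see devices.
    mount --rbind /dev "$target/dev"
    mount --rbind /sys "$target/sys"
    mount -t proc proc "$target/proc"
    chroot "$target" /bin/bash
}

# Example (as root, from the rescue system): enter_chroot /dev/sda2
```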
Re-install GRUB:
grub2-install --modules=btrfs /dev/sda
There were some "device node not found" messages, but at the end it
claimed to encounter no errors.
Finally, let's make sure the grub.cfg has the UUID for the correct volume:
grub2-mkconfig -o /boot/grub/grub.cfg
Again, more "device node not found" messages. But let's take our
chances and reboot, since it might just work now.
And, voilà! It booted straight into the converted filesystem on
the first try. I had to check the output of "mount" to be sure that
it was really running on btrfs. Well done, btrfs devs, on making the
conversion as intuitive and painless as possible!
Postscript
The following day at the office, the system twice hung on disk I/O
to the point of requiring a hard reboot. I could, for instance, type
in a terminal until I did something that required a disk read
(e.g. attempt a tab completion), and then that terminal hung, until
eventually they all had.
I had the discard mount option enabled, and have disabled it since
reading some warnings about the discard action fully blocking the
disk, which sounds a lot like the hang I was experiencing. I'll have
to do manual TRIMs. I've also enabled the ssd mount option (which is
unrelated to discard). Let's hope for no more hangs!
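To double-check that discard really is gone after editing fstab and remounting, something like the following works. It is a sketch that assumes a Linux-style mounts table such as /proc/mounts; the function name is my own, and fstrim (from util-linux) is the manual TRIM I have in mind.

```shell
#!/bin/sh
# Succeed if mount point $2 carries option $3 in the mounts table $1.
has_mount_option() {
    awk -v mp="$2" -v opt="$3" '
        $2 == mp {
            n = split($4, opts, ",")
            for (i = 1; i <= n; i++) if (opts[i] == opt) found = 1
        }
        END { exit (found ? 0 : 1) }
    ' "$1"
}

if has_mount_option /proc/mounts / discard; then
    echo "discard is still active on /"
else
    echo "discard is off; trim manually, e.g. with: fstrim -v /"
fi
```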
Somewhat unrelated, a co-worker recommended ncdu, that is "ncurses du", for
the problem of quickly identifying what's using all your disk space.
I later freed up over 100G - had I known it was so easy, I'd have done
it before my dd backup.