For the past few months, I've been trying to make it work (mostly because of the raid + snapshotting functionality), but I have run into nothing but trouble.
Frequent hangs of my system (unrelated to the defrag process that runs daily, which also uses 100% CPU for minutes; the best I have been able to deduce is that the cause is either using raid+nvme or having a lot of RAM (>100GB)), Docker is very unstable using btrfs [2], etc.
I moved back to ext4 + mdraid just last week and couldn't be happier.
I've been using it for years, including with Docker, and haven't had any problems. There are people successfully using Btrfs with tens of thousands of containers, e.g.
Why are you using defrag daily? Why aren't you using the autodefrag mount option instead if you really need such frequent defragging?
Really anytime there's a problem in the kernel, you need to try the workload with a mainline kernel and report the problem to the upstream kernel list if you can reproduce it. If you can't reproduce the problem with mainline, then you have to take up the bug with your distro. That's the way it is with everything, not just Btrfs.
Ergo, I think asking for technical help about Btrfs on serverfault or stack exchange or even HN is weird. People having Btrfs problems need to go directly to the upstream list:
Even weirder is that the serverfault user has SLES! He has a support contract with SUSE, so why post on serverfault? It just makes zero sense to me to do that...
As for the github link, the OP was asked for more information because it sounded like it was not a Docker problem at all, and there was no follow-up response.
> Really anytime there's a problem in the kernel, you need to try the workload with a mainline kernel and report the problem to the upstream kernel list if you can reproduce it. If you can't reproduce the problem with mainline, then you have to take up the bug with your distro. That's the way it is with everything, not just Btrfs.
Sure, if I have a bug with program X that's patched by distro Y, that's the normal support path.
I think there are a couple of issues with applying that same logic to a filesystem that's supposedly stable enough to be used as a root FS, though.
First, a filesystem should generally be stable enough that patches applied by a distro don't completely ruin it. How many bugs have there been in ext4 or xfs that were specific to one distro and their kernel patches? They're certainly possible, but I would think that pre-release testing would catch the vast majority of them. Red Hat dropping support of btrfs was a big vote of no confidence here, because it's not just lack of support for RHEL users, it implies lack of testing efforts even for CentOS & Fedora users.
Second, if I'm having issues with my root filesystem, it's a bit of a crapshoot as to whether the system is stable enough to compile a mainline kernel and try to reproduce the bug while running that.
And finally, I simply don't want bugs in my root filesystem. I don't even want to get to the point of pondering if I should send the bug report to my distro or to the mainline kernel. I want my root filesystem to be a thing that Just Works, without having to think about it.
> First, a filesystem should generally be stable enough that patches applied by a distro don't completely ruin it.
File systems are sufficiently complicated that only developers with expertise in a particular file system will be applying patches. Red Hat has device-mapper, LVM, ext4 and XFS developers with such expertise, but not Btrfs developers. That's the reason why they dropped it.
> it implies lack of testing efforts even for CentOS & Fedora users
I can't parse that.
> And finally, I simply don't want bugs in my root filesystem
This is both naive and a reasonable request. It's naive in that they all have bugs: users find them, report them, and they get fixed. Happens all the time. It's also reasonable to pick a file system you think will have the fewest problems for your use case, if you're not interested in being a bug reporter.
I've been using Arch Linux mostly, with the 4.18 and more recently 4.19 kernels. As explained in another post, I've done extensive trial & error to isolate the issue.
I understand that it's just a single data point, and other people have more positive experiences. My intention was not to find a solution here on HN, but OP asked for experience reports so I gave mine.
Using Arch, and such recent kernels, and having isolated the issue, you should report it upstream. Otherwise it sounds like you have the time and preference to report negative experiences on HN rather than see the problem get fixed.
Just look at the btrfs commits in each new Linux version - the critical fixes never stop. And most of them are direct corner case fixes without systematic cleanup. The code seems to consist of corner cases without a robust framework to perform difficult operations.
This is not the case. I've done a lot of trial & error, and tried Ubuntu 18.04 / 18.10, Mint and Arch. The problems persist everywhere.
To the best of my knowledge, it started when I upgraded from 32GB RAM to 128GB RAM, and/or started using virtual machines more intensively after that. I tried tuning btrfs in various ways, disabling CoW, enabling/disabling autodefrag. The problems start to re-appear and intensify after about one week of using a fresh install.
All I know is that it's definitely not something simple like "use the latest kernel".
[1] https://serverfault.com/questions/747366/btrfs-write-operati...
[2] https://github.com/moby/moby/issues/34501