[plug] Filesystems for lots of inodes

Brad Campbell brad at fnarfbargle.com
Sat Jan 4 12:40:39 AWST 2020


G'day All,

I have a little backup machine that has a 4TB drive attached.

Every night it logs into all my machines and does a rotating hardlink 
rsync to back them up.

Currently there are about 36 directories and each of those has ~105 
hardlinked backups.

This is functioning well and has done since I put it in place back in 
2016. When the drive gets below 50G free, the script goes and prunes the 
first backup from each directory and keeps doing that until it has more 
than 50G free.
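The prune pass described above can be sketched as a small loop. The 50G threshold is from the post; the layout (one subdirectory per machine, snapshot names that sort oldest-first) is an assumption:

```shell
#!/bin/sh
# Prune the oldest snapshot from each machine's directory, repeating
# until the backup volume has more than 50G free. Sketch only: paths
# are hypothetical and snapshot names are assumed to sort by age.
BACKUP=/backup
THRESHOLD_KB=$((50 * 1024 * 1024))   # 50G in 1K df blocks

free_kb() { df -P "$BACKUP" | awk 'NR==2 {print $4}'; }

while [ "$(free_kb)" -le "$THRESHOLD_KB" ]; do
    for dir in "$BACKUP"/*/; do
        oldest=$(ls "$dir" | head -n 1)    # oldest sorts first
        [ -n "$oldest" ] && rm -rf "${dir}${oldest}"
    done
done
```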

The problem (such as it is) is that the drive is now heavily 
fragmented. Quite seriously so, it would appear.

I've occasionally run a du -hcs * on the main directory, and it can take 
12-18 hours to actually complete.

I see a couple of solutions.
- I can dump/restore the filesystem to defragment it. That'll buy me 
another couple of years to figure out a new solution.
- If I'm doing a dump/restore, I could back it up and restore it onto a 
new filesystem.
- I can dump/restore onto a 4TB SSD. They're still a bit spendy, but in 
the scheme of things $800 is doable.

So, I've dumped the drive onto a file on my main server and mounted that 
loopback. I've set up a temporary RAID5 to do investigation on with the 
following 3 tests :
1) A bit for bit copy of the FS and time du -hcs *. I started that at 
midnight last night and it's only half way through.
2) A dump/restore to a clean ext4 and repeat the time du -hcs *
3) A copy onto a clean xfs and repeat the time du -hcs *
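One detail that matters for test 3: the copy onto the clean xfs has to preserve hardlinks (e.g. rsync -H, or GNU cp -a), otherwise each of those hardlinked inodes expands into a full independent file. A quick coreutils-only demonstration:

```shell
#!/bin/sh
# Show that GNU cp -a recreates hardlinks that exist within the
# tree being copied (temp dirs; nothing here touches real backups).
set -e
WORK=$(mktemp -d)
mkdir "$WORK/src"
echo data > "$WORK/src/a"
ln "$WORK/src/a" "$WORK/src/b"    # a and b share one inode

cp -a "$WORK/src" "$WORK/copy"
ls -i "$WORK/copy"                # copies still share one inode
```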

I've got exactly zero experience with xfs, but most of the digging I've 
done seems to indicate it might be better than ext4 in the long run.

This is an absolutely pathological use case: probably as much of the 
storage is consumed by inode and directory metadata as by actual data 
(there are currently 244 million inodes in use).

Given the rotation strategy, that would indicate we remove and add about 
2.4 million inodes every week or so, which over 182 weeks would explain 
the severity of the fragmentation.
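The arithmetic behind that estimate, from the numbers above (244 million inodes spread over ~105 retained snapshots per machine):

```shell
# Each pruned (or added) snapshot generation is roughly 1/105th of
# the total inode count:
echo $((244000000 / 105))    # 2323809, i.e. the ~2.4 million quoted
```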

The reality is the backups haven't actually slowed down much. A cp -al 
takes about the same time it always has. The rm to prune the backups is 
taking a bit longer, but any operation requiring real traversal of the 
filesystem (such as a du or rsync replication of the whole drive) is 
painful.

So, does anyone have any experience with moderate sized filesystems that 
are almost entirely comprised of hardlinks and how well they perform 
over time with fragmentation?

Brad
-- 
An expert is a person who has found out by his own painful
experience all the mistakes that one can make in a very
narrow field. - Niels Bohr

