Re: filesystem full showing -29G
- In reply to: Matthias Apitz : "filesystem full showing -29G"
Date: Sat, 26 Oct 2024 08:14:55 UTC
On 10/25/24 23:12, Matthias Apitz wrote:
> Hello,
>
> I've a bunch of external USB disks to where I copy every month the
> following three files, last on September 20:
>
> -r-------- 1 root wheel 70G 20 sept. 10:34 guru-20240920.tar.gz
> -r--r----- 1 root wheel 62B 20 sept. 10:44 guru-20240920.tar.gz.md5
> -r--r----- 1 root wheel 59M 20 sept. 10:52 guru-20240920.tar.gz.lst
>
> The suffixes say what they contain, esp. the MD5 hash of the tar
> archive.
>
> The yesterday's copy ended up with no space left on device. The curious
> thing is that it shows -29G:

That should normally only happen if the transfer was done as root, or if the
reserve limits were adjusted after the transfer. Unless the transfer itself
reported an error, you can expect it went through all the way. You will need
to delete data if you need the space back.

> # df -kh /mnt/backups/
> Filesystem Size Used Avail Capacity Mounted on
> /dev/da0p4 652G 629G -29G 105% /mnt
>
> How is this possible?

I presume this is UFS, though people also use other choices like exFAT, NTFS,
and even ZFS on external drives. Knowing which filesystem it is would help in
knowing what properties it has.

Filesystems usually have reserved free space. Performing activities as root
will normally let you exceed that reserve, but even then it's common that a
little room is kept back for the filesystem's own housekeeping (see the check
further down). Filesystems like ZFS also need free space to write to when a
delete takes place; if ZFS becomes completely full, it can be difficult or
even impossible to get it unstuck unless the pool can be grown (larger
partition/disk). Even root is not supposed to be allowed to consume that
final sliver of free space. Following OpenZFS bug reports shows that users
still sometimes reach such a condition, and detailed bug reports are
important when it occurs. Separately, ZFS in particular can give unexpected
results with external tools that are not ZFS-aware, like df; they make
assumptions that are not true on ZFS, such as the size of the filesystem
never changing.

> I will later re-calculate the MD5 hash of the last tar archive guru-20240920.tar.gz
> and compare it with what is stored in guru-20240920.tar.gz.md5

If you are using MD5 to verify against corruption, gzip already contains a
CRC32 checksum, though a second check with a different algorithm further
limits the chance of different data producing the same hash. If the goal is
to detect tampering, there are more secure algorithms you might consider in
place of MD5, or you could use public/private key signing.

If you still have the source, you can compare the source and destination
sizes with `ls -l` without the -h parameter to get an exact byte count. You
could also use a tool like jdupes to compare source vs. destination byte for
byte. (Example commands are further down.)

Some separate considerations to make backups and restores faster while using
fewer resources:

Unless you need compatibility with older systems that have limited archive
format support, you could consider a compressor that makes smaller archives
more quickly. With gzip you may be CPU-bottlenecked while writing the
compressed archive. During a copy, your USB drives are limited by either the
drive speed or the USB speed for reading and writing, so any additional
compression ratio means data moves to and from the drives faster. zstd will
usually compress data smaller while using less CPU, and extraction should
run much faster.
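To answer the "how is this possible" more concretely: if this is UFS with the
default 8% reserve, the numbers line up fairly well. 8% of 652G is roughly
52G held back; 652G minus 629G used leaves about 23G truly free, and
23G - 52G is the -29G that df reports as Avail, i.e. root wrote about 29G
into the reserve. You should be able to confirm what the filesystem is set to
with something like this (device name taken from your df output):

  # print the current UFS tunables, including
  # "minimum percentage of free space" (-m)
  tunefs -p /dev/da0p4

  # dumpfs should show the same value and works on a mounted filesystem
  dumpfs /dev/da0p4 | grep minfree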
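For the MD5 and size checks mentioned above, roughly the following; the file
names are taken from your listing and the directory is an assumption:

  cd /mnt/backups
  gzip -t guru-20240920.tar.gz    # gzip's built-in CRC32 check
  md5 guru-20240920.tar.gz        # compare with guru-20240920.tar.gz.md5
  ls -l guru-20240920.tar.gz      # exact byte count, compare to the source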
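As for the zstd suggestion, a rough sketch of what a run could look like;
/path/to/data is just a placeholder for whatever currently goes into the tar:

  # -T0 uses all CPU cores; raise the level (e.g. -19) for a better ratio
  tar -cf - /path/to/data | zstd -T0 -o guru-20240920.tar.zst

  # verify and extract later
  zstd -t guru-20240920.tar.zst
  zstd -dc guru-20240920.tar.zst | tar -xf -

FreeBSD's tar should also accept --zstd directly if you prefer to skip the
pipe.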
If these archives are created and copied regularly and hold mostly the same
data with only incremental changes, you could evaluate whether an archival
tool that supports incremental archiving, like zpaqfranz, is better suited.
Such tools can store and transfer only what has changed, keep multiple
revisions without taking the full space for each copy, and deduplicate data
when multiple copies of the same file end up in the archive.

If you use ZFS for both the original and the backup, you could use ZFS
replication instead, which has several differences (a minimal sketch is at
the end of this message). You can perform incremental transfers, and only
the changed blocks within the filesystem will be read and transferred.
ZFS-compressed files can be transferred without decompressing/recompressing,
or you can recompress to make them smaller on the destination; a restore
then has less data to read from disk, so it runs faster, and it can likewise
be sent back with or without recompressing, so your source disk can also get
the additional space savings.

Higher compression likely requires a bit more RAM when files are read, but
on the other hand the ZFS ARC (= RAM cache) holds the compressed version of
the files, so your cache can hold more file data. ZFS with zstd compression
benefits from one thread per block, which gives multithreaded decompression,
something the standalone zstd program doesn't offer yet. If you receive the
stream into a ZFS pool instead of storing it in a file, you will have quick
access to any file in the backup without having to decompress/extract more
than what you need. Unfortunately, compressing each ZFS record (usually
128k) separately tends to perform worse than compressing files as a whole
with a comparable compressor, and archiving multiple (preferably similar)
files together gives even better results.

> Thanks
>
> matthias
>
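For the ZFS replication route mentioned above, a minimal sketch; the pool and
dataset names (tank/guru, backup/guru) and the snapshot names are made up,
adjust them to your setup:

  # one-time full copy to the pool on the USB disk
  zfs snapshot tank/guru@2024-09-20
  zfs send tank/guru@2024-09-20 | zfs receive backup/guru

  # next month: only blocks changed since the previous snapshot are sent
  zfs snapshot tank/guru@2024-10-25
  zfs send -i @2024-09-20 tank/guru@2024-10-25 | zfs receive backup/guru

Adding -c to zfs send keeps already-compressed blocks compressed on the wire
instead of decompressing them first, and you may need zfs receive -F if the
backup copy gets mounted and touched between runs.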