Re: What is the best way to look for a lost file in the disk blocks
Date: Wed, 10 Aug 2022 18:05:10 UTC
On 10 August 2022 5:26:27 pm AEST, Matthias Apitz <guru@unixarea.de> wrote:
> On Wednesday, August 10, 2022 at 07:18:03 a.m. +0200, Michael
> Schuster wrote:
>
> > On Wed, Aug 10, 2022 at 3:55 AM David Christensen
> > <dpchrist@holgerdanske.com> wrote:
> > >
> > > On 8/9/22 05:23, Matthias Apitz wrote:
> > > >
> > > > Hello,
> > > >
> > > > Last night I damaged a plain UTF-8 HTML file (I copied by accident a
> > > > JPEG file over it) and it turned out that the backup was done a month
> > > > ago. I learned my lesson from this re/ doing backups more often of files
> > > > I'm working on...
>
> Thanks for the hints.
>
> The file in question is my diary, written in Spanish and every day
> is headed by a line like
>
> <dt><b>Viernes, 29 de julio de 2022 </b>
>
> So I wrote a 35 line C program reading any 1024 byte block from the
> device, terminating it with '\0' to make sure that a
>
> char *p = strstr(block, " de 2022 </b>");
>
> would not fail, and with p != NULL I printed with printf(p-16);
> the diary entry; and the current block number to be used in dd(1) later.
> It finds all the lines of this year, but not the missing ones between
> July 10 and August 1 :-(
> So the blocks have been lost. I was hoping that UFS puts them back to
> free block chains for later use, but it seems that
> the 'cp picture.jpg diary.html' directly overwrote the used blocks.
>
> Lesson learned. I'm attaching the C-pgm, maybe someone can use it or
> at least its idea.
>
> matthias

"Necessity is the mother of Invention" alright. A neat solution.

Could any other files written since have reused those blocks? I'm a little surprised if the cp did that ...
FWIW, I was about to offer a different method that came from my own need: finding a small but rare string in the 12.3-RELEASE dvd1.iso to be replaced, so that the 2+GiB of included packages may be installed (after 3 patches to bsdconfig, but that's another story). I'll share it, as it could be used on each (say) 10MiB block dd'd from a disk or partition as well.

play.iso is a copy of the 4.1GiB dvd1.iso

<code>
smithi@t430s:/home/dvds % strings -an7 -td play.iso | grep -i2 'pkg.txz'
2442269512 sod.J{++I
2442271727 %R:*lAS
2442277052 PKG.TXZ;1PX,
2442277146 pkg-1.17.2.txzNM
2442277165 pkg.txz
2442278912 version = 2;
2442278925 packing_format = "txz";
--
4377882256 Signature type %s is not supported for bootstrapping.
4377882310 %s/%s.pubkeysig.XXXXXX
4377882333 pkg.txz
4377882341 Invalid configuration format, ignoring the configuration fi
4377882420 Consider changing PACKAGESITE or installing it from ports:
4377882498 REPOS_DIR
4377882508 asprintf
4377882517 Path to pkg.txz required
4377882543 %s/trusted
4377882556 A pre-built version of pkg could not be found for your syst
--
4466242378 pistrings
4466242388 pkg.conf
4466242397 pkg.txz
4466242410 plasma_saver
4466242423 plasma_saver.ko
</code>

The numbers are byte offsets into the .iso file (-td prints them in decimal). -n7 sets the minimum string length to 7, the length of the string I was after; increase it if hunting a longer string.

Something to consider, in the general case though probably not yours, is that the desired string/s might be split over adjacent blocks, requiring some overlap of perhaps a few kB.

cheers, Ian
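
For what it's worth, the overlap idea can be sketched in a few lines. This is only a rough sketch in Python, not Matthias's C program: the function name find_in_chunks and the chunk size are made up for illustration, and it assumes the data is read from an image file made with dd rather than from the raw device. It carries the last len(needle) - 1 bytes of each chunk over to the next one, so a match that straddles a chunk boundary is still found, and it reports absolute byte offsets like strings -td does.

```python
import io


def find_in_chunks(stream, needle, chunk_size=10 * 1024 * 1024):
    """Yield the absolute byte offset of every occurrence of needle in stream.

    The stream is read chunk_size bytes at a time. The last len(needle) - 1
    bytes of each buffer are carried over, so a match split across two
    chunks is still found, and no match is reported twice.
    """
    if not needle:
        raise ValueError("needle must not be empty")
    keep = len(needle) - 1
    tail = b""   # bytes carried over from the previous chunk
    base = 0     # absolute offset of the first byte of tail
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        buf = tail + chunk
        pos = buf.find(needle)
        while pos != -1:
            yield base + pos
            pos = buf.find(needle, pos + 1)
        # Keep only the bytes that could begin a match ending in the next chunk.
        tail = buf[-keep:] if keep else b""
        base += len(buf) - len(tail)


if __name__ == "__main__":
    # Hypothetical test data: the marker starts 4 bytes before a 1024-byte boundary.
    marker = b" de 2022 </b>"
    data = b"x" * 1020 + marker + b"y" * 3000
    for off in find_in_chunks(io.BytesIO(data), marker, chunk_size=1024):
        # Byte offset, plus the 1024-byte block it starts in (for dd's skip=).
        print(off, off // 1024)
```

Because the carried-over tail is one byte shorter than the search string, the overlap needed is tiny; a few kB of overlap only matters if you also want surrounding context from the previous block.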