Re: BSD-awk print() Behavior
Date: Tue, 21 Feb 2023 10:14:21 UTC
On Tue, Feb 21, 2023 at 01:24:41AM +0100, Sysadmin Lists wrote:
> Trying to wrap my head around what BSD awk is doing here. Although the behavior
> is unwanted for this exercise, it seems like a possibly useful feature or hack
> for future projects. Either way I'd like to understand what's going on.
>
> I extracted a list of URLs from my browser's history sql file, and when
> iterating over the list with awk got some strange results.
>
> file_1 has the sql-extracted URLs, and file_2 is a copy-paste of that file's
> contents using vim's yank-and-paste.
>
> $ cat file_{1,2}
> https://github.com/
> https://github.com/
> https://github.com/
> https://github.com/
>
> $ diff file_{1,2}
> 1,2c1,2
> < https://github.com/
> < https://github.com/
> ---
> > https://github.com/
> > https://github.com/
>
> $ awk '{ print $0 " abc " }' file_{1,2}
>  abc ://github.com/
>  abc ://github.com/
> https://github.com/ abc
> https://github.com/ abc

file_1 is a DOS text file, while file_2 is a Unix text file.

The DOS text file, when interpreted by tools expecting Unix text, has an
extra carriage-return character at the end of each line. This
carriage-return character becomes part of $0 in the awk code, and when the
line is printed it moves the cursor back to the start of the line, so
whatever is printed after $0 overwrites the beginning of the line. That is
the effect you are seeing.

This has nothing to do with awk's print keyword. You would get a
similarly strange result if you simply pasted the data side by side:

$ paste file_{1,2}
https://https://github.com/
https://https://github.com/

Here, "https://github.com/" is first printed from the DOS text file,
after which the cursor is returned to the start of the line. Then, paste
inserts a tab character, which "steps over" the first eight characters
that had already been output ("https://"), and then outputs
"https://github.com/" from the Unix text file.

>
> The sql-extracted URLs cause awk's print() to replace the front of the string
> with text following $0. file_2 does not. I used vim's `:set list' option to
> view hidden chars, but there's no apparent difference between the two --
> although `diff' clearly thinks so. Both files show this when `list' is set:
>
> https://github.com/$
> https://github.com/$

Yes, because Vim automatically detects DOS text files and displays them
as ordinary text. I'm assuming that while editing file_1 in Vim, you see
"[dos]" at the bottom of the screen?

>
>
> Here's more background if needed:
[cut]

--
Andreas (Kusalananda) Kähäri
SciLifeLab, NBIS, ICM
Uppsala University, Sweden
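
For anyone wanting to reproduce the carriage-return effect described in this message, the following is a minimal sketch rather than anything taken from the thread itself. It assumes a POSIX shell with awk, od and mktemp available. The file contents are recreated with printf as stand-ins for file_1 and file_2, and the helper name strip_cr is hypothetical.

```shell
#!/bin/sh
# Sketch only: recreate stand-ins for file_1 (DOS line endings) and
# file_2 (Unix line endings) in a scratch directory.
tmpdir=$(mktemp -d)
printf 'https://github.com/\r\n' > "$tmpdir/file_1"
printf 'https://github.com/\n'   > "$tmpdir/file_2"

# Make the hidden character visible: od -c shows the carriage return
# as \r just before the \n in the DOS file only.
od -c "$tmpdir/file_1"
od -c "$tmpdir/file_2"

# strip_cr (hypothetical helper): remove one trailing carriage return
# from each input line before appending the text, so DOS and Unix
# input produce the same output.
strip_cr() {
    awk '{ sub(/\r$/, ""); print $0 " abc " }' "$@"
}

strip_cr "$tmpdir/file_1" "$tmpdir/file_2"

rm -rf "$tmpdir"
```

With the carriage return removed inside awk, both input files should print the same way, with the appended text at the end of the line instead of over its beginning. Converting the file once up front, for example with tr -d '\r', is another option when the carriage returns are never wanted.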