BSD-awk print() Behavior
Date: Tue, 21 Feb 2023 00:24:41 UTC
Trying to wrap my head around what BSD awk is doing here. Although the behavior is unwanted for this exercise, it seems like a possibly useful feature or hack for future projects. Either way I'd like to understand what's going on.

I extracted a list of URLs from my browser's history sql file, and when iterating over the list with awk got some strange results. file_1 has the sql-extracted URLs, and file_2 is a copy-paste of that file's contents using vim's yank-and-paste.

    $ cat file_{1,2}
    https://github.com/
    https://github.com/
    https://github.com/
    https://github.com/

    $ diff file_{1,2}
    1,2c1,2
    < https://github.com/
    < https://github.com/
    ---
    > https://github.com/
    > https://github.com/

    $ awk '{ print $0 " abc " }' file_{1,2}
     abc ://github.com/
     abc ://github.com/
    https://github.com/ abc
    https://github.com/ abc

The sql-extracted URLs cause awk's print() to replace the front of the string with text following $0. file_2 does not.

I used vim's `:set list' option to view hidden chars, but there's no apparent difference between the two -- although `diff' clearly thinks so. Both files show this when `list' is set:

    https://github.com/$
    https://github.com/$

Here's more background if needed: I extracted the URLs using sqlite3 like so:

    for f in History-16768665*
    do
        sqlite3 --bail $f <<-HEREDOC
        .mode csv
        .output ${f}.csv
        select * from urls where url like '%github%';
        HEREDOC
    done

Then tried to create a list of unique URLs using `sort -u' but it broke because of special chars in the extracted lines (so it claimed). I used awk to get a unique list instead:

    for f in *.csv; do [[ -s $f ]] && list="${list} $f"; done; echo $list
    awk '{ u[$0] } END { for (e in u) print e > "file_1" }' $list
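
One way to narrow this down is to look at the raw bytes rather than relying on an editor's view. The sketch below is only an illustration under an assumption that nothing in this post confirms: that the lines in file_1 end in a carriage return (CR) before the newline. If so, the terminal would move the cursor back to column 0 after the URL, and " abc " would overwrite the first five characters, which matches the output shown. The temporary file is a hypothetical stand-in for file_1.

```shell
# Hypothetical stand-in for file_1: one URL terminated by CR LF.
tmp=$(mktemp)
printf 'https://github.com/\r\n' > "$tmp"

# od -c prints every byte; a trailing carriage return appears as \r.
od -c "$tmp"

# Remove a trailing CR from each record before appending the suffix,
# so the suffix stays at the end of the line.
fixed=$(awk '{ sub(/\r$/, ""); print $0 " abc " }' "$tmp")
printf '%s\n' "$fixed"

rm -f "$tmp"
```

Running `od -c file_1 | head` on the real file would show whether the assumption holds. If no `\r` appears there, the cause is something else and the `sub()` call above will not change anything.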