Is there a database built into the base system
Ernie Luzar
luzar722 at gmail.com
Sat Apr 8 17:35:12 UTC 2017
Polytropon wrote:
> On Sat, 08 Apr 2017 13:00:15 -0400, Ernie Luzar wrote:
>> Here is my first try at using awk to read every record in the input
>> file and drop duplicate records from the output file.
>>
>>
>> This is what the data looks like.
>> /etc >cat /ip.org.sorted
>> 1.121.136.228;
>> 1.186.172.200;
>> 1.186.172.210;
>> 1.186.172.218;
>> 1.186.172.218;
>> 1.186.172.218;
>> 1.34.169.204;
>> 101.109.155.81;
>> 101.109.155.81;
>> 101.109.155.81;
>> 101.109.155.81;
>> 104.121.89.129;
>
> Why not simply use "sort | uniq" to eliminate duplicates?
>
>
>
>> /etc >cat /root/bin/ipf.table.awk.dup
>> #! /bin/sh
>>
>> file_in="/ip.org.sorted"
>> file_out="/ip.no-dups"
>>
>> awk '{ in_ip = $1 }'
>> END { (if in_ip = prev_ip)
>> next
>> else
>> prev_ip > $file_out
>> prev_ip = in_ip
>> } $file_in
>>
>> When I run this script, it just hangs. I have to Ctrl-C to break
>> out of it. What is wrong with my awk command?
>
> For each line, you store the 1st field (in this case, the entire
> line) in in_ip, and you overwrite (!) that variable with each new
> line. Only at the end of the file (!!!) do you make a comparison,
> and there you even request the next data line. Additionally, keep
> an eye on the quotes you use: inside '...' the shell does not
> expand $file_out, so it would become a variable inside awk, which
> is empty. But your '...' closes before END, so everything from END
> on sits outside of awk and is parsed by the shell. Remember that
> awk reads from standard input, so the input file needs to be given
> as "< $file_in", or with a useless use of cat, "cat $file_in |
> awk ... > $file_out". That is also why your script hangs: awk is
> given no input file, so it waits for data from the terminal.
>
> In your specific case, I wouldn't say that awk is the wrong tool.
> But if you simply want to eliminate duplicates, use the classic
> UNIX approach "sort | uniq". Both tools are part of the OS.
>
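For reference, that pipeline, written with the file names from the
script above, would look something like this:

  sort /ip.org.sorted | uniq > /ip.no-dups

or, since sort(1) can drop duplicates on its own:

  sort -u /ip.org.sorted > /ip.no-dups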
The awk script I posted is a learning tool; I know about "sort | uniq".
I thought "END" meant end of line, not end of file. So how should that
awk command look to drop dups from the output file?
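For completeness, here is one way that command could look. This is a
sketch that reuses the file names from the script above; the duplicate
check has to happen in the main pattern-action block, which awk runs
once per input line, while END runs only once, after the last line:

  #!/bin/sh

  file_in="/ip.org.sorted"
  file_out="/ip.no-dups"

  # The whole awk program sits inside one pair of single quotes.
  # Print a line only if it differs from the previous one, then
  # remember it; this is enough because the input is sorted.
  awk '$0 != prev { print } { prev = $0 }' < "$file_in" > "$file_out"

On unsorted input, the common idiom awk '!seen[$0]++' does the same
job by remembering every line it has already printed. And if a shell
variable really is needed inside the program, it can be passed in with
awk's -v option, e.g. awk -v out="$file_out" '...', instead of relying
on the shell to expand it inside the single quotes.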