awk programming question
RW
rwmaillists at googlemail.com
Thu Jan 23 18:56:08 UTC 2014
On Thu, 23 Jan 2014 09:30:35 -0700 (MST)
Warren Block wrote:
> On Thu, 23 Jan 2014, Paul Schmehl wrote:
>
> > I'm kind of stubborn. There's lots of different ways to skin a
> > cat, but I like to force myself to use the built-in utilities to do
> > things so I can learn more about them and better understand how
> > they work.
> >
> > So, I'm trying to parse a file of snort rules, extract two string
> > values and insert a double pipe between them to create a
> > sig-msg.map file
> >
> > Here's a typical rule:
> >
> > alert udp $HOME_NET any -> $EXTERNAL_NET 69 (msg:"E3[rb] ET POLICY
> > Outbound TFTP Read Request"; content:"|00 01|"; depth:2;
> > classtype:bad-unknown; sid:2008120; rev:1;)
> >
> > Here's a typical sig-msg.map file entry:
> >
> > 9624 || RPC UNIX authentication machinename string overflow attempt
> > UDP
> >
> > So, from the above rule I would want to create a single line like
> > this:
> >
> > 2008120 || E3[rb] ET POLICY Outbound TFTP Read Request
> >
> > There are several ways I can extract one or the other value, and
> > I've figured out how to extract the sid and add the double pipe,
> > but for the life of me I can't figure out how to extract and print
> > out sid || msg.
> >
> > This prints out the sid and the double pipe:
> >
> > echo `awk 'match($0,/sid:[0-9]*;/) {print
> > substr($0,RSTART,RLENGTH)" || "}' /tmp/mtc.rules | tr -d ";sid"`
> >
> > It seems I could put the results into a variable rather than
> > printing them out, and then print var1 || var2, but my Google-fu
> > hasn't found a useful example.
> >
> > Surely there's a way to do this using awk? I can use tr for
> > cleanup. I just need to get close to the right result.
> >
> > How about it awk experts? What's the cleanest way to get this done?
>
> Not an awk expert, but you can do math on the start and length
> variables to get just the number part:
>
> echo "sid:2008120;" \
> | awk '{ match($0, /sid:[0-9]*;/) ; \
> ymd=substr($0, RSTART+4, RLENGTH-5) ; print ymd }'
>
> Closer to what you want:
>
> echo 'msg:"E3[rb] ET POLICY Outbound TFTP Read Request"; sid:2008120;' \
> | awk '{ match($0, /sid:[0-9]*;/) ; \
> ymd=substr($0, RSTART+4, RLENGTH-5) ; \
> match($0, /msg:.*;/) ; \
> msg = substr($0, RSTART+4, RLENGTH-5) ; \
> print ymd, "||", msg }'
>
> Note the error that the too-greedy regex creates, and the inability
> of awk to capture regex sub-expressions. awk does not have a way to
> reduce the greediness, at least not that I'm aware of. You may be able
> to work around that, for example if the message is always the same length.
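The error mentioned above is easy to reproduce in isolation. A minimal sketch, assuming the sample text sits on a single line (the variable names `line` and `greedy` are only for illustration):

```sh
line='msg:"E3[rb] ET POLICY Outbound TFTP Read Request"; sid:2008120;'

# Greedy: .* runs on to the LAST semicolon on the line, so the
# extracted "message" also swallows the sid field that follows it.
greedy=$(printf '%s\n' "$line" |
    awk '{ match($0, /msg:.*;/); print substr($0, RSTART+4, RLENGTH-5) }')

printf '%s\n' "$greedy"
```

The printed text continues past the closing quote and includes `; sid:2008120`, which is the unwanted tail.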
$ echo 'msg:"E3[rb] ET POLICY Outbound TFTP Read Request"; sid:2008120;' |\
awk '{ match($0, /sid:[0-9]+;/) ; ymd=substr($0, RSTART+4, RLENGTH-5) ; \
match($0, /msg:[^;]+;/) ; msg = substr($0, RSTART+4, RLENGTH-5) ; \
print ymd, "||", msg }'
2008120 || "E3[rb] ET POLICY Outbound TFTP Read Request"
Note that awk supports +, but not newfangled things like *? (non-greedy matching).
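If the surrounding quotes should go as well, the same match()/substr() arithmetic can take care of it without tr. This is only a sketch: the function name sid_msg is made up, and it assumes every rule sits on one line, that the msg text contains no embedded or escaped double quotes, and that "sid:" does not appear earlier in the rule as part of another keyword.

```sh
rule='alert udp $HOME_NET any -> $EXTERNAL_NET 69 (msg:"E3[rb] ET POLICY Outbound TFTP Read Request"; content:"|00 01|"; depth:2; classtype:bad-unknown; sid:2008120; rev:1;)'

# sid_msg is a hypothetical helper: reads rules from the named files,
# or from stdin when called without arguments.
sid_msg() {
    awk '
    # Lines without a sid (comments, blank lines) are skipped.
    match($0, /sid:[0-9]+;/) {
        # Drop the 4 characters of the sid: prefix and the trailing semicolon.
        sid = substr($0, RSTART + 4, RLENGTH - 5)
        # Anchor on the quotes so the match stops at the closing quote.
        if (match($0, /msg:"[^"]+";/)) {
            # Drop the 5-character prefix and the 2-character suffix.
            msg = substr($0, RSTART + 5, RLENGTH - 7)
            print sid " || " msg
        }
    }' "$@"
}

printf '%s\n' "$rule" | sid_msg
```

Run against the rules file from the original question, that would be something like: sid_msg /tmp/mtc.rules > sig-msg.map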