script to assist ASCII text
Karl Vogel
vogelke+software at pobox.com
Thu Sep 4 06:27:47 UTC 2008
>> On Mon, 25 Aug 2008 21:00:10 -0700,
>> Gary Kline <kline at thought.org> said:
G> This had eluded me for years and it may not be possible, but here goes.
G> I write using vi or, less frequently vim. Is there any sh script that
G> would make sure that there were exactly one space ('\040') between words,
G> and three spaces between sentences? My definition of "a sentence" is a
G> string of words that ends in a period or question-mark, exclamation-mark,
G> or ellipse ("... . || ... ? || ... !) Also, any dash "--" could not have
G> any whitespace around it.
I like a similar setup -- one space between words, sentences ending
with a period followed by two spaces. The GNU version of "fmt" handles
this pretty well. Here's the first part of your message, formatted to
50-character-wide lines, with the type of spacing that drives me nuts:
me% cat -n msg
1 This had eluded me for years and it may not be
2 possible, but here goes. I write using vi or,
3 less frequently vim. Is there any sh script that
4 would make sure that there were exactly one
5 space ('\040') between words, and three spaces
6 between sentences? My definition of "a sentence"
7 is a string of words that ends in a period or
8 question-mark, exclamation-mark, or ellipse.
Putting one word on each line and then letting GNU fmt decide on
sentence-handling does almost exactly what you want:
me% gfmt -1 msg | gfmt -50 | cat -n
1 This had eluded me for years and it may not be
2 possible, but here goes.  I write using vi or,
3 less frequently vim.  Is there any sh script
4 that would make sure that there were exactly one
5 space ('\040') between words, and three spaces
6 between sentences?  My definition of "a sentence"
7 is a string of words that ends in a period or
8 question-mark, exclamation-mark, or ellipse.
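GNU fmt leaves two parts of the original request alone: it puts two
spaces after a sentence rather than three, and it does nothing about
whitespace around "--". A rough sketch of a post-filter for those two
leftovers, assuming nothing beyond POSIX sed. The name fix_spacing is
made up for illustration, and the filter cannot tell an abbreviation
such as "U.S." from the end of a sentence, so treat it as a starting
point rather than a finished tool:

```shell
#!/bin/sh
# fix_spacing: hypothetical post-filter, not part of fmt.
#   1. squeeze runs of spaces and tabs down to one space
#   2. put exactly three spaces after . ? or ! when more text follows
#      on the same line
#   3. remove any spaces around "--"
fix_spacing() {
    sed -e 's/[[:blank:]][[:blank:]]*/ /g' \
        -e 's/\([.?!]\) /\1   /g' \
        -e 's/ *-- */--/g'
}

printf '%s\n' 'One  two.  Three -- four?  Five!' | fix_spacing
```

That prints "One two.   Three--four?   Five!". Since the filter changes
line lengths, it belongs at the very end of the pipeline, after fmt has
already wrapped the text.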
Here's a script I use as a driver for GNU fmt. It looks for an
optional environment variable FMTWIDTH to decide how long each line
should be. This comes in handy if I call vi/vim from within a script:
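#!/bin/sh
# driver for fmt.

# Use FMTWIDTH as the default line width, if it is set.
case "$FMTWIDTH" in
    "") opt= ;;
    *)  opt="-$FMTWIDTH" ;;
esac

# An explicit option on the command line overrides FMTWIDTH.
case "$1" in
    -*) opt= ;;
    *)  ;;
esac

exec /usr/local/bin/gfmt $opt ${1+"$@"}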
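To see how the driver reacts to FMTWIDTH without installing it, the same
option logic can be wrapped in a function that only prints the command
line it would run. This is a sketch: show_cmd is a hypothetical name,
and echo stands in for the exec.

```shell
#!/bin/sh
# show_cmd: hypothetical dry run of the driver; prints the command
# instead of exec'ing it.
show_cmd() {
    case "$FMTWIDTH" in
        "") opt= ;;
        *)  opt="-$FMTWIDTH" ;;
    esac
    case "$1" in
        -*) opt= ;;
    esac
    echo /usr/local/bin/gfmt $opt ${1+"$@"}
}

FMTWIDTH=50
show_cmd msg          # width taken from FMTWIDTH
show_cmd -72 msg      # explicit option wins
unset FMTWIDTH
show_cmd msg          # no width given; fmt uses its own default
```

The three calls print "/usr/local/bin/gfmt -50 msg",
"/usr/local/bin/gfmt -72 msg" and "/usr/local/bin/gfmt msg".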
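Here's a mapping I use for quickly reformatting a section of text
in vim. I mark where to start using 'a', then move down to the end
of the section and hit 'v':
jmbk:'a,.!fmt -1|fmt<CR>'b
The keystrokes set mark 'b' on the line after the section, filter
everything from mark 'a' to the current line through fmt, and then
jump back to mark 'b'.
A similar mapping will reformat whatever paragraph I'm in, with no need
for marks:
}jmbk{ma}:'a,.!fmt -1|fmt<CR>'b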
The script below helps me clean up a file or message after running fmt.
fmt treats strings like "U.S.A." as the end of a sentence even when
they're not, and pads them with two spaces. This should give you some
ideas.
--
Karl Vogel I don't speak for the USAF or my company
Panda Mating Fails; Veterinarian Takes Over --actual news headline, 1997
---------------------------------------------------------------------------
#!/usr/bin/perl
#
# $Id: cm,v 1.3 2008/08/17 20:25:49 vogelke Exp $
# $Source: /home/vogelke/bin/RCS/cm,v $
#
# cm: clean mail message
while (<>) {
    # Month abbreviations: drop the period, leave one space.
    s/Jan\. +/Jan /g;
    s/Feb\. +/Feb /g;
    s/Aug\. +/Aug /g;
    s/Sept\. +/Sept /g;
    s/Oct\. +/Oct /g;
    s/Nov\. +/Nov /g;
    s/Dec\. +/Dec /g;

    # Titles: keep the period, but only one space after it.
    s/Mr\. +/Mr. /g;
    s/Mrs\. +/Mrs. /g;
    s/Ms\. +/Ms. /g;
    s/Dr\. +/Dr. /g;
    s/Sen\. +/Senator /g;
    s/Rep\. +/Representative /g;

    # Common initialisms.
    s/U\.S\.A\. +/USA /g;
    s/U\.S\. +/US /g;
    s/D\.C\. +/DC /g;
    s/U\.N\. +/UN /g;
    s/B\.S\. +/BS /g;
    s/M\.B\.A\. +/MBA /g;

    # Single initials, such as a middle initial.
    s/ ([A-Z]\.) +/ $1 /g;

    # TeX-style quotes.
    s/''/\"/g;
    s/``/\"/g;

    # UTF-8 curly quotes; these come from saving Firefox pages.
    s/\342\200\231/'/g;
    s/\342\200\234/"/g;
    s/\342\200\235/"/g;
    print;
}
exit(0);
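The three octal sequences near the end of the script are the UTF-8
bytes for the curly apostrophe and the curly double quotes (U+2019,
U+201C and U+201D). A quick way to check that from the shell, assuming
a POSIX printf and od; show_bytes is a made-up helper name:

```shell
#!/bin/sh
# show_bytes: hypothetical helper; prints the bytes produced by a
# printf escape sequence as one hex string.
show_bytes() {
    # $1 is deliberately used as the printf format so that the octal
    # escapes in it are expanded.
    printf "$1" | od -An -tx1 | tr -d ' \n'
    echo
}

show_bytes '\342\200\231'    # right single quote, U+2019
show_bytes '\342\200\234'    # left double quote,  U+201C
show_bytes '\342\200\235'    # right double quote, U+201D
```

The output is e28099, e2809c and e2809d, one per line.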