numeric sort(1) is broken on -STABLE

Ruslan Ermilov ru at freebsd.org
Thu Feb 11 16:09:43 UTC 2010


On Thu, Feb 11, 2010 at 08:40:51AM +0100, Ulrich Spörlein wrote:
> On Wed, 10.02.2010 at 15:00:07 -0600, Dan Nelson wrote:
> > In the last episode (Feb 10), Ulrich Spörlein said:
> > > On Wed, 10.02.2010 at 13:49:05 +0300, Ruslan Ermilov wrote:
> > > > On Wed, Feb 10, 2010 at 09:58:14AM +0100, Ulrich Spörlein wrote:
> > > > > not sure if this is a pilot error, but it seems to me that gnu sort -n
> > > > > is broken on at least -STABLE (couldn't test -CURRENT yet).
> > > > > 
> > > > > It somehow does not manifest when using a simple list and sorting on a
> > > > > specific column, but it always happens to me when using it in
> > > > > combination with find(1).
> > > > > 
> > > > > % truncate -s10m a; truncate -s5m b; truncate -s800k c
> > > > > % find a b c -ls|sort -nk7,7
> > > > >      8       64 -rw-r--r--    1 uqs              wheel            10485760 Feb 10 09:13 a
> > > > >     10       64 -rw-r--r--    1 uqs              wheel             5242880 Feb 10 09:13 b
> > > > >     12       64 -rw-r--r--    1 uqs              wheel              819200 Feb 10 09:13 c
> > > > 
> > > > I bet you're using some non-C locale for LC_NUMERIC.  What does "locale"
> > > > output tell you?
> > > 
> > > Yes and no. LC_NUMERIC is still at C, LC_CTYPE is set to UTF-8, but as
> > > there are no non-ASCII symbols in that output it shouldn't matter, right? 
> > > For me, 819200 is smaller than 10485760 in pretty much all locales.  Why
> > > the hell is a numeric gnusort locale-dependent?  Why is -g working anyway?
> > 
> > Try adding a 'b' to your sort flags.  I bet the leading spaces in front of
> > your numbers are being treated as part of the sort key.  Maybe de_DE.UTF-8
> > and C have different ideas of what is whitespace?
> 
> Indeed, 'b' works too, so I now have a stock of workarounds for this
> problem. What amazes me is that no one seems to be as shocked as I am
> to find that something as basic as sorting on a number does not DTRT.

It is a long-standing issue with Russian locales as well, but there
the problem manifests itself only with LC_NUMERIC, not LC_CTYPE.
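
For reference, two of the workarounds mentioned in this thread can be
sketched roughly as follows. This is only an illustration: the helper
name sample() is made up here, and it just replays the three find(1)
lines from the original report so the snippet does not depend on
truncate(1) or on a particular file system. Overriding the locale for
the sort(1) invocation alone is an assumption about how one would apply
the "C locale" observation, not a fix for the underlying bug.

```shell
# Hypothetical helper: replays the find -ls lines quoted in the report.
sample() {
    printf '%s\n' \
        '     8       64 -rw-r--r--    1 uqs              wheel            10485760 Feb 10 09:13 a' \
        '    10       64 -rw-r--r--    1 uqs              wheel             5242880 Feb 10 09:13 b' \
        '    12       64 -rw-r--r--    1 uqs              wheel              819200 Feb 10 09:13 c'
}

# Workaround 1: run sort(1) in the C locale, leaving the rest of the
# environment (LC_CTYPE and friends) untouched for other commands.
by_c_locale=$(sample | LC_ALL=C sort -n -k7,7)

# Workaround 2: keep the current locale but add 'b', so leading blanks
# are not treated as part of the sort key.
by_b_flag=$(sample | sort -b -n -k7,7)

# Both results should list the files smallest first.
printf '%s\n' "$by_c_locale"
printf '%s\n' "$by_b_flag"
```

Setting LC_ALL on the sort command line only keeps the override local
to that one process, which is probably what you want in scripts that
otherwise rely on a UTF-8 locale.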


Cheers,
-- 
Ruslan Ermilov
ru at FreeBSD.org
FreeBSD committer


More information about the freebsd-stable mailing list