Another slightly OT q...
illoai at gmail.com
Wed May 9 16:30:34 UTC 2007
On 09/05/07, Ted Mittelstaedt <tedm at toybox.placo.com> wrote:
>
>
> > -----Original Message-----
> > From: owner-freebsd-questions at freebsd.org
> > [mailto:owner-freebsd-questions at freebsd.org]On Behalf Of Gary Kline
> > Sent: Tuesday, May 08, 2007 7:19 PM
> > To: usleepless at gmail.com
> > Cc: Gary Kline; FreeBSD Mailing List
> > Subject: Re: Another slightly OT q...
> >
> >
> >
> > So it *was* a hoax? Rats. Some weeks ago on Public
> > Broadcasting, a few sentences were spoken on the potential of
> > fractal geometry to achieve [I'm guessing] data-compression on
> > the order of what Sloot was claiming. So far, no one has figured
> > it out. It may be a dream... .
> >
>
> There's some cool math out there that explains all of this but I never liked
> math, but it isn't necessary to know the math to understand the issue. Just
> consider the problem for a while and you will realize that the compression
> ratio of a specific data stream varies dependent on the amount of repetition
> in the input datastream. A perfectly unrandom datastream, like a constant
> series of logical 1's, carries no information, but has a compression ratio
> that is infinite. A perfectly random datastream, on the other hand,
> also carries no information, but has a compression ratio that is zero.
> I believe that a datastream that is 50% of the way between either extreme
> carries the most information, and I believe your typical datastream is much
> closer to the perfectly unrandom side than the perfectly random side,
> compression is merely the process of pushing the randomness of the stream
> closer to the random side.
Actually, the more information (as such) the closer
the data stream is to perfectly random. The relationship
might be asymptotic, but I am no maths major.
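For anyone who wants to poke at this, here is a rough sketch (mine, not anything from the thread) that measures the empirical Shannon entropy of a byte stream in bits per byte. The helper name `entropy_bits_per_byte` and the three sample streams are made up for illustration. It only counts byte frequencies, so it ignores longer-range repetition, but it does show the two extremes: a constant stream scores zero and OS-supplied random bytes land close to the 8 bits/byte maximum.

```python
import math
import os
from collections import Counter


def entropy_bits_per_byte(data: bytes) -> float:
    """Empirical (zero-order) Shannon entropy of the byte values, in bits per byte."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    # sum over byte values of p * log2(1/p), with p = n / total
    return sum((n / total) * math.log2(total / n) for n in counts.values())


# Hypothetical sample streams, chosen only to show the extremes.
constant = b"\xff" * 65536                      # a constant series of ones
english = b"the can is full of worms " * 2000   # repetitive plain text
random_bytes = os.urandom(65536)                # as random as the OS provides

for name, blob in [("constant", constant), ("text", english), ("random", random_bytes)]:
    print(f"{name:8s} {entropy_bits_per_byte(blob):.3f} bits/byte")
```

The text sample falls somewhere in between, which is the room a lossless compressor has to work with.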
> Thus, if the input datastream is very close to the perfectly unrandom side -
> meaning it has a very high amount of repetition in it, you can get some
> pretty spectacular compression ratios. But as you move closer to unrandom,
> you carry less data. So, the better applications emit datastreams that
> are less unrandom, therefore compression does not work as well on them.
I suppose this leads to the discussion about what
"data" and "information" really are. Imagine a can.
The can is data. Imagine the can is full of worms.
> This of course is completely ignoring the other data issue, is the
> application data efficient to begin with? For example, you can transfer
> about a page of information in ASCII that consumes about 1K of data, that
> same page of information in a MS Word file consumes a hundred times that
> amount of space - Word is therefore extremely inefficient with data.
In this case, since Word "has to" replace typesetting,
layout, and formatting software, in addition to being a
word processor, the header and meta information tend
to bloat the files quite a lot.
Every few years someone comes along who makes
some mad claims about some new buzzword-enhanced
compression technology. Obviously, if there is ever a
radical leap forward in that area the theory will have to
follow, since modern theory cannot accommodate (lossless)
compression past the point of randomness (generally less
than 16:1 even for Danielle Steel). mp3, avi, real media,
mpeg, et al. are a different story entirely, since they are
lossy and optimised for their respective information.
-rw-r--r-- 1 1705 1705 7826420 May 9 10:58
ssion_i_really_fuckin_care_about_you.rm
-rw-r--r-- 1 1705 1705 7791691 May 9 10:58
ssion_i_really_fuckin_care_about_you.rm.bz2
In this case, very slightly compressible (with some data your
resulting file will be slightly larger). Yet the raw datastream
would probably have been many tens, if not several hundreds, of
megabytes. It looks like it was filmed from a cameraphone, though
most likely it was an 8mm digicam; these, I believe, compress on
the fly, so the raw datastream never touches tape.
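The same effect is easy to reproduce with Python's standard `bz2` module. This is only a sketch under my own assumptions: the 1 MB size and the two sample streams are arbitrary, and the last lines just redo the arithmetic on the two file sizes from the listing above.

```python
import bz2
import os

SIZE = 1_000_000  # arbitrary sample size, in bytes

# Hypothetical inputs at the two extremes of compressibility.
samples = {
    "constant": b"\x01" * SIZE,   # maximally repetitive
    "random": os.urandom(SIZE),   # nothing left to squeeze out
}

for name, raw in samples.items():
    packed = bz2.compress(raw)
    print(f"{name:8s} {len(raw)} -> {len(packed)} bytes "
          f"({len(raw) / len(packed):.2f}:1)")

# Sizes copied from the ls listing above (.rm file and its .bz2).
rm_size, bz2_size = 7826420, 7791691
saved = rm_size - bz2_size
print(f"listing: saved {saved} bytes ({100 * saved / rm_size:.2f}%)")
```

The constant stream shrinks to almost nothing, the random one does not shrink at all, and the already-lossy .rm file sits right next to the random case at well under one percent saved.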
Remember life before the tweel?
--