New alpha 5.x bug
Kris Kennaway
kris at obsecurity.org
Tue Nov 4 14:25:23 PST 2003
On Tue, Nov 04, 2003 at 11:12:51PM +0100, Bernd Walter wrote:
> On Tue, Nov 04, 2003 at 09:55:52AM -0800, Kris Kennaway wrote:
> > On Tue, Nov 04, 2003 at 01:48:27PM +0100, Bernd Walter wrote:
> > > I can't speak for this problem yet, because my test systems are a bit
> > > older, but speaking for the pipe corruption:
> > > I did lots of bzip2, tar, scp, nfs(client) without noticing any
> > > sign of a problem.
> > > What is so special with the port cluster?
> > > I have no clue about its design.
> >
> > It does lots of parallel package builds (untar, pkg_add, compile, tar) and NFS copying.
>
> Any special NFS options?
> tcp, udp, v2, v3, IPv4, IPv6?
Here is a typical mount -v:
axp7# mount -v
216.136.204.23:/a/nfs/alpha/5.dir1 on / (nfs, read-only, fsid 00ff000404000000)
devfs on /dev (devfs, local, fsid 01ff000303000000)
/dev/md0c on /etc (ufs, local, writes: sync 732 async 64400, reads: sync 55109 async 8784, fsid 13d19a3f414b62cc)
/dev/md1c on /var (ufs, local, writes: sync 338 async 35923, reads: sync 22620 async 0, fsid 19d19a3f8ce9f280)
/dev/md2c on /tmp (ufs, local, writes: sync 12 async 20, reads: sync 13 async 0, fsid 1bd19a3fa96729ea)
/dev/da0e on /a (ufs, local, soft-updates, writes: sync 137415 async 13950800, reads: sync 12817244 async 903873, fsid f1d29a3f5b723dc6)
bento:/var/portbuild on /var/portbuild (nfs, fsid 02ff000404000000)
bento:/var/portbuild/alpha/5/ports on /a/tmp/5/chroot/24703/a/ports (nfs, read-only, fsid ebff120404000000)
bento:/var/portbuild/alpha/5/src on /a/tmp/5/chroot/24703/usr/src (nfs, read-only, fsid ecff120404000000)
bento:/var/portbuild/alpha/5/doc on /a/tmp/5/chroot/24703/usr/opt/doc (nfs, read-only, fsid edff120404000000)
devfs on /a/tmp/5/chroot/24703/dev (devfs, local, fsid eeff120303000000)
bento:/var/portbuild/alpha/5/ports on /a/tmp/5/chroot/25765/a/ports (nfs, read-only, fsid efff120404000000)
bento:/var/portbuild/alpha/5/src on /a/tmp/5/chroot/25765/usr/src (nfs, read-only, fsid f0ff120404000000)
bento:/var/portbuild/alpha/5/doc on /a/tmp/5/chroot/25765/usr/opt/doc (nfs, read-only, fsid f1ff120404000000)
devfs on /a/tmp/5/chroot/25765/dev (devfs, local, fsid f2ff120303000000)
The NFS mounts are nfsv3,intr,ro.
> Just to get the picture complete.
> The build is local and the package is then copied to an NFS server on
> which it has a corrupted CRC?
From my memory of tests I ran a few months ago, the bzip2 CRC is
corrupted when the package is created locally. The package is copied
to the server via scp.
> Is the bzip2 CRC wrong, or the tar CRC (does tar have a CRC?), or both?
Again from memory, the file is truncated, and there might be some
garbage (e.g. zeros) at the end.
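
A check along these lines could tell the truncated case apart from the
trailing-garbage case. This is only a sketch: the function name
check_bz2 and the status strings are made up for illustration, and it
assumes each package is a single bzip2 stream.

```python
import bz2


def check_bz2(path: str, chunk: int = 1 << 16) -> str:
    """Classify a bzip2 file as 'ok', 'truncated', 'trailing-garbage' or 'corrupt'.

    Assumption: the file holds exactly one bzip2 stream, so any bytes
    after the end-of-stream marker are treated as garbage.
    """
    decomp = bz2.BZ2Decompressor()
    with open(path, "rb") as f:
        while not decomp.eof:
            block = f.read(chunk)
            if not block:
                # Input ran out before the end-of-stream marker was seen.
                return "truncated"
            try:
                decomp.decompress(block)
            except OSError:
                # Raised for invalid data, including a block CRC mismatch.
                return "corrupt"
        # Bytes the decompressor did not consume, plus the rest of the file.
        trailing = decomp.unused_data + f.read()
    return "trailing-garbage" if trailing else "ok"
```

Running it over every package on the build client before the scp, and
again on the server afterwards, would show on which side the damage
appears.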
> Can you say how likely such a corruption is?
On the last build 42 packages were corrupted out of about 7500.
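
For scale, the arithmetic from those two figures (7500 is only
approximate, so the result is rough as well):

```python
corrupted = 42
total = 7500  # "about 7500", so treat the result as an estimate

rate = corrupted / total
print(f"failure rate: {rate:.2%}")                      # failure rate: 0.56%
print(f"about 1 in {round(total / corrupted)} packages")  # about 1 in 179 packages
```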
> Are other packages compiled during copying a package file to the server?
Yes. Typically there are 5 builds running at a time on the client
machines.
> Are the building machines memory stressed while creating the bz file or
> while copying it?
The machines are definitely busy (building other packages) while the
package is created and copied, although the machines should not be
paging.
> Really - it's hard to believe that pipe itself is the problem.
> I do lots of buildworlds with CFLAGS=-pipe and a corruption would
> very likely stop building.
I know that the problem began between 5.1-R and August 6, but I have
not been able to track it down beyond this. There was work on both
pipes and VM in that time period, which is why I am suspicious of
both.
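
One way to exercise pipes on their own under parallel load, away from
the port cluster, might look like the sketch below. It is not what the
cluster runs: the names pipe_roundtrip and stress, the transfer sizes
and the default of 5 parallel jobs are all assumptions chosen for
illustration, and it uses threads within one process rather than
separate build processes.

```python
import hashlib
import os
import threading
from concurrent.futures import ThreadPoolExecutor


def pipe_roundtrip(size: int, chunk: int = 65536) -> bool:
    """Push `size` random bytes through an OS pipe; True if nothing was lost or altered."""
    read_fd, write_fd = os.pipe()
    sent = hashlib.sha256()

    def writer() -> None:
        remaining = size
        # Closing the write end on exit lets the reader see end-of-file.
        with os.fdopen(write_fd, "wb") as w:
            while remaining > 0:
                block = os.urandom(min(chunk, remaining))
                sent.update(block)
                w.write(block)
                remaining -= len(block)

    t = threading.Thread(target=writer)
    t.start()

    received = hashlib.sha256()
    total = 0
    with os.fdopen(read_fd, "rb") as r:
        while True:
            block = r.read(chunk)
            if not block:
                break
            received.update(block)
            total += len(block)
    t.join()
    # Compare both the byte count (truncation) and the digest (corruption).
    return total == size and sent.digest() == received.digest()


def stress(jobs: int = 5, rounds: int = 20, size: int = 1 << 20) -> int:
    """Run `rounds` transfers on `jobs` parallel workers; return the number of failures."""
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        results = list(pool.map(lambda _: pipe_roundtrip(size), range(rounds)))
    return results.count(False)
```

A nonzero return from stress() on an otherwise idle machine would point
at pipes; a clean run would not rule them out, since it does not
reproduce the memory and NFS load of a real build.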
Kris