E4500 with 24GB RAM
Pyun YongHyeon
yongari at rndsoft.co.kr
Sat Jun 11 07:54:22 GMT 2005
On Sat, Jun 11, 2005 at 03:36:41AM -0400, Kris Kennaway wrote:
> On Sat, Jun 11, 2005 at 04:26:32PM +0900, Pyun YongHyeon wrote:
> > On Sat, Jun 11, 2005 at 03:40:28PM +0900, Hiroki Sato wrote:
> > > Kris Kennaway <kris at obsecurity.org> wrote
> > > in <20050610211239.GA59402 at xor.obsecurity.org>:
> > >
> > > kr> I wonder if it's disk related. I tried to check out a ports tree on
> > > kr> this machine and it hung in a few seconds (although this was also
> > > kr> checking out using the network via nfs).
> > >
> > > I do not know why but the freeze occurs only when displaying "invalid
> > > packet size xxx; dropping". When I tried "vmstat 1" on the serial
> > > console and fetching a large file via ftp at the same time
> > > with 12GB RAM configuration, the freeze did not occur.
> > > Once "hme0: too may errors; not reporting any more" is displayed,
> > > the box seems to work fine and I can check out the ports tree via NFS
> > > without problems.
> > >
> >
> > Normally the "invalid packet size" message comes from a link mismatch.
> > If your HME's PHY is a DP83840 there are known issues with link negotiation.
> > AFAIK the issue has nothing to do with the panic, as I always see that
> > message on my Ultra2, which has a DP83840 PHY too.
>
> This is on e450 and e4500 machines. I don't think there's a link
> mismatch.
>
> > I wonder how you can use NFS reliably on sparc64. Due to alignment
> > failures (on both server and client) it's really easy to get a panic
> > on sparc64.
>
> AFAICR I've never seen a problem with this (except with an i386 4.x
> server, which can be panicked by a sparc64 client)..I don't rely on
> NFS heavily in most cases, but I do use it on a number of machines
> (including two package build machines that netboot and access their
> ports trees over NFS, and have been in continuous operation with an
> uptime of 110 days).
>
Do you use NFS over UDP? NFS over TCP has a much better chance of
hitting a panic.
If you copy a large file (> 100MB) from an NFS-exported directory to
one of its sub-directories you will probably hit a panic. If my memory
serves me right, there have been several NFS panic reports on the
current/sparc64 MLs.
And I don't think it has been fixed, since the root cause of the panic
is in nfsm_disct() and nfs_realign() (which have not been touched for a
long time).

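
To illustrate the class of problem (this is not the NFS code, just a
minimal userland sketch): XDR data is made of big-endian 32-bit words,
and on sparc64 dereferencing a 32-bit pointer that is not 4-byte aligned
traps. Reading the word byte by byte is safe at any offset. The name
sketch_get_xdr_word() is made up for this example.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Read a big-endian 32-bit XDR word starting at an arbitrary byte offset.
 * Casting (buf + off) to uint32_t * and dereferencing it would trap on
 * sparc64 whenever the address is not 4-byte aligned; assembling the
 * value from single bytes has no alignment requirement.
 * sketch_get_xdr_word() is a hypothetical helper, not a kernel function.
 */
static uint32_t
sketch_get_xdr_word(const unsigned char *buf, size_t off)
{
	const unsigned char *p = buf + off;

	return (((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	    ((uint32_t)p[2] << 8) | (uint32_t)p[3]);
}
```
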
> > > I tried to comment out the HME_WHINE line of hme_read() in if_hme.c,
> > > and it seems to make the box work fine so far.
> >
> > HME_WHINE() just prints a message. I can't see how removing the call
> > would cure your problem.
>
> It does seem to have helped though..before it would reliably lock up
> in seconds, and DDB break was non-responsive.
>
Hmm... Then it would be great to get a voluntary core dump when hme(4)
hits the condition (e.g. invoke panic(9) just before processing HME_WHINE).

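
A rough userland sketch of the idea, assuming the check in hme_read() is
a simple length test that whines and drops the frame. The sketch_* names
and the two size constants are placeholders made up for this example,
not the driver's real definitions, and abort() stands in for panic(9) so
that a core is left behind.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical limits; the real values live in the hme(4) driver. */
#define SKETCH_ETHER_HDR_LEN	14
#define SKETCH_MAX_FRAMESIZE	1518

/* Userland stand-in for panic(9): report the length, then dump core. */
static void
sketch_panic(const char *msg, int len)
{
	fprintf(stderr, "panic: %s (len=%d)\n", msg, len);
	abort();
}

/* Returns 1 if the frame length is one the sketch would accept. */
static int
sketch_len_ok(int len)
{
	return (len > SKETCH_ETHER_HDR_LEN && len <= SKETCH_MAX_FRAMESIZE);
}

/*
 * Mirrors the assumed shape of the length check: where the driver would
 * only whine and drop the frame, stop instead when panic_on_bad is set.
 * Returns 0 for an accepted frame and -1 for a dropped one.
 */
static int
sketch_read(int len, int panic_on_bad)
{
	if (!sketch_len_ok(len)) {
		if (panic_on_bad)
			sketch_panic("hme: invalid packet size", len);
		return (-1);	/* original behaviour: drop the frame */
	}
	return (0);
}
```

In the driver itself this would just be a panic(9) call placed in front
of the HME_WHINE line, so the dump captures the state at the moment the
bad length is seen.
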
> Kris
--
Regards,
Pyun YongHyeon
http://www.kr.freebsd.org/~yongari | yongari at freebsd.org