Help fixing a bug; HP MicroServer N40L; CAM status: Command timeout
Polytropon
freebsd at edvax.de
Sun Jun 28 12:40:32 UTC 2015
On Sun, 28 Jun 2015 22:25:57 +1000, Yudi V wrote:
> Hi all,
>
> My system is an HP MicroServer N40L - more info at
> n40l.wikia.com/wiki/HP_MicroServer_N40L_Wiki
>
> It has 4 internal HDD bays, plus an extra internal SATA port and an external
> eSATA port, both of which run at 1.5Gbps SATA speed, whereas the 4 internal
> bays run at 3Gbps.
>
> There are 4 HDDs in this server and only the two that are connected to the
> 1.5Gbps SATA ports throw the errors below. This is present in v9.3 and
> v10.1 but not in v11 (scrubbing generally throws up a lot of these errors,
> but not in v11). I want to use this as a file server, so I don't want to use
> v11 until it's production ready.
>
>
> The system hangs every few mins and then the following errors get logged.
>
> ====================================
> ERROR from /var/log/messages
> =================================
>
> Jun 28 21:22:47 10p1test kernel: (ada3:ata0:0:1:0): READ_DMA. ACB: c8 00 88 00 41 44 00 00 00 00 01 00
> Jun 28 21:22:47 10p1test kernel: (ada3:ata0:0:1:0): CAM status: Command timeout
> Jun 28 21:22:47 10p1test kernel: (ada3:ata0:0:1:0): Retrying command
> Jun 28 21:23:21 10p1test kernel: (ada2:ata0:0:0:0): READ_DMA. ACB: c8 00 0d 30 c0 45 00 00 00 00 01 00
> Jun 28 21:23:21 10p1test kernel: (ada2:ata0:0:0:0): CAM status: Command timeout
> Jun 28 21:23:21 10p1test kernel: (ada2:ata0:0:0:0): Retrying command
> Jun 28 21:40:33 10p1test kernel: (ada2:ata0:0:0:0): READ_DMA. ACB: c8 00 51 30 70 45 00 00 00 00 01 00
> Jun 28 21:40:33 10p1test kernel: (ada2:ata0:0:0:0): CAM status: Command timeout
> Jun 28 21:40:33 10p1test kernel: (ada2:ata0:0:0:0): Retrying command
Have you already verified that there are no hardware errors
(bad cabling of the cages)?
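One way to look for cabling trouble, assuming sysutils/smartmontools is
installed (it is not part of the base system), is to read the interface
CRC error counter that many drives report as SMART attribute 199. The
helper name below is hypothetical, and the attribute number and its
column position depend on the drive vendor, so treat this as a sketch:

```shell
#!/bin/sh
# Print the raw value of SMART attribute 199 (commonly UDMA_CRC_Error_Count)
# from "smartctl -A" output read on stdin. crc_error_count is a hypothetical
# helper name; the raw value is assumed to be in the tenth column.
crc_error_count() {
    awk '$1 == 199 { print $10 }'
}

# Usage on the affected disks (run as root):
#   smartctl -A /dev/ada2 | crc_error_count
#   smartctl -A /dev/ada3 | crc_error_count
```

A count that keeps growing between runs usually points at the cable or
backplane rather than at the driver.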
> ==============================================
> I tried the suggestion from
> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=195349#c30
>
> I added hint.ahci.0.msi="1" to /boot/loader.conf but it did not fix the issue
This is only effective after a reboot. Have you done that?
Have you verified that you applied the setting to the correct
ahci device? Check "dmesg | grep ahci" to see if there is
more than one controller in the system.
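A sketch of both checks, using kenv(1), dmesg(8), grep and camcontrol(8);
the helper name is hypothetical. Note that the log lines above name ata0
rather than an ahcich channel, so it is worth confirming which driver the
two disks are actually attached to:

```shell
#!/bin/sh
# List the distinct ahciN controller instances mentioned in kernel messages
# read on stdin. list_ahci_units is a hypothetical helper name.
list_ahci_units() {
    grep -o 'ahci[0-9][0-9]*' | sort -u
}

# Usage after the reboot:
#   kenv hint.ahci.0.msi        # prints the value if the loader picked it up
#   dmesg | list_ahci_units     # one line per controller, e.g. ahci0
#   camcontrol devlist -v       # shows which bus ada2 and ada3 hang off
```

If the disks turn out to sit on an ata channel, a hint for ahci.0 may not
apply to them at all.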
> I even tried rebuilding the kernel but it looks like it did not fix the issue.
> Well, this was my first time building a custom kernel, so I am not sure I
> got it right.
>
> the steps I followed were:
>
> I used the LINT config instead of creating my own,
>
> # svn checkout svn-mirror/base/head /usr/src
> # cd /usr/src/sys/amd64/conf && make LINT
> # cd /usr/src
> # make buildkernel KERNCONF=LINT
> # make installkernel KERNCONF=LINT
>
> did I get the process right?
Check the documentation here:
https://www.freebsd.org/doc/handbook/kernelconfig.html
A short overview is also present in the comment header of the
top level Makefile (/usr/src/Makefile); concentrate on the steps
that involve the kernel only.
> As this issue disappeared in v11, I am guessing it should be possible to
> fix it in v10 as well.
> Any suggestions/pointers on how to fix this bug would be greatly
> appreciated.
Building a custom kernel usually involves starting from a
copy of GENERIC (the default kernel), or including it and
making changes. However, just building and installing a
GENERIC kernel of the non-v11 version probably won't help.
The LINT kernel, on the other hand, contains all available
options and is used mostly as a reference for how to
include things in a custom kernel.
In order to build a custom kernel, you need to know _which
difference_ it should implement compared to the GENERIC
kernel. "Just building one", as I mentioned, probably will
not be of great help.
--
Polytropon
Magdeburg, Germany
Happy FreeBSD user since 4.0
Andra moi ennepe, Mousa, ...