Probably Hardware Trouble But What Is It?
Drew Tomlinson
drew at mykitchentable.net
Sun Dec 7 23:08:06 UTC 2014
On 12/7/2014 9:05 AM, Paul Pathiakis via freebsd-questions wrote:
> Drew,
>
> Just trying to assist....
>
> From the look of it, something is definitely failing and it is either
> the controller or the disk. FreeBSD is trying to stay alive. (I've
> had something similar happen in the past. When I rebooted, a disk
> showed to be faulted and inaccessible.)
>
> I'd theorize that the first line about the kernel maxfiles being
> exceeded by root (assuming you haven't changed the setting) is due to
> the failure trying to allocate file handles to handle the requests
> that can't be completed due to the failure.
>
> If you have access to the console and another drive, you may want to
> connect a second drive, configure it to mirror the first and hope that
> it can mirror the first. If it works, great. BTW, don't forget to
> install bootblocks if this is your boot drive.
>
> Now, if it doesn't start to mirror the drive after being attached,
> you're going to have to reboot. That's probably going to show you the
> real failure. :-(
>
> If the controller card is onboard, not much you can do. If it's a
> PCIe bus card, try to re-seat it. Sometimes things get pulled on, or
> hit inadvertently and aren't sitting in the slot correctly any more.
>
> I agree with the other post about replacing the connecting cables
> and/or re-seating them.
>
> If, after all this, it doesn't work, it's probably the disk itself.
>
> Now comes the patient part. If it's the drive, it's probably pretty
> hot from failing and trying to do its job. Don't laugh at this; it's
> worked for me 5 out of 7 times. Remove it from the machine and let it
> cool to room temperature on an anti-static bag. Once cool, put it in the
> bag and put it in your freezer for at least three hours. Re-insert it into
> the machine. (At this point, you should have that other drive for the
> mirror connected.) If the drive isn't a catastrophic loss, it will
> work for a short time. I recommend you allow it to mirror. Ask the
> drive to do NOTHING but let it sit and mirror while in single-user mode.
>
> However, before going to that last 'iffy' part, check everything
> before that.
Thank you for your suggestions. Funny you mention the freezer trick. I
was just telling a co-worker about that as he's having trouble with a drive.
My problem was that because of the failing drive, I couldn't verify
which drive was causing the problem. Every time I'd try to issue a
zpool or zfs command, it would just hang. I actually have 4 drives
internally in the box and they are all together in a raidz1 pool and
this pool contains my full FBSD system. Then I have another drive in an
external SATA dock which I've put in its own pool and mounted just to
use for backups. I disconnected this drive and rebooted. Now I can
access my system and have been able to verify that this is the failing
drive.
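When the zpool commands hang, one rough way to narrow down the suspect from the logs is to count the CAM "Command timeout" lines per device. This is only a sketch: the function name count_cam_timeouts is made up for illustration, and the sample lines are taken from the /var/log/messages excerpt quoted below, each on a single line.

```shell
#!/bin/sh
# Hypothetical helper: read kernel log text on stdin and print
# "<device> <count>" for every device that logged a CAM command timeout.
count_cam_timeouts() {
    grep 'CAM status: Command timeout' |
        # Pull the device name (e.g. ada0) out of "(ada0:siisch0:0:0:0)".
        sed -n 's/.*(\([a-z]*[0-9]*\):[^)]*).*/\1/p' |
        sort | uniq -c | awk '{ print $2, $1 }'
}

# Sample input copied from the log excerpt in this thread.
count_cam_timeouts <<'EOF'
Dec 6 13:55:55 vm kernel: (ada0:siisch0:0:0:0): FLUSHCACHE48. ACB: ea 00 00 00 00 40 00 00 00 00 00 00
Dec 6 13:55:55 vm kernel: (ada0:siisch0:0:0:0): CAM status: Command timeout
Dec 6 13:55:55 vm kernel: (ada0:siisch0:0:0:0): Retrying command
Dec 6 13:55:55 vm kernel: (ada0:siisch0:0:0:0): CAM status: Command timeout
EOF
# prints: ada0 2
```

On the live system the same function could read the log directly, for example `count_cam_timeouts < /var/log/messages`. A device that dominates the count is only a suspect; the cable, dock, or controller channel it sits on could equally be at fault.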
So I am lucky. All I have lost are backups. And thus all I need to do
is replace this drive and then resume my backups.
Thanks for your suggestions!
Cheers,
Drew
>
>
> On 12/06/2014 19:58, Drew Tomlinson wrote:
>> I'm running FreeBSD 9.1-RELEASE that I built several years ago. It's
>> mostly a Samba server and has "just worked" so I've never done much
>> more with it. However, recently I found it "locked up" with thousands
>> of these messages on the console:
>>
>> kernel: kern.maxfiles limit exceeded by uid 0, please see tuning(7)
>>
>> I've looked in /var/log/messages and also see lots of messages like
>> these:
>>
>> Dec 6 13:55:53 vm kernel: siisch0: ... waiting for slots 18000000
>> Dec 6 13:55:53 vm kernel: siisch0: Timeout on slot 28
>> Dec 6 13:55:53 vm kernel: siisch0: siis_timeout is 00040000 ss 78000000 rs 78000000 es 00000000 sts 801b0000 serr 00000000
>> Dec 6 13:55:53 vm kernel: siisch0: ... waiting for slots 08000000
>> Dec 6 13:55:55 vm kernel: siisch0: Timeout on slot 27
>> Dec 6 13:55:55 vm kernel: siisch0: siis_timeout is 00040000 ss 78000000 rs 78000000 es 00000000 sts 801b0000 serr 00000000
>> Dec 6 13:55:55 vm kernel: (ada0:siisch0:0:0:0): FLUSHCACHE48. ACB: ea 00 00 00 00 40 00 00 00 00 00 00
>> Dec 6 13:55:55 vm kernel: (ada0:siisch0:0:0:0): CAM status: Command timeout
>> Dec 6 13:55:55 vm kernel: (ada0:siisch0:0:0:0): Retrying command
>> Dec 6 13:55:55 vm kernel: (ada0:siisch0:0:0:0): READ_FPDMA_QUEUED. ACB: 60 01 fe d8 74 40 39 00 00 00 00 00
>> Dec 6 13:55:55 vm kernel: (ada0:siisch0:0:0:0): CAM status: Command timeout
>> Dec 6 13:55:55 vm kernel: (ada0:siisch0:0:0:0): Retrying command
>> Dec 6 13:55:55 vm kernel: (ada0:siisch0:0:0:0): READ_FPDMA_QUEUED. ACB: 60 0a a5 7f 00 40 4c 00 00 00 00 00
>>
>> This machine uses zfs. I have two pools:
>>
>> # zpool list
>> NAME    SIZE  ALLOC   FREE  CAP  DEDUP  HEALTH  ALTROOT
>> zback  1.81T   848G  1008G  45%  1.00x  ONLINE  -
>> zroot  1.81T  1.16T   666G  64%  1.00x  ONLINE  -
>>
>> Then I tried this and my ssh window is now stuck:
>>
>> # zpool status
>>   pool: zback
>>  state: ONLINE
>> status: One or more devices are faulted in response to IO failures.
>> action: Make sure the affected devices are connected, then run 'zpool clear'.
>>    see: http://illumos.org/msg/ZFS-8000-HC
>>   scan: none requested
>> config:
>>
>>         NAME        STATE     READ WRITE CKSUM
>>         zback       ONLINE       3     0     0
>>           ada0      ONLINE       4     0     0
>>
>> I opened another ssh window and tried 'zpool clear zback' as
>> suggested but it appears stuck too.
>>
>> I'm sure I haven't provided all the relevant information so please
>> ask and I will do so. I'd appreciate any guidance on how to take a
>> proper backup of ada0 and what I should do next. I think this zback
>> pool is just the one disk which is a 2TB drive. I'd like to know how
>> to confirm that if possible since it seems the zpool commands aren't
>> able to complete.
>>
>> I appreciate any suggestions or guidance.
>>
>> Thanks,
>>
>> Drew
>>