AFACLI failover success (was: non-responding PERC with aaccli ...)
Dennis G Allard
allard at oceanpark.com
Mon Sep 27 09:20:58 PDT 2004
Thanks to everyone who replied.
The original procedure I outlined for using AFACLI to activate a HOTSPARE
worked, except that after performing `container set global_failover
(0,3,0)` I had to run `controller rescan` before the failover kicked in.
This was in spite of the fact that `controller show automatic_failover`
reported ENABLED.
Since I do not subscribe to this mailing list (or any mailing list,
preferring IETF standard newsgroup culture), I am sending this summary
as a stand-alone post. (It would be good if someone were to post a
follow-up to the original thread and include the following text to
complete that thread, thanks.)
DETAILS:
(1 of 4) ORIGINAL STATE (pre-failover):
AFA0> enclosure show slot
> Executing: enclosure show slot
>
> Enclosure
> ID (B:ID:L) Slot scsiId Insert Status
> ----------- ---- ------ ------- ------------------------------------------
> 0 0:06:0 0 0:00:0 1 OK FAILED CRITICAL ACTIVATE
> 0 0:06:0 1 0:01:0 1 OK FAILED CRITICAL ACTIVATE
> 0 0:06:0 2 0:02:0 1 ERROR FAULTY FAILED CRITICAL ACTIVATE
> 0 0:06:0 3 0:03:0 1 OK UNCONFIG HOTSPARE ACTIVATE
>
> AFA0> disk list
> Executing: disk list
>
> B:ID:L Device Type Blocks Bytes/Block Usage Shared
> ------ -------------- --------- ----------- ---------------- ------
> 0:00:0 Disk 71132959 512 Initialized NO
> 0:01:0 Disk 71132960 512 Initialized NO
> 0:02:0 Disk 0 0 Offline NO
> 0:03:0 Disk 71132960 512 Initialized NO
>
> AFA0> container list
> Executing: container list
> Num Total Oth Chunk Scsi Partition
> Label Type Size Ctr Size Usage B:ID:L Offset:Size
> ----- ------ ------ --- ------ ------- ------ -------------
> 0 RAID-5 67.7GB 32KB Open 0:00:0 64.0KB:33.8GB
> /dev/sda SEPT 0:01:0 64.0KB:33.8GB
> 0:02:0 64.0KB!33.8GB
>
>
> AFA0>
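The `enclosure show slot` listing above can be scanned mechanically for the failed drive. The following is a minimal Python sketch, assuming the whitespace-separated row layout shown in this post (enclosure ID, enclosure B:ID:L, slot, scsiId, insert count, then status words). The function name `parse_slots` is hypothetical, and the meaning of the individual status words is not documented here; the code only looks for the literal word FAULTY.

```python
def parse_slots(text):
    """Map each slot's scsiId to the list of status words on its row."""
    slots = {}
    for line in text.splitlines():
        # Drop the leading "> " quoting used in this transcript.
        fields = line.lstrip("> ").split()
        # Data rows: enclosure id, enclosure B:ID:L, slot, scsiId,
        # insert count, then one or more status words.
        if (len(fields) >= 6 and fields[0].isdigit()
                and fields[1].count(":") == 2
                and fields[3].count(":") == 2):
            slots[fields[3]] = fields[5:]
    return slots


sample = """\
> ID (B:ID:L) Slot scsiId Insert Status
> 0 0:06:0 0 0:00:0 1 OK FAILED CRITICAL ACTIVATE
> 0 0:06:0 1 0:01:0 1 OK FAILED CRITICAL ACTIVATE
> 0 0:06:0 2 0:02:0 1 ERROR FAULTY FAILED CRITICAL ACTIVATE
> 0 0:06:0 3 0:03:0 1 OK UNCONFIG HOTSPARE ACTIVATE
"""

slots = parse_slots(sample)
faulty = [scsi_id for scsi_id, flags in slots.items() if "FAULTY" in flags]
print(faulty)  # ['0:02:0']
```

On the pre-failover state this singles out 0:02:0, which matches the Offline entry in `disk list`.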
(2 of 4) ACTIONS TAKEN:
container set global_failover (0,3,0)
controller rescan
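Whether the rebuild has actually started can be read off the `task list` output. Below is a minimal Python sketch, assuming the row format seen in this session (`TaskId Function Done% ...`); the helper name `rebuild_progress` is hypothetical, and how the AFACLI output is captured into a string is left out because it depends on the setup.

```python
import re

# Matches a task row such as "> 101 Rebuild 0.1% 00 RUN 00000000 00000000".
# The optional "> " prefix is the quoting used in this transcript.
TASK_ROW = re.compile(r"^\s*(?:>\s*)?(\d+)\s+Rebuild\s+([\d.]+)%", re.MULTILINE)


def rebuild_progress(task_list_output):
    """Return the Done% of a running Rebuild task, or None if there is none."""
    match = TASK_ROW.search(task_list_output)
    return float(match.group(2)) if match else None


idle = "> Controller Tasks\n>\n> No tasks currently running on controller\n"
busy = "> Controller Tasks\n>\n> 101 Rebuild 0.1% 00 RUN 00000000 00000000\n"

print(rebuild_progress(idle))  # None
print(rebuild_progress(busy))  # 0.1
```

In this session the result would have been None after `container set global_failover` and a number only after `controller rescan`.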
Excerpts from actual session:
Note: [[my comments are in double square brackets like these]]
> AFA0> task list
> Executing: task list
>
> Controller Tasks
>
> TaskId Function Done% Container State Specific1 Specific2
> ------ -------- ------- --------- ----- --------- ---------
>
> No tasks currently running on controller
>
> AFA0> container set global_failover (0,3,0)
> Executing: container set global_failover (BUS=0,ID=3,LUN=0)
>
> AFA0> task list
> Executing: task list
>
> Controller Tasks
>
> TaskId Function Done% Container State Specific1 Specific2
> ------ -------- ------- --------- ----- --------- ---------
>
> No tasks currently running on controller [[Hmmm - why not?]]
>
> AFA0>
> AFA0>
> AFA0> disk show space
> Executing: disk show space
>
> Scsi B:ID:L Usage Size
> ----------- ---------- -------------
> 0:00:0 Container 64.0KB:33.8GB
> 0:00:0 Free 33.8GB:59.0KB
> 0:01:0 Container 64.0KB:33.8GB
> 0:01:0 Free 33.8GB:59.0KB
> 0:02:0 Dead 64.0KB:33.8GB
> 0:02:0 Free 33.8GB:59.0KB
> 0:03:0 Free 64.0KB:33.8GB
>
> AFA0>
> AFA0>
> AFA0> container list
> Executing: container list
> Num Total Oth Chunk Scsi Partition
> Label Type Size Ctr Size Usage B:ID:L Offset:Size
> ----- ------ ------ --- ------ ------- ------ -------------
> 0 RAID-5 67.7GB 32KB Open 0:00:0 64.0KB:33.8GB
> /dev/sda SEPT 0:01:0 64.0KB:33.8GB
> 0:02:0 64.0KB!33.8GB
>
>
> AFA0> enclosure show slot
> Executing: enclosure show slot
>
> Enclosure
> ID (B:ID:L) Slot scsiId Insert Status
> ----------- ---- ------ ------- ------------------------------------------
> 0 0:06:0 0 0:00:0 1 OK FAILED CRITICAL ACTIVATE
> 0 0:06:0 1 0:01:0 1 OK FAILED CRITICAL ACTIVATE
> 0 0:06:0 2 0:02:0 1 ERROR FAULTY FAILED CRITICAL ACTIVATE
> 0 0:06:0 3 0:03:0 1 OK UNCONFIG HOTSPARE ACTIVATE
>
> AFA0>
> AFA0>
> AFA0>
> AFA0> container show failover
> Executing: container show failover
>
> Container Scsi B:ID:L
> --------- ----------------------------------
> GLOBAL 0:03:0
> 0 --- No Devices Assigned ---
>
> AFA0>
> AFA0>
> AFA0>
> AFA0> controller show automatic_failover
> Executing: controller show automatic_failover
> Automatic failover ENABLED [[Well????]]
>
> AFA0>
> AFA0>
> AFA0> [[I guessed to try...]]
> AFA0> controller rescan
> Executing: controller rescan
>
> AFA0>
> AFA0> task list
> Executing: task list
>
> Controller Tasks
>
> TaskId Function Done% Container State Specific1 Specific2
> ------ -------- ------- --------- ----- --------- ---------
> 101 Rebuild 0.1% 00 RUN 00000000 00000000
>
> AFA0> task list
> Executing: task list
>
> Controller Tasks
>
> TaskId Function Done% Container State Specific1 Specific2
> ------ -------- ------- --------- ----- --------- ---------
> 101 Rebuild 0.1% 00 RUN 00000000 00000000 [[Much Better!!!]]
>
> AFA0> task list
> Executing: task list
>
> Controller Tasks
>
> TaskId Function Done% Container State Specific1 Specific2
> ------ -------- ------- --------- ----- --------- ---------
> 101 Rebuild 0.2% 00 RUN 00000000 00000000
>
> AFA0> task list
> Executing: task list
>
> Controller Tasks
>
> TaskId Function Done% Container State Specific1 Specific2
> ------ -------- ------- --------- ----- --------- ---------
> 101 Rebuild 0.6% 00 RUN 00000000 00000000
>
> AFA0> task list
> Executing: task list
>
> Controller Tasks
>
> TaskId Function Done% Container State Specific1 Specific2
> ------ -------- ------- --------- ----- --------- ---------
> 101 Rebuild 0.6% 00 RUN 00000000 00000000
>
> AFA0> task list
> Executing: task list
>
> Controller Tasks
>
> TaskId Function Done% Container State Specific1 Specific2
> ------ -------- ------- --------- ----- --------- ---------
> 101 Rebuild 1.3% 00 RUN 00000000 00000000
>
> AFA0>
> AFA0>
> AFA0>
> AFA0> enclosure show slot
> Executing: enclosure show slot
>
> Enclosure
> ID (B:ID:L) Slot scsiId Insert Status [[note the REBUILD]]
> ----------- ---- ------ ------- ------------------------------------------
> 0 0:06:0 0 0:00:0 1 OK REBUILD FAILED CRITICAL ACTIVATE
> 0 0:06:0 1 0:01:0 1 OK REBUILD FAILED CRITICAL ACTIVATE
> 0 0:06:0 2 0:02:0 1 OK FAILED CRITICAL UNCONFIG ACTIVATE
> 0 0:06:0 3 0:03:0 1 OK REBUILD FAILED CRITICAL HOTSPARE ACTIVATE
>
> AFA0>
> AFA0>
> AFA0>
> AFA0> container list
> Executing: container list
> Num Total Oth Chunk Scsi Partition
> Label Type Size Ctr Size Usage B:ID:L Offset:Size
> ----- ------ ------ --- ------ ------- ------ -------------
> 0 RAID-5 67.7GB 32KB Open 0:00:0 64.0KB:33.8GB
> /dev/sda SEPT 0:01:0 64.0KB:33.8GB
> 0:03:0 64.0KB:33.8GB
>
>
> AFA0>
> AFA0>
> AFA0> disk list
> Executing: disk list
>
> B:ID:L Device Type Blocks Bytes/Block Usage Shared
> ------ -------------- --------- ----------- ---------------- ------
> 0:00:0 Disk 71132959 512 Initialized NO
> 0:01:0 Disk 71132960 512 Initialized NO
> 0:02:0 Disk 0 0 Offline NO
> 0:03:0 Disk 71132960 512 Initialized NO
>
> AFA0>
> AFA0>
> AFA0> disk show space
> Executing: disk show space
>
> Scsi B:ID:L Usage Size
> ----------- ---------- -------------
> 0:00:0 Container 64.0KB:33.8GB
> 0:00:0 Free 33.8GB:59.0KB
> 0:01:0 Container 64.0KB:33.8GB
> 0:01:0 Free 33.8GB:59.0KB
> 0:03:0 64.0KB:33.8GB [[0:02:0 is gone -- good]]
> 0:03:0 Free 33.8GB:59.0KB
>
> AFA0>
> AFA0>
> AFA0> [[ultimately, the REBUILD took ~2.5 hours]]
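For a rough sense of the rebuild rate, the ~2.5 hour figure can be combined with the 33.8GB partition size from `container list`. This is back-of-envelope arithmetic only, sketched in Python under two assumptions: that the rebuild writes one full 33.8GB partition to the spare, and that AFACLI's "GB" means 1024 MB.

```python
partition_gb = 33.8   # per-disk partition size from `container list`
rebuild_hours = 2.5   # approximate duration reported above

megabytes = partition_gb * 1024   # assumes 1 GB = 1024 MB
seconds = rebuild_hours * 3600
rate_mb_per_s = megabytes / seconds

print(f"{rate_mb_per_s:.1f} MB/s")  # 3.8 MB/s
```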
(3 of 4) FINAL STATE (post-failover):
> afacli
> ---------------------------------------------------------------------------------------------------------------------------------------------
> DELL PowerEdge Expandable RAID Controller 2 Command Line Interface
> Copyright 1998-2000 Adaptec, Inc. All rights reserved
> ---------------------------------------------------------------------------------------------------------------------------------------------
>
> FASTCMD> open afa0
> Executing: open "afa0"
>
> AFA0> enclosure show slot
> Executing: enclosure show slot
>
> Enclosure
> ID (B:ID:L) Slot scsiId Insert Status
> ----------- ---- ------ ------- ------------------------------------------
> 0 0:06:0 0 0:00:0 1 OK ACTIVATE
> 0 0:06:0 1 0:01:0 1 OK ACTIVATE
> 0 0:06:0 2 0:02:0 1 OK FAILED CRITICAL UNCONFIG ACTIVATE
> 0 0:06:0 3 0:03:0 1 OK HOTSPARE ACTIVATE [[why still see 'HOTSPARE'?]]
>
> AFA0>
> AFA0>
> AFA0> disk list
> Executing: disk list
>
> B:ID:L Device Type Blocks Bytes/Block Usage Shared
> ------ -------------- --------- ----------- ---------------- ------
> 0:00:0 Disk 71132959 512 Initialized NO
> 0:01:0 Disk 71132960 512 Initialized NO
> 0:02:0 Disk 0 0 Offline NO
> 0:03:0 Disk 71132960 512 Initialized NO
>
> AFA0>
> AFA0>
> AFA0> container list
> Executing: container list
> Num Total Oth Chunk Scsi Partition
> Label Type Size Ctr Size Usage B:ID:L Offset:Size
> ----- ------ ------ --- ------ ------- ------ -------------
> 0 RAID-5 67.7GB 32KB Open 0:00:0 64.0KB:33.8GB
> /dev/sda SEPT 0:01:0 64.0KB:33.8GB
> 0:03:0 64.0KB:33.8GB
>
>
> AFA0>
> AFA0>
> AFA0> disk show space
> Executing: disk show space
>
> Scsi B:ID:L Usage Size
> ----------- ---------- -------------
> 0:00:0 Container 64.0KB:33.8GB
> 0:00:0 Free 33.8GB:59.0KB
> 0:01:0 Container 64.0KB:33.8GB
> 0:01:0 Free 33.8GB:59.0KB
> 0:03:0 Container 64.0KB:33.8GB
> 0:03:0 Free 33.8GB:59.0KB
>
> AFA0>
> AFA0>
> AFA0> container show failover
> Executing: container show failover
>
> Container Scsi B:ID:L
> --------- ----------------------------------
> GLOBAL 0:03:0
> 0 --- No Devices Assigned ---
>
> AFA0>
> AFA0>
> AFA0> controller show automatic_failover
> Executing: controller show automatic_failover
> Automatic failover ENABLED
>
> AFA0>
> AFA0>
> AFA0>
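The change in container membership can be confirmed by comparing the pre- and post-failover `container list` output. Here is a minimal Python sketch, assuming the `B:ID:L Offset:Size` column layout shown in this post. The name `container_members` is hypothetical, and reading the `!` separator (as in `64.0KB!33.8GB`) as a marker for a dead partition is an inference from this session, not documented behavior.

```python
import re

# A member entry looks like "0:00:0 64.0KB:33.8GB"; in this session the
# failed member appeared with "!" in place of ":" between offset and size.
MEMBER = re.compile(r"(\d+:\d+:\d+)\s+[\d.]+[KMG]B([:!])[\d.]+[KMG]B")


def container_members(text):
    """Map each member's B:ID:L to True if its entry uses the '!' separator."""
    return {m.group(1): m.group(2) == "!" for m in MEMBER.finditer(text)}


before = """\
> 0 RAID-5 67.7GB 32KB Open 0:00:0 64.0KB:33.8GB
> /dev/sda SEPT 0:01:0 64.0KB:33.8GB
> 0:02:0 64.0KB!33.8GB
"""
after = """\
> 0 RAID-5 67.7GB 32KB Open 0:00:0 64.0KB:33.8GB
> /dev/sda SEPT 0:01:0 64.0KB:33.8GB
> 0:03:0 64.0KB:33.8GB
"""

pre, post = container_members(before), container_members(after)
print(sorted(set(pre) - set(post)))   # ['0:02:0']
print(sorted(set(post) - set(pre)))   # ['0:03:0']
```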
(4 of 4) REMAINING QUESTIONS
A. Given that `controller show automatic_failover` = ENABLED both before
and after, why was it necessary for me to issue a `controller rescan` in
order to make the rebuild task kick in?
B. Why does `enclosure show slot` still list drive (0,3,0) as having
state label 'HOTSPARE'?
C. How do I get rid of the (0,2,0) drive? What I am going to try is:
enclosure prepare slot 0 2
<physically remove the drive>
-end of post-
Cheers,
Dennis
--
Dennis G. Allard telephone: 1.310.399.4740
Ocean Park Software http://oceanpark.com
________________________________________________________________________
More information about the freebsd-scsi mailing list