Fwd: RFC: GEOM and hard disk LEDs
- Reply: Enji Cooper : "Re: RFC: GEOM and hard disk LEDs"
- Reply: Andrey Fesenko : "Re: RFC: GEOM and hard disk LEDs"
Date: Tue, 07 Feb 2023 23:30:43 UTC
Most modern SES backplanes have two LEDs per hard disk: a "fault" LED and a "locate" LED. You can control either one with sesutil(8) or, with a little more work, sg_ses from sysutils/sg3_utils. They're very handy for tasks like replacing a failed disk, especially in large enclosures. However, there isn't any way to control them automatically.

It would be very convenient if, for example, zfsd(8) could do it. Basically, it would just set the fault LED for any disk that has been kicked out of a ZFS pool, and clear it for any disk that is healthy or is being resilvered. But zfsd does not do that. Users' only options are to write a custom daemon or to use sesutil by hand. Instead of forcing all of us to write our own custom daemons, why not train zfsd to do it?

My proposal is to add boolean GEOM attributes for "fault" and "locate". A userspace program would be able to look up their values for any geom with DIOCGATTR. Setting them would require a new ioctl (DIOCSATTR?). The disk class would issue an ENCIOC_SETELMSTAT to actually change the LEDs whenever one of these attributes changes. GEOM transforms such as geli would simply pass the attribute through to the lower layer. Many-to-one transforms like gmultipath would pass the attribute through to all lower layers. zfsd could then set all vdevs' fault attributes when it starts up, and adjust individual disks' attributes as appropriate on an event-driven basis.

Questions:

* Are there any obvious flaws in this plan, any reasons why GEOM attributes can't be used this way?
* For one-to-many transforms like gpart the correct behavior is less clear: what if a disk has two partitions in two different pools, and one of them is healthy but the other isn't?
* Besides ZFS, are there any other systems that could take advantage?
* SATA enclosures use SGPIO instead of SES. SGPIO is too limited, IMHO, to be of much use at all. I suggest not even trying to make it work with this scheme.
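To make the gpart question concrete, here is a minimal userspace sketch in plain C. It is not GEOM or zfsd code, and it touches no real ioctl. Every name in it (vdev_health, wants_fault_led, merge_policy, disk_fault_led) is made up for illustration. It encodes the rule described above (fault LED on only for a disk kicked out of a pool, off for healthy or resilvering disks) and then shows two candidate ways a one-to-many transform could merge per-partition requests into the single LED that the disk has.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical health states, loosely following ZFS terminology. */
enum vdev_health {
	VDEV_HEALTHY,
	VDEV_RESILVERING,
	VDEV_FAULTED		/* kicked out of its pool */
};

/* Rule from the proposal: only a faulted vdev asks for the fault LED. */
bool
wants_fault_led(enum vdev_health h)
{
	return (h == VDEV_FAULTED);
}

/* Two candidate merge rules for a disk carrying several partitions. */
enum merge_policy {
	MERGE_ANY,		/* LED on if any partition asks for it */
	MERGE_ALL		/* LED on only if every partition asks for it */
};

/* Combine the per-partition requests into one LED state for the disk. */
bool
disk_fault_led(const enum vdev_health *parts, size_t n, enum merge_policy p)
{
	bool any = false, all = true;

	assert(parts != NULL || n == 0);
	if (n == 0)
		return (false);	/* no consumers, nothing to signal */
	for (size_t i = 0; i < n; i++) {
		bool w = wants_fault_led(parts[i]);

		any = any || w;
		all = all && w;
	}
	return (p == MERGE_ANY ? any : all);
}
```

For a disk with one healthy partition and one faulted partition, MERGE_ANY lights the LED and MERGE_ALL leaves it dark. Which of the two is right (or whether the answer should be configurable) is exactly the open question.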
<Originally posted to freebsd-geom; reposting here for a larger audience>