LSI - MR-Fusion controller driver <mrsas> patch and man page

Desai, Kashyap Kashyap.Desai at lsi.com
Tue Mar 25 19:13:59 UTC 2014


Borja:

I have read your comments. First of all, thanks for explaining with so much technical detail. I will definitely take this as feedback and will work internally to find out how best we can handle it. As of now you cannot use the mrsas driver as an mfi-style pass-through.

I have observed that most of the benefits you mention for pass-through are mainly needed by some manufacturing divisions, and we provide a temporary drop for those specific reasons. That driver exposes unconfigured drives to the OS so they can do FW upgrades of the drives without lots of manual work.

Let me explain one fundamental problem with pass-through drives.

Let's say you have 4 drives, all exposed to the OS as pass-through drives. The user cannot tell (using LSI-provided configuration utilities like storcli/MegaCli) whether those drives are in use by the end user. From the LSI config utilities they still appear as unconfigured drives, valid for creating a RAID volume. So managing physical disks becomes a big issue if unconfigured drives are exposed to and used by the user.

You can run the MR controller in JBOD mode, where all drives are converted to JBOD by default and are visible to the OS.

Also, LSI controllers support only the T10 thin provisioning standard. For JBOD drives, commands go to the actual drive, but for volumes the controller disables it by setting values in VPD page 0xb0.
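As a sketch of what that looks like to the initiator (field offsets per SBC-3; the sample bytes below are hypothetical, not captured from a real controller), zeroing the UNMAP limits in the Block Limits VPD page is how a volume can report thin provisioning as unavailable:

```python
import struct

# Decode the SCSI Block Limits VPD page (0xB0) as defined in SBC-3.
def decode_block_limits(page: bytes) -> dict:
    """Return the UNMAP (thin provisioning) limits from a Block Limits VPD page."""
    assert page[1] == 0xB0, "not a Block Limits VPD page"
    # MAXIMUM UNMAP LBA COUNT (bytes 20-23) and
    # MAXIMUM UNMAP BLOCK DESCRIPTOR COUNT (bytes 24-27), big-endian.
    max_lba, max_desc = struct.unpack_from(">II", page, 20)
    return {"max_unmap_lba_count": max_lba, "max_unmap_desc_count": max_desc}

# Hypothetical page as a RAID volume might report it: all limits zeroed,
# which tells the initiator that UNMAP/TRIM is effectively unavailable.
volume_page = bytes([0x00, 0xB0, 0x00, 0x3C]) + bytes(0x3C)
print(decode_block_limits(volume_page))
# -> {'max_unmap_lba_count': 0, 'max_unmap_desc_count': 0}
```

A JBOD drive, by contrast, would pass its own non-zero limits straight through from the disk.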

The <mrsas> driver maps volumes to bus 0 and SysPD devices to bus 1, so you can easily tell RAID from JBOD.
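That mapping means the scbus number in `camcontrol devlist` output is enough to tell the two apart. A rough sketch (the devlist text below is made up for illustration; real output varies by system):

```python
import re

# On mrsas, RAID volumes appear on bus 0 and JBOD/SysPD drives on bus 1,
# so the scbus number in `camcontrol devlist`-style output identifies each kind.
def classify(devlist: str) -> dict:
    kinds = {}
    for line in devlist.splitlines():
        m = re.search(r"at scbus(\d+).*\b(da\d+)\b", line)
        if m:
            kinds[m.group(2)] = "raid-volume" if int(m.group(1)) == 0 else "jbod"
    return kinds

# Hypothetical devlist output, for illustration only.
sample = """\
<LSI MR9361-8i 4.230.40-3739>  at scbus0 target 0 lun 0 (pass0,da0)
<ATA ST4000NM0023 GS0F>        at scbus1 target 8 lun 0 (pass1,da1)
"""
print(classify(sample))
# -> {'da0': 'raid-volume', 'da1': 'jbod'}
```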

LSI developed the CAM-based HBA device driver <mps> under the guidance of key FreeBSD folks. Our first goal is to bring the <mrsas> driver up to par with all the latest features (which the Linux <megaraid_sas> driver supports) and to use the same CAM-based interface as the <mps> driver.
We will add new features as and when requested, and prioritize accordingly.

Doug:
I still have to look at your query regarding the difference between Thunderbolt and Invader.



~ Kashyap


> -----Original Message-----
> From: Borja Marcos [mailto:borjam at sarenet.es]
> Sent: Tuesday, March 25, 2014 8:01 PM
> To: Desai, Kashyap
> Cc: Doug Ambrisko; scottl at netflix.com; Radford, Adam; Kenneth D. Merry;
> sean_bruno at yahoo.com; Mankani, Krishnaraddi; dwhite at ixsystems.com;
> Maloy, Joe; jpaetzel at freebsd.org; freebsd-scsi at freebsd.org; McConnell,
> Stephen
> Subject: Re: LSI - MR-Fusion controller driver <mrsas> patch and man page
> 
> 
> On Mar 25, 2014, at 12:42 PM, Desai, Kashyap wrote:
> 
> > Borja:
> >
> > The <mrsas> driver will attach RAID volumes and JBOD (SysPD) devices to the
> > CAM layer. It is not good to expose hidden RAID volumes, or what we call
> > pass-through devices here, to the OS for many reasons. Other than management
> > things like SMART monitoring, we cannot/should not do file system IO on
> > pass-through devices.
> 
> Of course it's not a good idea to expose drives that are part of a logical
> volume. But unconfigured drives should be exposed. Read on, please ;)
> 
> > With <mfi> it may be true that users always do file system IO on the <mfiX>
> > device and consider /dev/daX the pass-through device. With <mrsas> all
> > devices will be seen as <daX>; you cannot identify which is a pass-through
> > and which is a configured device using the LSI config utils.
> 
> Exposing devices as "da" should not be a merely "aesthetic" decision. The "da"
> driver has some stuff intended for direct access to disks, but not for logical
> volumes created by other devices such as advanced RAID cards. For example,
> the "da" device can issue TRIM commands, it reads device serial numbers
> (which can now be used by GEOM to identify disks), etc. Disks are more
> complicated now with that "advanced format" thing, and so I think it's very
> important for disks to be directly accessible if you want/need it. Of course
> other features might be introduced in the future: features that may be
> added to the "da" driver but which will probably be useless for a logical
> device, even outright inappropriate.
> 
> I would suggest you offer choice, and, most critically, offer a _clear_
> _choice_, as you have different kinds of customers. Some will want/need
> logical volumes and advanced RAID stuff, others won't. In some machines I
> have I am actually doing *both* things at the same time. I may have a RAID
> card based mirror for certain tasks, maybe with a UFS filesystem on it, relying
> on pass-through to the rest of the devices on which I use ZFS.
> 
> I think you should use a specific name for the logical devices, such as the mfi
> driver does. If I see a "mfid" device name it's clear that it's a logical device,
> not a "bare metal" hard disk, and that its behavior and features depend
> mainly on the logical device magic in the card.
> 
> And you should offer a perfectly transparent pass-through option, maybe
> restricted to disks not configured as "RAID" ones (to avoid accidents), I mean,
> what you now call "syspd" mode. These disks, ideally, should not be assigned
> to a special logical-volume driver like "mfisyspd" (or its equivalent), but to the
> "da" driver so that all of the features I expect from a bare metal hard disk
> would work. SMART, access to mode pages, detecting sector sizes, serial
> numbers, whatever, would work without hiccups.
> 
> Doing it the current "syspd" way means that any new feature added to disks
> must be added to the card firmware and to the "syspd" portion of the driver,
> while keeping a clear access to the SAS (or SATA-on-SAS) devices with no
> other manipulation would mean that the "da" driver would have immediate
> access to those features with no need to add support to the card firmware
> and driver.
> 
> 
> > It is not a complex code change if a pass-through device is required for
> > <mrsas>; it is just that exposing devices as pass-through is of no use and
> > more error prone.
> 
> It is certainly error prone if you are using logical devices. But if you are not
> using them (my case, and there are many others in this situation) the lack of a
> well-supported pass-through device can be error prone.
> 
> From a mere engineering point of view, it's a bad idea to add unnecessary
> software layers. Advanced RAID card features are a lifesaver for "classic"
> filesystems such as UFS/FFS, EXTwhateverFS, NTFS, etc, but can get in the
> way of other filesystems such as ZFS. ZFS intends to perform the functions of
> a RAID device itself.
> 
> > None of the LSI drivers does this, including <mps> and <mrsas> in FreeBSD
> > and <megaraid_sas> and <mpt2sas>/<mpt3sas> in Linux.
> 
> I've been using pass-through disks on Adaptec RAID cards (aac), and LSI Logic
> (mps and mfi) with different levels of success for years. It can be tricky, but
> ZFS works best with direct access to the disks.
> 
> > Can you express what functionality you think is missing if there is no
> > pass-through device?
> 
> Of course. Some of the functionalities I would miss by not using
> pass-through are:
> 
> - Inability to support problematic disks with "quirks". The "da" driver offers a
> flexible mechanism for that. If not using the da driver I lose that ability, and
> you will agree with me that getting a manufacturer (LSI) to update a card's
> firmware is much harder than doing it myself if needed.
> 
> - Inability to support future/special features without a firmware update for
> the card. An example is the diversity of block sizes in SSDs, or, more recently,
> TRIM for SSDs. ZFS on FreeBSD now supports TRIM, and it's important for
> performance and drive health. How does "syspd" handle it currently?
> 
> - Again I will insist on how additional software layers are a bad idea.
> 
> - Also, one of the "features" of LSI cards represents a serious operational
> issue: the persistent assignment of target numbers to disk serial numbers
> keeping a table of target-serial number mappings on NVRAM. There were
> some recent messages in this list regarding that problem. And it seems to
> happen even when using pass-through devices.
> 
> In the past I have had problems with ZFS and the "old" way of creating
> "pseudo JBOD" devices on LSI cards by creating a RAID 0 logical volume for
> each disk. For example, hot swapping a broken disk can be more error prone
> if, apart from just extracting a disk and adding a new one, I need to run a
> certain tool to have it effectively recognised by the card firmware. It adds
> unnecessary complexity. Moreover, in some cases (I can't recall the exact
> details, as it happened several years ago) it requires a reboot, which defeats
> the purpose of hot-swappable disks in the first place.
> 
> Please don't underestimate the operational impact of all this. An operator
> swapping a disk at 3 am should not need to do any complex check to
> determine the disk to extract. Nor should he/she require additional actions
> such as "mfiutil online this", activate that or, of course, a reboot, to have it
> recognised. ZFS (and, I presume, other advanced filesystems) has its own
> commands for that, which include their own sanity checks, doing their best to
> avoid trouble.
> 
> > Are you doing ZFS (file system IO) on pass-through devices?
> 
> Indeed I am. And I know there are many successful setups doing the same.
> 
> > If yes, then why can't you create JBOD/SysPD for that purpose?
> 
> It's explained above but I will summarize.
> 
> - Plain simple good engineering practice (avoiding unneeded software
> layers),
> - Access to special/future features on disks
> - Better observability (monitoring, etc)
> - Simpler operational procedures which means safer systems operations and
> better reliability.
> 
> Let me be brutally honest here and, please, take no offense but take it as
> feedback from a customer. Right now, advanced RAID cards can be more a
> liability than a desirable feature. Look at all the places where people
> repurpose RAID cards to be simple HBAs doing all sorts of unsupported
> voodoo.
> 
> Ideally this shouldn't happen, but we are somewhat forced by server
> manufacturers. At some point at least, for example, Dell refused to sell "IT
> mode" LSI2008 cards for internal devices, selling them just with external SAS
> connectors. So many people just repurpose the internal, "IR firmware" cards
> to "IT mode" so that they can be simple HBAs even though they still pose a
> problem with that target-serial number feature in NVRAM. I have an IBM
> server here with an onboard Invader card which, obviously, has many more
> features.
> 
> By defining some design guidelines for your hardware, firmware, and drivers,
> however, you can get to a win-win solution. If a card can fulfill both roles
> perfectly (advanced RAID features and plain HBA) it will no longer be a
> liability. The same hardware will be appropriate for many purposes, and it will
> be even better for the purchasing departments of us, your final customers:
> no need to keep track of several SKUs depending on the intended
> purpose. The same card usable for, say, NTFS and ZFS depending only on
> configuration.
> 
> And those design guidelines I am suggesting are simple:
> 
> - A fully functioning pass-through mode with a minimal surprise component,
> with the simplest, most transparent possible access from the CAM layer to
> the SAS/SATA commands so that those true pass-through devices get
> assigned to the right drivers such as "ses", "da", "sa", etc. This should be a
> core feature, not an add-on to somewhat ease monitoring.
> 
> - Making that transparent pass-through mode clearly distinguishable from
> the logical volume magic, so that the device name reflects its nature and
> purpose. "mfid" (or "mrsasd", or whatever you like) would be the logical
> devices, avoiding attaching them to the standard CAM drivers.
> 
> 
> You could just repurpose the "syspd" configuration in the newer
> cards/firmware versions so that drives marked as "syspd" become perfectly
> transparent pass-throughs.
> 
> Please consider it, I am sure you will have many happy customers.
> 
> (And I hope you endured reading this message until the end!!)
> 
> 
> Thank you!
> 
> 
> Borja.
> 


