[Bug 222066] mpt crash in virtualbox

Tue Sep 5 12:28:43 UTC 2017

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=222066

            Bug ID: 222066
           Summary: mpt crash in virtualbox
           Product: Base System
           Version: CURRENT
          Hardware: Any
                OS: Any
            Status: New
          Severity: Affects Only Me
          Priority: ---
         Component: kern
          Assignee: freebsd-bugs at FreeBSD.org
          Reporter: avg at FreeBSD.org

We are seeing a kernel crash in the mpt driver while running FreeBSD as a guest
in VirtualBox.

The problem seems to be caused by VirtualBox setting the request frame size
parameter to 128 32-bit words, i.e. 512 bytes:

pReply->IOCFacts.u16RequestFrameSize  = 128;    /* @todo Figure out where it is needed. */

The driver does not seem to be able to cope with such a large frame size.
It could be argued that the bug is on the VirtualBox side.  I am not sure
whether it really needs such a size (especially given the comment).

Anyway, it could be useful to be able to handle that value in mpt.

A bit of code analysis follows.
In the code we have the following important definitions:

/* MPT_RQSL- size of request frame, in bytes */
#define MPT_RQSL(mpt)           (mpt->ioc_facts.RequestFrameSize << 2)

#define MPT_MAX_REQUESTS(mpt)   512
#define MPT_REQUEST_AREA        512
#define MPT_SENSE_SIZE          32      /* included in MPT_REQUEST_AREA */
#define MPT_REQ_MEM_SIZE(mpt)   (MPT_MAX_REQUESTS(mpt) * MPT_REQUEST_AREA)

So, the code allocates 512 request buffers of 512 bytes each as a single
contiguous (both physically and virtually) buffer suitable for DMA between the
driver and the hardware (see mpt_dma_buf_alloc).
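
The layout arithmetic can be sketched in C as follows.  This is only an
illustration and not driver code: the helper names are hypothetical, and
MPT_MAX_REQUESTS is simplified to a plain constant.  The position of the sense
area at the end of each slot is inferred from the kgdb output further down,
where sense_vbuf is 480 bytes past req_vbuf.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Constants as quoted above (MPT_MAX_REQUESTS simplified to a constant). */
#define MPT_MAX_REQUESTS   512
#define MPT_REQUEST_AREA   512
#define MPT_SENSE_SIZE     32   /* included in MPT_REQUEST_AREA */
#define MPT_REQ_MEM_SIZE   (MPT_MAX_REQUESTS * MPT_REQUEST_AREA)

/* Hypothetical helper: byte offset of request slot 'index' in the DMA area. */
static size_t
req_slot_offset(unsigned index)
{
	assert(index < MPT_MAX_REQUESTS);
	return ((size_t)index * MPT_REQUEST_AREA);
}

/* Hypothetical helper: offset of the sense bytes inside one slot. */
static size_t
sense_offset_in_slot(void)
{
	return (MPT_REQUEST_AREA - MPT_SENSE_SIZE);
}

/* Hypothetical helper mirroring MPT_RQSL: 32-bit words to bytes. */
static unsigned
rqsl_bytes(uint16_t request_frame_size_words)
{
	return ((unsigned)request_frame_size_words << 2);
}
```

With these numbers the whole area is 262144 bytes, slot 511 starts at offset
261632, and a reported RequestFrameSize of 128 words is already as large as an
entire slot, sense bytes included.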

When the crash happens, it's a page fault here:
memcpy <= mpt_read_cfg_page <= mpt_action

The problematic request:
(kgdb) p *req
$1 = {links = {tqe_next = 0xfffffe000179c390, tqe_prev = 0xfffffe0001798438}, state = 10, index = 511, IOCStatus = 0, ResponseCode = 0,
  serno = 37161, ccb = 0x0, req_vbuf = 0xfffffe0000286e00, sense_vbuf = 0xfffffe0000286fe0, req_pbuf = 2089315840, sense_pbuf = 2089316320,
  dmap = 0x0, chain = 0x0, callout = {c_links = {le = {le_next = 0x0, le_prev = 0xfffffe0001522df8}, sle = {sle_next = 0x0}, tqe = {
        tqe_next = 0x0, tqe_prev = 0xfffffe0001522df8}}, c_time = 3787081006447, c_precision = 1342177187, c_arg = 0xfffff80038870000,
    c_func = 0xffffffff804a4c30 <mpt_timeout>, c_lock = 0xfffffe0001798008, c_flags = 0, c_iflags = 0, c_cpu = 0}}

We see that index is 511, so this is the last request object with its buffer in
the last 512 bytes of the contiguous buffer.
We see that the page fault happens right beyond the allocated buffer region.

So, my interpretation is that the RequestFrameSize reported by the [emulated]
hardware is too large to be handled with the hardcoded request buffer size.
The problem is masked for all buffers but the last, because the hardware would
simply overwrite the next request buffer and the driver would read from it.
So, there is no page fault, although there is a chance of silent data
corruption.  For the last buffer there is obviously no next buffer and we get
the page fault.
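
To make the boundary concrete, here is a small sketch that recomputes the
limits of the DMA area from the kgdb values above.  It assumes that slot i
starts at base + i * MPT_REQUEST_AREA, which matches the description of the
allocation but is not taken from the driver source; the helper names are
hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define MPT_MAX_REQUESTS 512
#define MPT_REQUEST_AREA 512

/* Values copied from the kgdb output above. */
static const uint64_t req_vbuf   = 0xfffffe0000286e00ULL;
static const uint64_t sense_vbuf = 0xfffffe0000286fe0ULL;
static const unsigned req_index  = 511;

/* Inferred start of the area (assumption: slots are laid out back to back). */
static uint64_t
area_base(void)
{
	assert(req_index < MPT_MAX_REQUESTS);
	return (req_vbuf - (uint64_t)req_index * MPT_REQUEST_AREA);
}

/* First address past the end of the area. */
static uint64_t
area_end(void)
{
	return (area_base() + (uint64_t)MPT_MAX_REQUESTS * MPT_REQUEST_AREA);
}

/* Bytes between req_vbuf and the end of the area. */
static uint64_t
bytes_left_in_area(void)
{
	return (area_end() - req_vbuf);
}

/* Hypothetical check: is [req_vbuf + off, req_vbuf + off + len) inside? */
static int
access_in_area(uint64_t off, uint64_t len)
{
	return (off + len <= bytes_left_in_area());
}
```

Under that assumption the area ends at 0xfffffe0000287000, exactly 512 bytes
after req_vbuf, so anything the driver or the device places after a full
512-byte frame in this slot lies outside the allocation.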

Conclusions:
- first of all, the driver should check MPT_RQSL against MPT_REQUEST_AREA and
refuse to attach if the request frame size is too large
- we can consider bumping MPT_REQUEST_AREA to, e.g., 1024, although adding the
check is probably the better option
- the Linux driver seems to cap the request size at 128 bytes:

http://elixir.free-electrons.com/linux/latest/source/drivers/message/fusion/mptbase.c#L3214
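
The first and the last points could look roughly like the sketch below.  This
is not a patch: the struct, the function names and the return convention are
hypothetical stand-ins, and the limit is passed in by the caller because the
right value is a judgment call.  One thing to note is that a strict "greater
than MPT_REQUEST_AREA" test would not reject the 512-byte value seen here, so
a limit such as MPT_REQUEST_AREA - MPT_SENSE_SIZE may be needed; that choice
is an assumption on my part.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MPT_REQUEST_AREA       512
#define MPT_SENSE_SIZE         32   /* included in MPT_REQUEST_AREA */
#define FRAME_CAP_BYTES        128  /* cap attributed to the Linux driver above */

/* Hypothetical stand-in for the relevant part of the IOC facts reply. */
struct ioc_facts_sketch {
	uint16_t RequestFrameSize;  /* in 32-bit words */
};

/* Same conversion as MPT_RQSL: words to bytes. */
static unsigned
rqsl_bytes(const struct ioc_facts_sketch *f)
{
	assert(f != NULL);
	return ((unsigned)f->RequestFrameSize << 2);
}

/* Returns 0 if the frame fits within 'limit' bytes, -1 to refuse attach. */
static int
check_request_frame_size(const struct ioc_facts_sketch *f, unsigned limit)
{
	return (rqsl_bytes(f) > limit ? -1 : 0);
}

/* Alternative: clamp the size the driver will use instead of refusing. */
static unsigned
clamp_request_frame_size(const struct ioc_facts_sketch *f)
{
	unsigned sz = rqsl_bytes(f);

	return (sz < FRAME_CAP_BYTES ? sz : FRAME_CAP_BYTES);
}
```

With the VirtualBox value of 128 words, the check against 480 bytes fails and
the clamp yields 128 bytes, while a check against the full 512 bytes passes.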
