MSI allocation regression, still to be corrected in HEAD and please MFC before release/12.0 gets branched
Scott Long
scottl at samsco.org
Tue Nov 13 18:45:15 UTC 2018
> On Nov 13, 2018, at 11:11 AM, Harry Schmalzbauer <freebsd at omnilan.de> wrote:
>
> Am 13.11.2018 um 19:02 schrieb Scott Long:
>>> On Nov 12, 2018, at 10:03 AM, Harry Schmalzbauer <freebsd at omnilan.de> wrote:
>>>
>>> Am 11.06.2018 um 20:28 schrieb Harry Schmalzbauer:
>>>> Am 05.06.2018 um 19:54 schrieb Scott Long:
>>>> …
>>>>>>>> Late in the 11.2 phase, I identified this commit as a regression for MSI (non-x) alloctaion.
>>>>>>>> I have an idea what probably causes the problem here (INTx allocation, although MSI (and MSI-x) capability):
>>>>>>>> disable_msix is not 0 (I need to disable MSI-x because of ESXi-passthru…).
>>>>>>>>
>>>>>>>> Corresponding lines:
>>>>>>>> {
>>>>>>>> device_t dev;
>>>>>>>> int error, msgs;
>>>>>>>>
>>>>>>>> dev = sc->mps_dev;
>>>>>>>> error = 0;
>>>>>>>> msgs = 0;
>>>>>>>>
>>>>>>>> if ((sc->disable_msix == 0) &&
>>>>>>>> ((msgs = pci_msix_count(dev)) >= MPS_MSI_COUNT))
>>>>>>>> error = mps_alloc_msix(sc, MPS_MSI_COUNT);
>>>>>>>> if ((error != 0) && (sc->disable_msi == 0) &&
>>>>>>>> ((msgs = pci_msi_count(dev)) >= MPS_MSI_COUNT))
>>>>>>>> error = mps_alloc_msi(sc, MPS_MSI_COUNT);
>>>>>>>> if (error != 0)
>>>>>>>> msgs = 0;
>>>>>>>>
>>>>>>>> sc->msi_msgs = msgs;
>>>>>>>> return (error);
>>>>>>>> }
>>>>>>>>
>>> …
>>>>>>> Hi Harry,
>>>>>>> You are correct about the bug. Please change the line at the top of the function that reads
>>>>>>> error = 0;
>>>>>>> to
>>>>>>> error = ENXIO;
>>>>>>> Let me know if that fixes the MSI problem for you.
>>>>
>>>> …
>>>>
>>> …
>>>> Index: src/sys/dev/mps/mps_pci.c
>>>> ===================================================================
>>>> --- sys/dev/mps/mps_pci.c (Revision 334948)
>>>> +++ sys/dev/mps/mps_pci.c (Arbeitskopie)
>>>> @@ -244,7 +244,7 @@
>>>> int error, msgs;
>>>>
>>>> dev = sc->mps_dev;
>>>> - error = 0;
>>>> + error = ENXIO;
>>>> msgs = 0;
>>>>
>>>> if ((sc->disable_msix == 0) &&
>>>>
>>>
>>> To my understanding, it's obvious that the way mps_pci_alloc_interrupts() currently works is unintended.
>>> This might not affect too many people, but is there a reason not to fix it?
>>>
>>> I already created a coresponding problem report: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=229267
>>> Anything else I should do?
>>>
>> Hi Harry,
>> Sorry for ignoring this for so long. I’m going to commit a fix today, but it won’t be the same one-line change.
>> Upon reviewing the code, I’d going to refactor it so it’s not so confusing and prone to these kinds of mistakes.
>> Thank you for the continued reminders to finish this.
>
> Hi Scott,
>
> thanks a lot, in fact I'm not surprised that you come up with a better solution than that quick fix :-)
> Had hoped someone else would do an intermediate commit to get it into 12.0 in time, so you won't feel any time pressure - good job needs the time it needs, as long as the right person is doing the job.
>
> Unfortunately I don't have a non-productive setup where I could test before release/12.0 will be branched – might be subject to change...
12.0 has completely different code from 11.x, and from my review of it last night it should be fine. If you have evidence that what’s currently in 12 is not working, please let me know ASAP.
Scott
More information about the freebsd-stable
mailing list