Order of device suspend/resume

Justin Hibbits chmeeedalf at gmail.com
Thu Dec 22 19:37:07 UTC 2016


On Dec 16, 2016, at 12:25 AM, Warner Losh wrote:

> On Thu, Dec 15, 2016 at 8:34 PM, Justin Hibbits <chmeeedalf at gmail.com> wrote:
>>
>> On Dec 15, 2016, at 3:38 PM, John Baldwin wrote:
>>
>>> On Thursday, December 15, 2016 11:40:33 AM Roger Pau Monné wrote:
>>>>
>>>> Hello,
>>>>
>>>> I'm currently dealing with a bug in the Xen suspend/resume sequence,
>>>> and I've found that lacking a way to order device priority during
>>>> suspend/resume is proving quite harmful for Xen (and maybe other
>>>> systems too). The current suspend/resume code simply scans the root
>>>> bus, and suspends/resumes every device based on the order they are
>>>> attached to their parents. The problem here is that there's no way
>>>> to tell that some devices should be resumed before others, for
>>>> example the event timers/time counters/uarts should definitely be
>>>> resumed before other devices, but that seems to happen mostly by
>>>> chance.
>>>>
>>>> Currently most time-related devices are attached directly to the
>>>> nexus, which means they will get resumed first, but for example the
>>>> uart is currently attached to the pci bus IIRC, which means it gets
>>>> resumed quite late. On Xen systems, this is even worse. The Xen PV
>>>> bus (that contains all Xen-related devices) is attached last
>>>> (because it tends to pick up unused memory regions for its own
>>>> usage) and this bus also contains the PV timecounter, which should
>>>> be resumed _before_ other devices, or else timecounting will be
>>>> completely screwed and things can get stuck in indefinitely long
>>>> loops (due to the fact that the timecounter is implemented based on
>>>> the uptime of the host, and that changes from host to host).
>>>>
>>>> In order to solve this I could add a hack to the Xen resume process
>>>> (which is already different from the ACPI one), but this looks
>>>> gross. I could also attach the Xen PV timer to the nexus directly
>>>> (as it was done before), but I also prefer to keep all Xen-related
>>>> devices in the same bus for coherency. The last option would be to
>>>> add some kind of suspend/resume priorities to the devices, and do
>>>> more than one suspend/resume pass. This is more complex and requires
>>>> more changes, so I would like to know if it would be helpful for
>>>> other systems, or if someone has already attempted to do it.
>>>
>>>
>>> I think Justin Hibbits had some patches to make use of the boot-time
>>> new-bus passes for suspend and resume which I think would help with
>>> this.  You suspend things in the reverse order of boot and resume
>>> operates in the same order as boot.
>>>
>>> --
>>> John Baldwin
>>
>>
>> John is right.  I have a (somewhat abandoned due to time and focus)
>> branch, https://svnweb.freebsd.org/base/projects/pmac_pmu/ which has
>> the necessary code working mostly on PowerPC.  The diff can be found
>> at https://reviews.freebsd.org/D203 too.
>
> Cool. Does it have a mechanism similar to the attach code that lets
> you run again at each pass?
>
> Warner

Not exactly.  The code will call BUS_SUSPEND_CHILD() as it rolls back
the pass levels, and stop on errors.  The meat is in a rewrite of
bus_generic_suspend() in that review.

- Justin
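
For readers following the thread, the ordering discussed above can be
sketched in plain userland C. This is only an illustration, not the code
from the review: the pass levels, struct dev, and every function name
below are hypothetical, and undoing a partial suspend on error is an
assumption (the message above only says the code stops on errors).

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical pass levels, loosely modeled on the new-bus boot passes. */
enum pass { PASS_BUS = 10, PASS_TIMER = 50, PASS_DEFAULT = 100 };

struct dev {
	const char *name;
	enum pass   pass;      /* pass at which the device attached */
	int         fail;      /* nonzero: simulate a suspend error */
	int         suspended;
};

static const enum pass passes[] = { PASS_BUS, PASS_TIMER, PASS_DEFAULT };
#define NPASSES (sizeof(passes) / sizeof(passes[0]))

static char oplog[256];        /* records the order of operations */

static void
log_op(char op, const struct dev *d)
{
	size_t len = strlen(oplog);

	assert(len < sizeof(oplog));
	snprintf(oplog + len, sizeof(oplog) - len, "%c:%s ", op, d->name);
}

static int
dev_suspend(struct dev *d)
{
	if (d->fail)
		return (-1);
	d->suspended = 1;
	log_op('S', d);
	return (0);
}

static void
dev_resume(struct dev *d)
{
	if (!d->suspended)
		return;
	d->suspended = 0;
	log_op('R', d);
}

/* Resume in boot order: lowest pass first. */
static void
resume_all(struct dev *devs, size_t n)
{
	for (size_t p = 0; p < NPASSES; p++)
		for (size_t i = 0; i < n; i++)
			if (devs[i].pass == passes[p])
				dev_resume(&devs[i]);
}

/*
 * Suspend in reverse boot order: highest pass first.  On the first
 * error, stop and (by assumption) resume whatever was suspended.
 */
static int
suspend_all(struct dev *devs, size_t n)
{
	for (size_t p = NPASSES; p-- > 0; )
		for (size_t i = n; i-- > 0; )
			if (devs[i].pass == passes[p] &&
			    dev_suspend(&devs[i]) != 0) {
				resume_all(devs, n);
				return (-1);
			}
	return (0);
}
```

With a timecounter at the timer pass and a uart at the default pass, the
uart is suspended before the timecounter and the timecounter is resumed
before the uart, regardless of which bus either one hangs off.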


More information about the freebsd-arch mailing list