[PATCH 4/6] ntoskrnl.exe: Send IRP_MN_SURPRISE_REMOVAL before removing children.

Zebediah Figura (she/her) zfigura at codeweavers.com
Mon Jun 21 23:08:08 CDT 2021


On 6/21/21 12:29 PM, Rémi Bernon wrote:
> On 6/18/21 11:06 PM, Zebediah Figura (she/her) wrote:
>> On 6/18/21 1:49 PM, Rémi Bernon wrote:
>>> On 6/18/21 5:59 PM, Zebediah Figura (she/her) wrote:
>>>> On 6/18/21 7:06 AM, Rémi Bernon wrote:
>>>>> So that mini driver gets notified first and has a chance to cancel
>>>>> pending IRPs.
>>>>
>>>> Which minidriver?
>>>>
>>>> Isn't it the responsibility of the child to terminate all pending IRPs
>>>> (and disallow further ones) on removal? See also [0], [1], [2], which
>>>> say that queued requests should be completed in (both, for some reason)
>>>> IRP_MN_SURPRISE_REMOVAL and IRP_MN_REMOVE_DEVICE.
>>>>
>>>> I think this deserves tests.
>>>>
>>>> [0]
>>>> https://docs.microsoft.com/en-us/windows-hardware/drivers/kernel/removing-a-device-in-a-function-driver
>>>>
>>>>
>>>>
>>>> [1]
>>>> https://docs.microsoft.com/en-us/windows-hardware/drivers/kernel/handling-an-irp-mn-surprise-removal-request
>>>>
>>>>
>>>>
>>>> [2]
>>>> https://docs.microsoft.com/en-us/windows-hardware/drivers/kernel/removing-a-device-in-a-bus-drive
>>>>
>>>>
>>>>
>>>
>>> I really don't have the big picture yet, so I meant the one implemented
>>> in driver_hid.c. If it doesn't cancel the IRP it has queued on device
>>> removal, the test never completes the device removal on Windows.
>>>
>>> On Wine, driver_hid never receives the IRP_MN_SURPRISE_REMOVAL if we
>>> don't call it before removing the "children" in ntoskrnl.exe. I don't
>>> really understand who are the children there, but driver_hid apparently
>>> isn't.
>>>
>>
>> Allow me to explain in detail. Apologies if any of this is old news to
>> you. If nothing else I'll adapt it to a wiki page later ;-)
>>
>>
>> TLC PDO
>>                       winehid.sys + hidclass.sys
>> HID FDO
>> -------------------------------------------------
>> HID PDO
>>                       winebus.sys
>> root FDO
>> -------------------------------------------------
>> root PDO
>>                       ntoskrnl.exe
>> [PNP manager]
>>
>>
>> In the case of the HID test driver it's a little different:
>>
>>
>> TLC PDO
>>                       driver_hid.dll + hidclass.sys
>> root/HID FDO
>> -------------------------------------------------
>> root/HID PDO
>>                       ntoskrnl.exe
>> [PNP manager]
>>
>>
>> A root PNP device is the one passed to AddDevice for a root PNP driver.
>> Usually it's intended to enumerate children; here we kind of cheat and
>> make the root PnP device the same device as the HID device. It's a bit
>> weird but it works.
>>
>> The TLC (top-level collection) PDO is managed entirely within
>> hidclass.sys; the minidriver never interacts with it. In theory there's
>> supposed to be one or more of these; we don't handle the "or more" case
>> yet.
>>
>> The application only ever interacts with the TLC PDO. It's actually
>> impossible to open the HID device stack (IIRC it fails with
>> STATUS_UNSUCCESSFUL). In response to read/write/hidd ioctls, hidclass
>> creates new irps and sends them to the HID device stack, i.e. the "HID
>> FDO/HID PDO" pair in the above diagrams.
>>
>> [Note that IRPs are always sent to the top of a stack first, so in this
>> case the minidriver has a chance to handle it. In a real HID driver the
>> minidriver would always handle it itself, rather than sending it down to
>> the PDO. You can mentally replace winehid with hidusb, and winebus with
>> (say) usbccgp. Obviously usbccgp doesn't know anything about HID. Our
>> architecture is a little weird, though, for various reasons.]
>>
>> There is usually no TLC FDO, but in theory there could be, and in
>> practice I believe mouse and keyboard drivers attach to the TLC PDO.
>>
>> On reflection, I think I may have confused myself, so this patch
>> actually makes more sense than it initially did. It'd still be nice to
>> have tests, though, and the commit message should ideally be more specific.
>>
> 
> Thanks, this is definitely not old news, although it's still quite blurry.
> 
>   From [1] I understand that the hid minidriver lives at the HID FDO
> level, which means that (minidriver, hidclass.sys) act as the bus FDO
> driver here.

Yep, that's correct.

> Reading [2] or [3] doesn't make it completely explicit if the children
> should receive the removal IRP first, but from what I understand it is
> how it's generally supposed to work.

So [4] actually explicitly says this. It's also something I gleaned from 
trying to run some driver, I think it may have even been the toaster sample.

I added some quick tests here, and they seem to confirm it.

> With that understanding I now believe it's correct to call the IRP on
> the children of the device first.
> 
> The problem I'm facing here comes from the way hidclass.sys PDO handles
> its removal. It waits for the reading thread to complete, which may be
> waiting on IRPs queued in the HID FDO.
> 
> We could either:
> 
> 1) call the surprise removal first on the device stack, then call the
> normal removal on the stack, giving a chance for the HID FDO to cancel
> its queued IRPs.
> or,
> 
> 2) combine the waits between halt_event and the pending IRP in the HID
> PDO device thread, so that the thread can terminate even with a pending IRP.

So it turns out that, as far as I can tell, the intended solution is 
mostly (1). Device removal is extremely complicated in Windows, and I've 
already gotten it wrong multiple times, but after some extensive reading 
and testing here's what I've found:

* When a device is reported missing via IoInvalidateDeviceRelations, 
IRP_MN_SURPRISE_REMOVAL is sent. And it turns out that this patch is 
sort of correct: it is indeed sent to the parent before 
IRP_MN_REMOVE_DEVICE is sent to the child. However, children are still 
supposed to receive IRP_MN_SURPRISE_REMOVAL before their parents.

It's a bit tricky to actually test this, though. You need a function 
driver, and that made my tests a lot more difficult (it wreaks havoc 
with WM_DEVICECHANGE for some reason). So I'm willing to retract my 
comment demanding more tests.

* And it is also correct that a lower (parent) device is supposed to 
complete any IRP under its jurisdiction during IRP_MN_SURPRISE_REMOVAL. 
[2] prescribes this, and the toaster sample bears it out.

* When the parent device is removed via SetupDiRemoveDevice, it does not 
receive IRP_MN_SURPRISE_REMOVAL. Instead, it receives 
IRP_MN_QUERY_REMOVE. This is sent recursively, just like 
IRP_MN_SURPRISE_REMOVAL, and if successful IRP_MN_REMOVE_DEVICE is sent.

* Microsoft doesn't document that you're supposed to complete IRPs in 
IRP_MN_QUERY_REMOVE. But a quick test shows that hidclass depends on it, 
and all of the few samples in the Windows 7 WDK that I checked seem to 
do so as well. I'll have to check if the native third-party HID driver I 
have does as well...

> Unrelated, but I'm not sure to understand why the HID PDO has to
> re-create IRPs every time it needs to call the minidriver (except for
> the device thread of course), wouldn't it be possible to just use the
> normal IRP propagation down the stack?

The tricky thing about it is that it's not the same stack. There's a 
driver stack, and then there's *the* driver stack, and unfortunately I 
don't even think that Microsoft uses different terminology to refer to 
either.

Generally speaking, an IRP can only be sent down a single 
FDO->(filter)->PDO chain. But the PDO might need to make some query 
about the driver stack below it, e.g. as a HID device might need to make 
a USB request.

> [1]
> https://docs.microsoft.com/en-us/windows-hardware/drivers/hid/hid-architecture
> 
> [2]
> https://docs.microsoft.com/en-us/windows-hardware/drivers/kernel/handling-an-irp-mn-surprise-removal-request
> 
> [3]
> https://docs.microsoft.com/en-us/windows-hardware/drivers/kernel/removing-a-device-in-a-bus-driver
> 

[4] 
https://docs.microsoft.com/en-us/windows-hardware/drivers/kernel/handling-an-irp-mn-remove-device-request



More information about the wine-devel mailing list