[PATCH] wineusb.sys: Remove device from the device list on PDO removal

Zebediah Figura (she/her) zfigura at codeweavers.com
Fri Aug 20 20:43:59 CDT 2021


On 8/2/21 3:04 PM, Zebediah Figura (she/her) wrote:
> On 8/2/21 11:59 AM, Torge Matthies wrote:
>> Fixes a crash when shutting down a prefix.
>>
>> Wine-Bug: https://bugs.winehq.org/show_bug.cgi?id=51479
>> Signed-off-by: Torge Matthies <openglfreak at googlemail.com>
>> ---
>>   dlls/wineusb.sys/wineusb.c | 3 +++
>>   1 file changed, 3 insertions(+)
>>
>> diff --git a/dlls/wineusb.sys/wineusb.c b/dlls/wineusb.sys/wineusb.c
>> index fae297915fc..28a36b9f380 100644
>> --- a/dlls/wineusb.sys/wineusb.c
>> +++ b/dlls/wineusb.sys/wineusb.c
>> @@ -413,12 +413,15 @@ static NTSTATUS pdo_pnp(DEVICE_OBJECT 
>> *device_obj, IRP *irp)
>>               break;
>>           case IRP_MN_REMOVE_DEVICE:
>> +            EnterCriticalSection(&wineusb_cs);
>>               remove_pending_irps(device);
>>               libusb_unref_device(device->libusb_device);
>>               libusb_close(device->handle);
>> +            list_remove(&device->entry);
>>               IoDeleteDevice(device->device_obj);
>> +            LeaveCriticalSection(&wineusb_cs);
>>               ret = STATUS_SUCCESS;
>>               break;
>>
> 
> This looks wrong; the device could have been removed by 
> remove_usb_device().
> 
> [1] implies that we actually shouldn't remove the device at all unless 
> we reported it missing via IRP_MN_QUERY_DEVICE_RELATIONS, which probably 
> implies that we shouldn't remove and delete the device here, but should 
> instead remove (and delete?) it from the parent 
> IRP_MN_SURPRISE_REMOVAL/IRP_MN_REMOVE_DEVICE. It's not particularly 
> clear though. I'll have to run some tests and check native drivers, but 
> I won't have access to those until next week.
> 
> [1] 
> https://docs.microsoft.com/en-us/windows-hardware/drivers/kernel/removing-a-device-in-a-bus-driver 
> 
> 

(TL;DR: Lots of research and observation follows. I'm currently writing 
patches that solve the initial bug.)

I've done more research, and the situation is horrible. I think this is 
not only the worst API I've ever seen, but it would be the worst even if 
I wasn't considering the fact that it's a kernel API.

The first, horrible thing, is that despite what I said in [2], 
IRP_MN_REMOVE_DEVICE isn't actually guaranteed to be the last IRP 
received. It's only guaranteed to be the last IRP under certain 
conditions, which is why [1] instructs to retain the PDO if it hasn't 
actually been physically removed (*and* that this fact has been reported 
to the system).

In particular, apparently if you expose a child PDO with RawDeviceOK set 
to 1 but which has a function driver, Windows will send 
IRP_MN_START_DEVICE, then IRP_MN_QUERY_REMOVE_DEVICE, then 
IRP_MN_REMOVE_DEVICE, then attach a function driver and restart the same 
device object. Which means that you can't actually delete the device.

What then becomes unclear is what you're supposed to do if the parent 
device was deleted (which is, essentially, what we do at shutdown as 
well. I don't know if Windows does this at shutdown; I remember trying 
to test and being unable to conclude that Windows does or doesn't try to 
tear down the device tree—if it does, it does so late that it's 
impossible to log this fact anywhere).

According to [3] a function driver is supposed to defer deleting the 
child PDO even when the child PDO has received IRP_MN_REMOVE_DEVICE, and 
instead delete it in the parent FDO's IRP_MN_REMOVE_DEVICE. You'd think 
that the child could take its cue from IRP_MN_SURPRISE_REMOVAL that it 
can safely delete itself, but no.

The "toaster" driver sample from the Windows SDK, which has been my 
source of a lot of this, corroborates this—it deletes all child PDOs in 
the parent FDO's IRP_MN_REMOVE_DEVICE. However, it also says this:

         // Typically the system removes all the  children before
         // removing the parent FDO. If for any reason child Pdos are
         // still present we will destroy them explicitly, with one 
exception -
         // we will not delete the PDOs that are in 
SurpriseRemovePending state.

And it also does something which I can't find documented anywhere: in 
the parent FDO's IRP_MN_SURPRISE_REMOVAL it marks all child devices as 
"reported missing" so that they *are* deleted in the (child PDO) 
IRP_MN_REMOVE_DEVICE. It also removes them from the device list so that 
they aren't reported in a subsequent call to 
IRP_MN_QUERY_DEVICE_RELATIONS. It doesn't actually call 
IoInvalidateDeviceRelations() so this happens the "usual" way—no, it 
just pretends that the PnP manager already knows.

And then I tried writing some tests with native hidclass to see what 
that does, and as far as I can tell it doesn't do that thing: it keeps 
reporting the child in response to IRP_MN_QUERY_DEVICE_RELATIONS, even 
if I call that IRP from the (child PDO) IRP_MN_REMOVE_DEVICE. And it 
doesn't seem to delete the child PDO either, even as late as the 
minidriver FDO's IRP_MN_REMOVE_DEVICE. (Maybe it calls the minidriver 
IRP_MJ_PNP handler and then deletes children? It's basically impossible 
to confirm that it does delete the children and doesn't leak them...)

A different driver sample, "gameenum" from the Windows XP DDK, seems to 
match the documentation and hidclass behaviour. It doesn't delete 
anything in FDO IRP_MN_SURPRISE_REMOVAL, and doesn't delete anything in 
the PDO removal IRPs, but deletes all of the child PDOs in the FDO 
IRP_MN_REMOVE_DEVICE. It also keeps reporting devices in 
IRP_MN_QUERY_DEVICE_RELATIONS until the FDO IRP_MN_REMOVE_DEVICE.

Ultimately I think it might not even matter all that much. I think we 
could get away with removing the device in the PDO IRP_MN_REMOVE_DEVICE 
and things would still work. I also checked and Windows seems to ignore 
if IoInvalidateDeviceRelations() is called while removing a device; 
which is especially important for Wine as we can reasonably get hotplug 
events while trying to shut down the system. That does mean that we 
won't always get IRP_MN_REMOVE_DEVICE for a child device, though, which 
means that we will always need to clean up leftover child devices in the 
FDO IRP_MN_REMOVE_DEVICE. Given that it may not be a bad idea to just 
always clean up child devices there.

That's roughly how we handled things before 84c9206ca. That commit 
reused "removed" for a second purpose—to finish and prevent any pending 
IRPs so that a child or grandchild device wouldn't get stuck waiting for 
them in IRP_MN_REMOVE_DEVICE (which of course executes after parent 
SURPRISE_REMOVAL but before parent REMOVE_DEVICE). I think it would be 
reasonable to also unlink the device in the child (PDO) 
IRP_MN_SURPRISE_REMOVAL, and just continue to reuse "removed" for 
multiple purposes.

We could also follow closer to the documentation and the hidgame sample, 
though, and use separate per-device flags to track whether the device 
has been unplugged and whether it is delete-pending (as it were). It's 
not really clear to me that we need to, though maybe it's better for 
clarity anyway...

ἔρρωσθε,
Zebediah

[1] 
https://docs.microsoft.com/en-us/windows-hardware/drivers/kernel/removing-a-device-in-a-bus-driver

[2] https://www.winehq.org/pipermail/wine-devel/2021-June/189773.html

[3] 
https://docs.microsoft.com/en-us/windows-hardware/drivers/kernel/removing-a-device-in-a-function-driver



More information about the wine-devel mailing list