[PATCH 01/15] server: Wait before suspending threads in APC.

Rémi Bernon rbernon at codeweavers.com
Tue Jan 28 07:43:09 CST 2020


On 1/28/20 2:09 PM, Jacek Caban wrote:
> On 27.01.2020 23:19, Rémi Bernon wrote:
>> On 1/27/20 10:58 PM, Jacek Caban wrote:
>>> Hi Rémi,
>>>
>>>
>>> On 27/01/2020 13:07, Rémi Bernon wrote:
>>>> The APC_BREAK_PROCESS call creates a thread that raises an exception,
>>>> suspending the whole process. There's a race condition between this
>>>> thread and the APC thread notifying the server of its completion.
>>>
>>>
>>> What's wrong about that? Suspending process on thread creation 
>>> shouldn't be a problem. The debugger should process this event, which 
>>> will resume the new thread and let it continue to the break point. 
>>> Before introducing APC_BREAK_PROCESS I had to fix it in another place 
>>> in winedbg, see commit 1bb98982d682a85b82e, maybe there is similar 
>>> problem in gdb proxy?
>>>
>>>
>>> Thanks,
>>>
>>> Jacek
>>>
>>
>> In gdbproxy mode, winedbg calls DebugBreakProcess whenever Ctrl-C is 
>> pressed, then WaitForDebugEvent for the corresponding event.
>>
>> However *sometimes* DebugBreakProcess never returns and it cannot wait 
>> for the event and process it unless it is done in another thread 
>> -which would require changes in gdbproxy I would rather avoid. That 
>> may be how winedbg works in frontend mode?
>>
>> The reason why DebugBreakProcess sometimes returns but not always is 
>> that it requests the APC, which completion notification is sometimes 
>> received by the server before the exception is raised and all threads 
>> are suspended, but sometimes it isn't. I don't think this was 
>> intended, and it would probably be better if DebugBreakProcess returns 
>> -or not- in a consistent way.
> 
> 
> OK, I see, there is a problem. But I think that with your patch the race 
> is still there, it's just much less likely. Server may have already sent 
> the signal before client sent select request, but it may be blocked 
> until server call is finishes. Then, if select returns system APC, it 
> will be immediately interrupted by the signal.
> 
> 
> Jacek
> 

Let's name these locations:

[A]:
@@ -1591,6 +1595,7 @@ DECL_HANDLER(select)
              if (apc->call.type != APC_NONE &&
                  (reply->apc_handle = alloc_handle( current->process, 
apc, SYNCHRONIZE, 0 )))
              {
+                current->in_apc = 1;
                  reply->call = apc->call;
                  release_object( apc );
                  break;

[B]:
@@ -1575,6 +1577,8 @@ DECL_HANDLER(select)
          wake_up( &apc->obj, 0 );
          close_handle( current->process, req->prev_apc );
          release_object( apc );
+        current->in_apc = 0;
+        stop_thread_if_suspended( current );
      }

      reply->timeout = select_on( &select_op, op_size, req->cookie, 
req->flags, req->timeout );

[C]:
@@ -583,6 +584,7 @@ static void set_thread_info( struct thread *thread,
  /* stop a thread (at the Unix level) */
  void stop_thread( struct thread *thread )
  {
+    if (thread->in_apc) return;  /* currently doing apc, will be 
suspended on return */
      if (thread->context) return;  /* already inside a debug event, no 
need for a signal */
      /* can't stop a thread while initialisation is in progress */
      if (is_process_init_done(thread->process)) send_thread_signal( 
thread, SIGUSR1 );



When the APC_BREAK_PROCESS executes, a servicing thread calls the select 
request, which returns through [A], flagging the thread.

It returns from the select request, and does the APC call -which creates 
the exception thread. Then, I see two possible ordering:

1) The servicing thread calls first the select request to notify APC 
completion, goes to [B] which wakes the APC caller -resuming 
DbgBreakProcess on the debugger side, and unflags the thread but it's 
not suspended yet. The exception thread does the queue_exception_event 
request, which ultimately goes to [C] when suspending the process. The 
servicing thread is not flagged, so it is suspended.

2) The exception thread does the queue_exception_event first, which goes 
to [C], but as the servicing thread is still flagged, do not suspend it. 
The servicing thread then does the select request, goes to [B] which 
wakes the APC caller -resuming DbgBreakProcess- then unflags and 
suspends it as it should be.

Now I agree there could still be some race conditions in case 2):

* The additional thread suspension in [B] is done after the wake_up for 
the APC caller, it should probably be done before, just to be sure.

* If the debugger is calling WaitForDebugEvent in parallel to the 
DbgBreakProcess call. In this case, the debugger could maybe be notified 
of a debug event after the exception thread created it but before the 
servicing thread notifies of the APC completion. I don't if that's 
supposed to be supported, or how bad it is to have the servicing thread 
still alive for a bit.

If we want to have the later scenario be race-free, it would require to 
link the debug even only when the servicing thread is suspended. Or, to 
synchronize the servicing and exception thread somehow so that the 
servicing always does its select request first.
-- 
Rémi Bernon <rbernon at codeweavers.com>



More information about the wine-devel mailing list