[Bug 52401] New: Improper synchronization in sock_recv/sock_send leads to arbitrary reordering of completion of I/O requests

WineHQ Bugzilla wine-bugs at winehq.org
Sat Jan 15 15:25:01 CST 2022


https://bugs.winehq.org/show_bug.cgi?id=52401

            Bug ID: 52401
           Summary: Improper synchronization in sock_recv/sock_send leads
                    to arbitrary reordering of completion of I/O requests
           Product: Wine
           Version: 7.0-rc6
          Hardware: x86-64
                OS: Linux
            Status: UNCONFIRMED
          Severity: normal
          Priority: P2
         Component: ntdll
          Assignee: wine-bugs at winehq.org
          Reporter: jinoh.kang.kr at gmail.com
      Distribution: ---

The sock_recv function (dlls/ntdll/unix/socket.c) issues a 'recv_socket' call
to
the server to complete the I/O request.
Prior to calling the server, the function attempts to serve the I/O request
locally first by calling try_recv().

The consequence is that there are two types of asynchronous I/O request
(hereinafter "Async") sent to the server:

1. Eager: try_recv() call has succeeded before 'recv_socket'. There is no more
    I/O to be done.

2. Deferred: the request is still pending. When the server senses an incoming
    packet, async_recv_proc() (in client side) will be called to complete the
    I/O.

The problem is that eager Asyncs are _not_ prioritized before deferred Asyncs.
Therefore, deferred Asyncs may be completed before eagar Asyncs.

The following scenario illustrates the problem:

1. The socket's incoming buffer is initially empty.

2. Client: The application calls WSARecv(). In sock_recv(), try_recv() fails
    with STATUS_DEVICE_NOT_READY; therefore, a deferred Async is queued to the
    server. WSARecv() returns with error ERROR_IO_PENDING.

3. The socket receives packet [A] in the meantime. The socket's incoming buffer
    is no longer empty.

4. Client: The application calls WSARecv() again. In sock_recv(), try_recv()
    succeeds with packet [A]; therefore, an eager Async is queued to the
server.

5. The socket receives packet [B] in the meantime.

6. Server: the poll() loop detects this, and calls async_wake_up(
&sock->read_q,
    status ). This causes APC_ASYNC_IO for deferred Async to be called to the
    client process.

6. Client: While still in the second sock_recv(), the client does wait_async()
    on the returned wait handle. This causes the APC_ASYNC_IO (a system APC)
    to be dequeued.

7. Client (select loop): The client does a server select call. This returns
    STATUS_KERNEL_APC with APC_ASYNC_IO. The client calls try_recv() (from
    async_recv_proc), which succeeds with packet [B]. The client process
    completes the deferred Async with this packet.

8. Client (select loop): The client re-issues the select call after the APC.

8. Server: the wait on the async wait handle is satisfied, causing
    async_satisified() to be called. This in turn calls async_set_result(),
    which completes the eager Async (with packet [A]).

9. Client: The client exits sock_recv() and in turn WSARecv(), which reports
    immediate success.

(Ditto for sock_send and other similar asynchronous I/O requests.)

If the application uses a completion port, it will observe the deferred Async
first, and the eager Async second.  The deferred Async carries packet [B],
while the eager Async carriers packet [A].  This results in the application
receiving the packets in the wrong order.

--

Three possible solutions comes to mind:

1. Replace the call to try_recv()/try_send() in sock_recv()/sock_send() with
   STATUS_DEVICE_NOT_READY.  This may slightly reduce performance, since it
   always defers all I/O requests and forces them to go through the poll()
loop.

2. Make async_handoff() immediately call async_set_result() if the status and
   data are already set (i.e. the I/O has completed synchronously).  Since this
   affects other asynchronous operations as well, I'm not sure this approach
   is semantically correct.

3. Prioritize immediately satiable async wait handles _before_ system APCs when
   waiting for objects.  This approach too changes semantics, and appears much
   uglier than other solutions.

-- 
Do not reply to this email, post in Bugzilla using the
above URL to reply.
You are receiving this mail because:
You are watching all bug changes.



More information about the wine-bugs mailing list