ideas and questions for implementation of MessageMode in Named Pipes

Thu Feb 5 12:05:58 CST 2009

On Thu, Feb 5, 2009 at 4:08 PM, Juan Lang <juan.lang at gmail.com> wrote:
>> i think i've finally come up with an idea that i believe will work:
>> double-socketing.
> (snip)
>> it'll be ok (and desirable) to allow multiple "readers" of the "read"
>> socket. what you _don't_ want is more than one reader trying to
>> indicate "please start sending a new message" whilst there are other
>> reader(s) still grabbing the previous message.
>>
>> so i believe that a critical section (copying the style of the code
>> around server_get_unix_fd) each around "please start a new message"
>> and "please send some more read-data" would be sufficient.
>
> Out of curiosity, why is this better than a single socket with the
> length of each message prepended?

 a length prepended is still needed.

 what i had implemented so far (and demonstrated that it's flawed,
thanks to the "threadread" test) is:

 * writes send the data with its length prepended
 * wineserver-function get_named_pipe_info
 * wineserver-function read_named_pipe

then:

* in the client (ntdll) instead of using read() you use
server_read_named_pipe() iff it's a pipe.  BUT, before you do that,
you do a poll() on the unix_handle (obtained using
server_get_unix_fd()).  this is your "blocking mode".  so, although
you tell the _server_ to get the data for you, you _still_ have to do
"block on socket".  and, because you can't use read() to do the
"blocking", you have to use poll() instead.

* in the wineserver-function, read_named_pipe MUST NOT EVER block on
reads, but it is still being asked to perform _a_ read, and so there
is some code that sets O_NONBLOCK, followed by a recv using MSG_PEEK
to double-check that there's data, and _then_ a read is used to
actually obtain the data.
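
as a rough illustration of the two halves described above, here is a minimal sketch in plain POSIX C.  the names wait_readable() and try_read_nonblocking() are made up for this example (they are not wine or wineserver functions), and a socketpair stands in for the pipe's unix fd.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <sys/socket.h>
#include <unistd.h>

/* client side (hypothetical helper): block until fd is readable, without
 * consuming anything.  returns 1 if readable, 0 on timeout, -1 on error.
 * note: a retry after EINTR restarts the full timeout. */
int wait_readable(int fd, int timeout_ms)
{
    struct pollfd pfd;
    int ret;

    pfd.fd = fd;
    pfd.events = POLLIN;
    pfd.revents = 0;

    do {
        ret = poll(&pfd, 1, timeout_ms);
    } while (ret < 0 && errno == EINTR);

    if (ret <= 0) return ret;   /* 0: timeout, -1: error */
    return (pfd.revents & (POLLIN | POLLHUP)) ? 1 : -1;
}

/* server side (hypothetical helper): must never block.
 * returns the number of bytes read, 0 if nothing is available right now
 * (or the peer has closed), -1 on error. */
ssize_t try_read_nonblocking(int fd, void *buf, size_t len)
{
    char probe;
    ssize_t n;
    int flags = fcntl(fd, F_GETFL);

    if (flags < 0) return -1;
    if (fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0) return -1;

    /* double-check that there is data, without removing it */
    n = recv(fd, &probe, 1, MSG_PEEK);
    if (n < 0) return (errno == EAGAIN || errno == EWOULDBLOCK) ? 0 : -1;
    if (n == 0) return 0;

    /* another reader could still win the race here; since the fd is
     * non-blocking, read() then fails with EAGAIN instead of hanging */
    return read(fd, buf, len);
}
```

note that try_read_nonblocking() leaves O_NONBLOCK set on the descriptor when it returns.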

the problem here is that setting O_NONBLOCK in wineserver, in order to
not end up permanently hanging wineserver, is *interfered with* by the
requirement to have "blocking mode" in the client (ntdll).

so there are two conflicting requirements, both placed on the same filedescriptor.
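
a small sketch of why the two sides collide, assuming the client's descriptor refers to the same open file description as the server's (which is the case for descriptors made with dup() or passed over a unix socket): O_NONBLOCK is a file status flag, so setting it through one descriptor is visible through the other.  nonblock_is_shared() is a made-up name for this example.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* sets O_NONBLOCK through fd_a, then reports whether fd_b sees it.
 * returns 1 if visible, 0 if not, -1 on error (or if fd_b already had
 * the flag set, in which case nothing can be concluded). */
int nonblock_is_shared(int fd_a, int fd_b)
{
    int fa = fcntl(fd_a, F_GETFL);
    int fb = fcntl(fd_b, F_GETFL);

    if (fa < 0 || fb < 0) return -1;
    if (fb & O_NONBLOCK) return -1;
    if (fcntl(fd_a, F_SETFL, fa | O_NONBLOCK) < 0) return -1;

    fb = fcntl(fd_b, F_GETFL);
    if (fb < 0) return -1;
    return (fb & O_NONBLOCK) ? 1 : 0;
}
```

with a pipe, the read end and a dup() of it share the flag, while the write end (a separate open file description) does not.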

and no, you _can't_ do "everything in wineserver", because you _still_
need a mechanism to be able to tell the client (ntdll) to "block".

which is why i asked if there was a way for wineserver to tell a
client wine thread/process to "go to sleep" [and didn't get an
answer].

> One difficulty I'm having trouble getting my head around is, isn't the
> data removed from a socket once it's read by any process?

 or a thread - yes.  fortunately, read() and write() to/from sockets
are at least atomic (whew).

>  Or does it
> remain for each process to read independently?

 no, thank god.  the data is removed.  except if you use recv() with
MSG_PEEK, of course, but then the data is _guaranteed_ not to be
removed.  ever.
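
to make the difference concrete, a minimal sketch on a local socketpair (the function name is made up for this example): peeking returns the same bytes every time, and a normal read() removes them so that the next reader finds nothing.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/socket.h>
#include <unistd.h>

/* returns 0 if the socket behaves as described, -1 otherwise */
int peek_vs_read_demo(void)
{
    int sv[2], ok = -1;
    char a[4], b[4], c[4], d[4];

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0) return -1;
    if (write(sv[1], "data", 4) != 4) goto out;

    /* peeking twice returns the same bytes both times */
    if (recv(sv[0], a, 4, MSG_PEEK) != 4) goto out;
    if (recv(sv[0], b, 4, MSG_PEEK) != 4) goto out;
    if (a[0] != 'd' || b[0] != 'd') goto out;

    /* a normal read removes them ... */
    if (read(sv[0], c, 4) != 4) goto out;

    /* ... so a second (non-blocking) read finds nothing left */
    if (fcntl(sv[0], F_SETFL, fcntl(sv[0], F_GETFL) | O_NONBLOCK) < 0) goto out;
    if (recv(sv[0], d, 4, 0) != -1) goto out;
    if (errno != EAGAIN && errno != EWOULDBLOCK) goto out;

    ok = 0;
out:
    close(sv[0]);
    close(sv[1]);
    return ok;
}
```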

> I guess I'm not that
> familiar with how reading from a socket works.  I'd always assumed
> that the former was true, and that therefore the only correct approach
> would be to buffer message-mode named pipe data in the server.

 "relax, luther - it's much worse than you think" [Mission Impossible I]

 :)

> This
> is ugly (and slow),

 well... it's unfortunate, but that's the way it's going to have to
be.  if i was told "there's these absolutely fantastic
guaranteed-message-size, guaranteed-message-order characteristics you
can get from message-mode named pipes, but they're a bit slower than
normal pipes" when developing an application i'd go "great!  i don't
care if it's a bit slower, the features make my life a _lot_ easier".

 _somewhere_ there has to be a "break" between "messages".

and you can't stop people from sending data down the pipe (in NtWritePipe).

therefore, logically, you have to have a "firewall" between "send" and
"receive".  and, because you _also_ need "blocking on read"
characteristics (across multiple processes and threads), the most
sensible way to implement that is with a unix filedescriptor on which
all of the clients (ntdll) can block.

_but_.. if you also allow any _other_ process to send data down the
_same_ unix filedescriptor (from NtWritePipe), you've just gone and
screwed with the message-boundaries.

... actually, it may turn out to be the case that you literally need
one filedescriptor _per message_.  not joking about, or anything, but
it may end up being the case that a queue of "struct fd*" pointers is
required (in wineserver).

 in non-message-mode, that would simply be "queue length of 1".  in
message-mode, you'd go "oh, dear, we got an EPIPE error when trying to
read, that means that someone else got that message, wow big deal,
let's grab another unix filedescriptor from wineserver and try again".
 if grabbing another filedescriptor fails, THEN you go "oh, whoops,
let's return STATUS_PIPE_DISCONNECTED".

by having one filedescriptor per message, you are cast-iron
guaranteeing 100% that individual messages will not interfere with
each other.
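
a very rough, single-threaded sketch of the one-filedescriptor-per-message idea, under several assumptions: struct msg_queue, queue_message() and read_next_message() are hypothetical names, the queue lives in one process rather than in wineserver, messages are small enough to fit in the socket buffer, and closing the write end is used to mark the end of a message.  in this sketch an fd with nothing left on it shows up as read() returning 0 rather than as an error, and the -1 return is where the caller would map to STATUS_PIPE_DISCONNECTED.

```c
#include <assert.h>
#include <sys/socket.h>
#include <unistd.h>

#define MAX_QUEUED 8

/* hypothetical stand-in for the queue of per-message fds */
struct msg_queue {
    int fds[MAX_QUEUED];
    int head;   /* index of the next fd to read from */
    int tail;   /* index of the next free slot */
};

/* writer: one fresh socketpair per message; returns 0 or -1 */
int queue_message(struct msg_queue *q, const void *data, size_t len)
{
    int sv[2];

    if (q->tail == MAX_QUEUED) return -1;
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0) return -1;
    if (write(sv[1], data, len) != (ssize_t)len) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }
    close(sv[1]);               /* EOF now marks the end of this message */
    q->fds[q->tail++] = sv[0];
    return 0;
}

/* reader: take fds off the queue until one yields data.
 * returns the message length, or -1 when no fd is left. */
ssize_t read_next_message(struct msg_queue *q, void *buf, size_t bufsize)
{
    while (q->head < q->tail) {
        int fd = q->fds[q->head++];
        ssize_t n = read(fd, buf, bufsize);

        close(fd);
        if (n > 0) return n;
        /* nothing (left) on this fd: grab the next one and try again */
    }
    return -1;
}
```

two messages queued this way come back one per call, with no way for the bytes of one to leak into the other.  a real version would need locking around the queue and a way to read a message in several pieces.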

and, the neat thing is, you wouldn't need any "buffering".  you'd
still need to indicate the length (and the easiest way to do that is
_still_ to send it as the first 4 bytes).
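
for reference, a minimal sketch of the "length as the first 4 bytes" framing over a stream socket.  send_message() and recv_message() are made-up names, and the length is sent in host byte order on the assumption that both ends are on the same machine.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* write exactly len bytes, retrying on short writes; returns 0 or -1 */
int write_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;

    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR) continue;
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* read exactly len bytes; returns 0, or -1 on error or early EOF */
int read_all(int fd, void *buf, size_t len)
{
    char *p = buf;

    while (len > 0) {
        ssize_t n = read(fd, p, len);
        if (n < 0) {
            if (errno == EINTR) continue;
            return -1;
        }
        if (n == 0) return -1;
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* send a 4-byte length followed by the data; returns 0 or -1 */
int send_message(int fd, const void *data, uint32_t len)
{
    if (write_all(fd, &len, sizeof(len)) < 0) return -1;
    return write_all(fd, data, len);
}

/* read one whole message; returns its length, or -1 on error or if
 * the message does not fit in buf */
ssize_t recv_message(int fd, void *buf, size_t bufsize)
{
    uint32_t len;

    if (read_all(fd, &len, sizeof(len)) < 0) return -1;
    if (len > bufsize) return -1;
    if (read_all(fd, buf, len) < 0) return -1;
    return (ssize_t)len;
}
```

the header and the body go out as separate writes here, so with several writers on one descriptor they could interleave; the sketch only holds when each descriptor has a single writer at a time.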

> which is why Alexandre's stated preference has
> been to push it into the Linux kernel.

 well, here's the thing: you can't _guarantee_ that wine will be
*exclusively* running on the latest-and-greatest version of the linux
kernel, and, also, it would be a bit unfair to the FreeBSD folks (and
anyone else who would like to port wine to other OSes)

so, this still has to be done in userspace, with an optimisation being
"use a kernelspace implementation, if it exists".

plus, i think it would also help the reactos guys out because they're
in the middle - they will still need something that doesn't rely on a
linux kernel that they can't have.

> For what it's worth, Steve French has expressed interest in doing just
> that.

 that's veeery good news.  i'll look forward to seeing that implemented.

> He asked for a clear spec for what filename these sockets
> should have, but we didn't have a clear app that we needed to fix, so
> we never followed through.  You might approach him again.  He's the
> maintainer of the CIFS kernel module for Linux, and he already has his
> own named pipe implementation.

 _cool_.

  well, here's where i should explain what my goal is.  my goal is to
"get things started".  these days i tend to get involved in free
software projects at "critical juncture" points, where there is
clearly cross-project non-communication and/or non-cooperation
(accidental or otherwise), and where it's _really_ important that the
stuff actually gets done, but is sufficiently complex and
misunderstood that nobody really wants to tackle it.

 message-mode namedpipes falls into that category, which is why i'm doing it.

 so - once i have _a_ working implementation, then please do not be
offended, or surprised, if i decline when asked to make further
enhancements or to fit specific criteria (such as doing things a
"slightly different way").  [advance notice: any such requests should
also be accompanied by an offer of financial or other compensation.]
but - of course, if anyone finds that the "working implementation"
_isn't_ working, that's a _completely_ different matter and i will
immediately act to remedy that (if no one else does).

also from that strategic perspective, much as i would love to
collaborate with steven on a kernel-level implementation of named
pipes, i believe that it's extraneous: it's an optimisation.  and if i
were to work on that optimisation only, as the _only_ option, it would
lock out other possibilities.

so, on balance, i'll not be contacting steven right now.  that, and
the fact that the samba team decided to block all communication on
16th december 2005, and have neither revoked it nor issued a public
apology for doing so, means that i cannot contact him _anyway_.

l.