More DCOM notes

Sat Jan 22 14:37:21 CST 2005

Here are some more notes in a convenient-to-read form. I'll turn these
into a documentation patch later. Rob, can you double-check this?

This document assumes you are familiar with the basics of DCOM. If you
aren't, read this first:

http://winehq.com/site/docs/wine-devel/dcom-1

This is not suitable study material for beginners. Don't say I didn't
warn you.

* Apartments

Before a thread can use COM it must enter an apartment. Apartments are
an abstraction of a COM object's thread safety level. There are many
types of apartment but the only two we care about right now are single
threaded apartments (STAs) and the multi-threaded apartment (MTA).

Any given process may contain at most one MTA and potentially many STAs.
You enter an apartment by calling CoInitializeEx and passing the desired
thread model in as a parameter. The default if you use the deprecated
CoInitialize is a STA, and this is the most common type of apartment
used in COM.

An object in the multi-threaded apartment may be accessed concurrently
by multiple threads: i.e., it's supposed to be entirely thread safe. It
must also not care about thread affinity: the object should react the
same way no matter which thread is calling it.

An object inside a STA does not have to be thread safe, and all calls
upon it should come from the same thread - the thread that entered the
apartment in the first place.
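
As a rough illustration of the rules above (one MTA per process, one STA
per entering thread), here is a toy Python model. It is only a sketch:
the ComProcess class and its co_initialize_ex method are hypothetical
stand-ins, not the real COM data structures, and it does not model what
happens when a thread tries to change its threading model.

```python
import threading


class ComProcess:
    """Toy model of per-process apartment bookkeeping (hypothetical)."""

    def __init__(self):
        self.mta = None      # at most one MTA per process
        self.stas = {}       # thread id -> STA, one per entering thread
        self._lock = threading.Lock()

    def co_initialize_ex(self, thread_id, model):
        """Stand-in for CoInitializeEx: returns the apartment entered."""
        with self._lock:
            if model == "MTA":
                # Every thread asking for the MTA shares the same apartment.
                if self.mta is None:
                    self.mta = {"type": "MTA", "threads": set()}
                self.mta["threads"].add(thread_id)
                return self.mta
            if model == "STA":
                # Each thread asking for a STA gets an apartment of its own.
                return self.stas.setdefault(
                    thread_id, {"type": "STA", "owner": thread_id})
            raise ValueError("unknown threading model: %r" % (model,))


proc = ComProcess()
mta_a = proc.co_initialize_ex(1, "MTA")
mta_b = proc.co_initialize_ex(2, "MTA")
sta_c = proc.co_initialize_ex(3, "STA")
sta_d = proc.co_initialize_ex(4, "STA")
```

Threads 1 and 2 end up in the same apartment object, while threads 3 and
4 each own a distinct one.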

The apartment system was originally designed to deal with the disparity
between the Windows NT/C++ world in which threading was given a strong
emphasis, and the Visual Basic world in which threading was barely
supported and even if it had been fully supported most developers would
not have used it. Visual Basic code is not truly multi-threaded. Instead,
if you start a new thread you get an entirely new VM, with a separate set
of global variables. Changes made in one thread are NOT reflected in
another, which pretty much violates the expected semantics of
multi-threading entirely, but this is Visual Basic, so what did you
expect? If you access a VB object concurrently from multiple threads,
behind the scenes each VM runs in a STA and the calls are marshaled
between the threads using DCOM.

In the Windows 2000 release of COM, several new types of apartment were
added, the most important of which is the RTA (rental threaded
apartment), in which concurrent accesses are serialised by COM using an
apartment-wide lock but thread affinity is not guaranteed.

* Structure of a marshaled interface pointer

When an interface is marshaled using CoMarshalInterface, the result is a
serialized OBJREF structure. An OBJREF actually contains a union, but
we'll be assuming the variant that embeds a STDOBJREF here, which is
what's used by the system-provided standard marshaling. The OBJREF
starts with the magic signature 'MEOW', then some flags, then the IID of
the marshaled interface. Quite what MEOW stands for is a mystery, but
it's definitely not "Microsoft Extended Object Wire". Next comes the
STDOBJREF (standard object reference) itself, which begins with its own
flags, identified by their SORF_ prefix. Most of these are reserved, and
their purpose (if any) is unknown, but a few are defined.

After the SORF flags comes a count of the references represented by this
marshaled interface. Typically this will be 1 in the case of a normal
marshal, but may be 5 for table-strong marshals and 0 for table-weak
marshals (the difference between these is explained below).

The most interesting part of a STDOBJREF is the OXID, OID, IPID triple.
This triple identifies any given marshaled interface pointer in the
network. OXIDs are apartment identifiers, and are supposed to be unique
network-wide. How this is guaranteed is currently unknown: the original
algorithm Windows used was something like the current UNIX time and a
local counter. 

OXIDs are generated and registered with the OXID resolver by performing
local RPCs to the RPC subsystem (rpcss.exe). In a fully security-patched
Windows system they appear to be randomly generated. This registration
is done using the ILocalOxidResolver interface, however the exact
structure of this interface is currently unknown.

OIDs are object identifiers, and identify a stub manager. The stub
manager manages interface stubs. For each exported COM object there are
multiple interfaces and therefore multiple interface stubs
(IRpcStubBuffer implementations). OIDs are apartment scoped. Each ifstub
is identified by an IPID, which identifies a marshaled interface
pointer. IPIDs are apartment scoped.
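
The fixed-size part of a standard OBJREF described above can be sketched
with Python's struct module. This assumes the little-endian layout
signature, OBJREF flags, IID, SORF flags, reference count, OXID, OID,
IPID, with 64-bit OXID and OID fields. The function names are
hypothetical, and the DUALSTRINGARRAY that follows on the wire is not
handled here.

```python
import struct
import uuid

OBJREF_SIGNATURE = b"MEOW"   # 0x574f454d when read as a little-endian ULONG
OBJREF_STANDARD = 0x1        # OBJREF flag: the union holds a STDOBJREF

# signature, OBJREF flags, IID, then the STDOBJREF fields:
# SORF flags, public reference count, OXID, OID, IPID
_LAYOUT = struct.Struct("<4sI16sIIQQ16s")


def pack_std_objref(iid, sorf_flags, public_refs, oxid, oid, ipid):
    """Serialize the header and STDOBJREF (hypothetical helper)."""
    return _LAYOUT.pack(OBJREF_SIGNATURE, OBJREF_STANDARD, iid.bytes_le,
                        sorf_flags, public_refs, oxid, oid, ipid.bytes_le)


def unpack_std_objref(data):
    """Parse the header and STDOBJREF, checking signature and variant."""
    sig, flags, iid, sorf, refs, oxid, oid, ipid = _LAYOUT.unpack_from(data)
    if sig != OBJREF_SIGNATURE:
        raise ValueError("not an OBJREF: bad signature")
    if flags != OBJREF_STANDARD:
        raise ValueError("not a standard-marshaled OBJREF")
    return {
        "iid": uuid.UUID(bytes_le=iid),
        "sorf_flags": sorf,
        "public_refs": refs,
        "oxid": oxid,
        "oid": oid,
        "ipid": uuid.UUID(bytes_le=ipid),
    }


IID_IUnknown = uuid.UUID("00000000-0000-0000-C000-000000000046")
example_ipid = uuid.UUID("12345678-1234-5678-1234-567812345678")  # made up
blob = pack_std_objref(IID_IUnknown, 0, 1, 0x1122334455667788, 42,
                       example_ipid)
parsed = unpack_std_objref(blob)
```

The OXID, OID and IPID values in the example are invented purely to
exercise the round trip.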

Unmarshaling one of these streams therefore means setting up a
connection to the object exporter (the apartment holding the marshaled
interface pointer) and being able to send RPCs to the right ifstub. Each
apartment has its own RPC endpoint and calls can be routed to the
correct interface pointer by embedding the IPID into the call using
RpcBindingSetObject. IRemUnknown, discussed below, uses a reserved IPID.

Both standard and handler marshaled OBJREFs contain an OXID resolver
endpoint, which is an RPC string binding in a DUALSTRINGARRAY. This is
necessary because an OXID alone is not enough to contact the host, as it
doesn't contain any network address data. Instead, the combination of
the remote OXID resolver RPC endpoint and the OXID itself is passed to
the local OXID resolver. It then returns the apartment string binding.

This step is an optimisation: technically the OBJREF itself could
contain the string binding of the apartment endpoint and the OXID
resolver could be bypassed, but by using this DCOM can optimise out a
server round-trip by having the local OXID resolver cache the query
results. The OXID resolver is a service in the RPC subsystem (rpcss.exe)
which implements a raw (non object-oriented) RPC interface called
IOXIDResolver. Despite the identical naming convention this is not a COM
interface.

Unmarshaling an interface pointer stream therefore consists of reading
the OXID, OID and IPID from the STDOBJREF, then reading one or more RPC
string bindings for the remote OXID resolver. Then
RpcBindingFromStringBinding is used to convert this remote string
binding into an RPC binding handle which can be passed to the local
IOXIDResolver::ResolveOxid implementation along with the OXID. The local
OXID resolver consults its list of same-machine OXIDs, then its cache of
remote OXIDs, and if not found does an RPC to the remote OXID resolver
using the binding handle passed in earlier. The result of the query is
stored for future reference in the cache, and finally the unmarshaling
application gets back the apartment string binding, the IPID of that
apartment's IRemUnknown implementation, and a security hint (let's
ignore this for now).
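
The lookup order described above (same-machine OXIDs, then the cache,
then a remote query) might be modeled like this. It is a sketch only:
the OxidResolver class, the remote_query callable that stands in for the
RPC to the remote resolver, and the binding strings are all made up for
illustration.

```python
class OxidResolver:
    """Toy model of the local OXID resolver's lookup order (hypothetical)."""

    def __init__(self, local_apartments, remote_query):
        self.local = dict(local_apartments)  # OXID -> info, same machine
        self.cache = {}                      # OXID -> info, learned remotely
        self.remote_query = remote_query     # stands in for the remote RPC

    def resolve_oxid(self, resolver_binding, oxid):
        # 1. Apartments on this machine need no network traffic at all.
        if oxid in self.local:
            return self.local[oxid]
        # 2. Only ask the remote resolver if the answer is not cached yet.
        if oxid not in self.cache:
            self.cache[oxid] = self.remote_query(resolver_binding, oxid)
        return self.cache[oxid]


remote_calls = []


def fake_remote_query(resolver_binding, oxid):
    """Pretend RPC: records the call and returns invented binding data."""
    remote_calls.append((resolver_binding, oxid))
    return {"binding": "ncacn_ip_tcp:remotehost[4000]",
            "ipid_rem_unknown": "ipid-of-IRemUnknown"}


local_info = {"binding": "ncalrpc:[local-apartment]",
              "ipid_rem_unknown": "local-ipid"}
resolver = OxidResolver({0x1: local_info}, fake_remote_query)

first = resolver.resolve_oxid("ncacn_ip_tcp:remotehost[135]", 0x2)
second = resolver.resolve_oxid("ncacn_ip_tcp:remotehost[135]", 0x2)
local = resolver.resolve_oxid("ncacn_ip_tcp:remotehost[135]", 0x1)
```

Resolving the same remote OXID twice costs one round-trip, which is the
optimisation the OXID resolver exists to provide.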

Once the remote apartment's string binding has been located, the
unmarshaling process constructs an RPC Channel Buffer implementation
with the connection handle and the IPID of the needed interface, loads
and constructs the IRpcProxyBuffer implementation for that IID and
connects it to the channel. Finally the proxy is passed back to the
application.

* Handling IUnknown

There are some subtleties here with respect to IUnknown. IUnknown itself
is never marshaled directly: instead a version of it optimised for
network usage is used. IRemUnknown is similar in concept to IUnknown
except that it allows you to add and release arbitrary numbers of
references at once, and it also allows you to query for multiple
interfaces at once.

IRemUnknown is used for lifecycle management, and for marshaling new
interfaces on an object back to the client. Its definition can be seen
in dcom.idl - basically the IRemUnknown::RemQueryInterface method takes
an IPID and a list of IIDs, then returns STDOBJREFs of each new
marshaled interface pointer.

There is one IRemUnknown implementation per apartment, not per stub
manager as you might expect. This is OK because IPIDs are
apartment-scoped, not object-scoped.
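
To see why a single per-apartment IRemUnknown is enough, consider this
toy model, in which one apartment-wide table maps every IPID back to its
object. It is a simplified sketch: the Apartment class, the string IIDs
and the "ipid-N" naming are invented, and the real RemQueryInterface
also takes a reference count, which is omitted here.

```python
import itertools


class Apartment:
    """Toy apartment with one IPID table shared by all objects."""

    def __init__(self, oxid):
        self.oxid = oxid
        self.objects = {}   # OID -> set of IIDs the object implements
        self.ifstubs = {}   # IPID -> (OID, IID), apartment-wide
        self._next_ipid = itertools.count(1)

    def _marshal(self, oid, iid):
        # Create a new ifstub and describe it like a STDOBJREF would.
        ipid = "ipid-%d" % next(self._next_ipid)
        self.ifstubs[ipid] = (oid, iid)
        return {"oxid": self.oxid, "oid": oid, "ipid": ipid,
                "iid": iid, "public_refs": 1}

    def export(self, oid, supported_iids, iid):
        """Register an object and marshal its first interface."""
        self.objects[oid] = set(supported_iids)
        return self._marshal(oid, iid)

    def rem_query_interface(self, ipid, iids):
        """One result per requested IID, found via the IPID alone."""
        oid, _ = self.ifstubs[ipid]
        results = []
        for iid in iids:
            if iid in self.objects[oid]:
                results.append(("S_OK", self._marshal(oid, iid)))
            else:
                results.append(("E_NOINTERFACE", None))
        return results


apt = Apartment(oxid=0xA)
ref = apt.export(oid=1, supported_iids={"IFoo", "IBar"}, iid="IFoo")
answers = apt.rem_query_interface(ref["ipid"], ["IBar", "IBaz"])
```

The caller never names the object: the IPID it already holds is enough
for the apartment to find the right one.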

* Table marshaling

Normally once you have unmarshaled a marshaled interface pointer that
stream is dead: you can't unmarshal it again. Sometimes this isn't what
you want. In this case, table marshaling can be used. There are two
types: strong and weak. In table-strong marshaling, selected by passing
the MSHLFLAGS_TABLESTRONG flag to CoMarshalInterface, a stream can be
unmarshaled as many times as you like. Even if all the proxies are
released, the marshaled object reference is still valid. Effectively the
stream itself holds a ref on the object. To release the object entirely
so its server can shut down, you must use CoReleaseMarshalData on the
stream.

In table-weak marshaling the stream can be unmarshaled many times,
however the stream does not hold a ref. If you unmarshal the stream
twice, once those two proxies have been released the remote object will
also be released. Attempting to unmarshal the stream at this point will
yield CO_E_DISCONNECTED.
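
The difference between the three marshaling modes comes down to who
holds references, which a small reference-counting model can show. This
is a sketch under simplifying assumptions: the StubManager, Proxy and
MarshaledStream classes are hypothetical, each proxy holds exactly one
reference, and errors are plain Python exceptions rather than HRESULTs.

```python
class StubManager:
    """Counts external references; disconnects when the last one goes."""

    def __init__(self):
        self.ext_refs = 0
        self.disconnected = False

    def add_ref(self):
        self.ext_refs += 1

    def release(self):
        self.ext_refs -= 1
        if self.ext_refs == 0:
            self.disconnected = True


class Proxy:
    def __init__(self, stub):
        self.stub = stub

    def release(self):
        self.stub.release()


class MarshaledStream:
    def __init__(self, stub, mode):
        self.stub, self.mode, self.dead = stub, mode, False
        if mode != "table-weak":
            stub.add_ref()       # normal and table-strong streams hold a ref

    def unmarshal(self):
        if self.dead:
            raise RuntimeError("stream already consumed")
        if self.stub.disconnected:
            raise RuntimeError("CO_E_DISCONNECTED")
        if self.mode == "normal":
            self.dead = True     # the proxy takes over the stream's ref
        else:
            self.stub.add_ref()  # table marshals hand out a fresh ref
        return Proxy(self.stub)

    def release_marshal_data(self):
        """Stand-in for CoReleaseMarshalData on this stream."""
        if not self.dead and self.mode != "table-weak":
            self.stub.release()
        self.dead = True


def release_two_proxies(mode):
    """Unmarshal twice, release both proxies, return the stub and stream."""
    stub = StubManager()
    stream = MarshaledStream(stub, mode)
    for proxy in (stream.unmarshal(), stream.unmarshal()):
        proxy.release()
    return stub, stream


strong_stub, strong_stream = release_two_proxies("table-strong")
weak_stub, weak_stream = release_two_proxies("table-weak")
```

After both proxies are gone, the table-strong object is still connected
until release_marshal_data is called, while the table-weak one is not.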

* RPC dispatch

Exactly how RPC dispatch occurs depends on whether the exported object
is in a STA or the MTA. If it's in the MTA then all is simple: the RPC
dispatch thread can temporarily enter the MTA, perform the remote call,
and then leave it again. If it's in a STA things get more complex,
because of the requirement that only one thread can ever access the
object. 

Instead, when entering a STA a hidden window is created implicitly by
COM, and the user must manually pump the message loop in order to
service incoming RPCs. The RPC dispatch thread performs the context
switch into the STA by sending a message to the apartment's window,
which then proceeds to invoke the remote call in the right thread.
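
The context switch into a STA can be imitated with an ordinary queue
standing in for the hidden window's message queue. This is a sketch
only: StaApartment, send_call and pump are hypothetical names, and a
real message loop would of course not know in advance how many messages
to expect.

```python
import queue
import threading


class StaApartment:
    """Toy STA: calls are queued and run by the owning thread only."""

    def __init__(self):
        self.messages = queue.Queue()   # stands in for the hidden window
        self.owner = threading.get_ident()

    def send_call(self, func, *args):
        """Called from an RPC dispatch thread; blocks until the call ran."""
        done = threading.Event()
        box = {}
        self.messages.put((func, args, box, done))
        done.wait()
        return box["result"]

    def pump(self, count):
        """Run by the apartment's own thread, like a message loop."""
        for _ in range(count):
            func, args, box, done = self.messages.get()
            box["result"] = func(*args)
            done.set()


sta = StaApartment()
seen_threads = []


def method(x):
    # Record which thread actually executes the object's method.
    seen_threads.append(threading.get_ident())
    return x * 2


results = []
workers = [
    threading.Thread(target=lambda i=i: results.append(sta.send_call(method, i)))
    for i in range(3)
]
for worker in workers:
    worker.start()
sta.pump(3)   # without this, every dispatch thread would block forever
for worker in workers:
    worker.join()
```

Although three different threads issue the calls, every invocation of
method runs on the thread that created the apartment.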

RPC dispatch threads are pooled by the RPC runtime. When an incoming RPC
needs to be serviced, a thread is pulled from the pool and invokes the
call. The main RPC thread then goes back to listening for new calls.
Objects in the MTA are therefore quite likely to be servicing more than
one call at once.

* Message filtering and re-entrancy

When an outgoing call is made from a STA, it's possible that the remote
server will re-enter the client, for instance to perform a callback.
Because of this potential re-entrancy, when waiting for the reply to an
RPC made inside a STA, COM will pump the message loop. That's because
the incoming callback will be received by a thread from the RPC dispatch
pool and handed to the STA as a window message, so the STA thread must
keep processing messages even while it is blocked on the outgoing call.

While COM is pumping the message loop, all incoming messages from the
operating system are filtered through one or more message filters. These
filters are themselves COM objects which can choose to discard, hold or
forward window messages. The default message filter drops all input
messages and forwards the rest. This is so that if the user chooses a
menu option which triggers an RPC, they then cannot choose that menu
option *again* and restart the function from the beginning. That type of
unexpected re-entrancy is extremely difficult to debug, so it's
disallowed.
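
The behaviour of the default filter can be summarised in a few lines.
This is only a sketch: exactly which messages count as "input" is an
assumption here (a handful of keyboard and mouse messages), and the
function name is made up.

```python
# Assumed set of "input" messages, for illustration only.
INPUT_MESSAGES = {
    "WM_KEYDOWN", "WM_KEYUP", "WM_CHAR",
    "WM_LBUTTONDOWN", "WM_LBUTTONUP", "WM_MOUSEMOVE",
}


def default_message_filter(pending):
    """Drop input messages, forward everything else, keeping the order."""
    return [msg for msg in pending if msg not in INPUT_MESSAGES]


pending = ["WM_PAINT", "WM_LBUTTONDOWN", "WM_TIMER", "WM_KEYDOWN"]
forwarded = default_message_filter(pending)
```

The click and the key press that arrived during the outgoing call are
discarded, but the paint and timer messages still get through.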

Unfortunately other window messages are allowed through, meaning that
it's possible your UI will be required to repaint itself during an
outgoing RPC. This makes programming with STAs more complex than it may
appear, as you must be prepared to run all kinds of code any time an
outgoing call is made. In turn this breaks the idea that COM should
abstract object location from the programmer, because an object that was
originally free-threaded and is then run from a STA could trigger new
and untested codepaths in a program. 

Oh well, it was nice in theory.

thanks -mike