[Bug 45546] Magic The Gathering Arena updater: Unity fork of Mono-runtime reports 'Fatal error in gc, GetThreadContext failed' ( suspension of Mono-managed threads sometimes exceed 100ms)

wine-bugs at winehq.org wine-bugs at winehq.org
Fri Jan 18 12:40:08 CST 2019


https://bugs.winehq.org/show_bug.cgi?id=45546

--- Comment #20 from Ian Sabourin <sslasher0 at gmail.com> ---
A discussion took place on IRC yesterday, which I'll do my best to summarize
here. The main participants were Zebediah Figura, 'ken', and myself, with some
input from julliard and a few comments from others. Apologies if I forgot
anything.

Thread A requests the context of thread B, by calling get_thread_context(). The
Microsoft API stipulates that prior to this, B must be suspended by the client
program.

Regardless of whether this actually happened, it seems that the wine server is
unable to produce the thread context until B is suspended, since it is in fact
B that writes its own context as part of cooperatively suspending itself. To
further complicate things, it seems that the wine server has no means of
forcibly suspending a thread. These points are ken's; I don't know enough to
confirm or dispute them.

If that's the case, the wine server has no choice but to request B's
suspension, and hope B eventually cooperates. In the meantime, all the server
can do is return 'PENDING'.
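
To make the cooperative scheme above concrete, here is a toy model in Python.
It is only a sketch of the behaviour as described in this comment, not Wine's
server code: ToyServer, thread_checks_in() and the register dictionary are all
hypothetical names invented for illustration.

```python
import enum


class Status(enum.Enum):
    PENDING = "PENDING"
    SUCCESS = "SUCCESS"


class ToyServer:
    """Hypothetical stand-in for the wine server (illustration only)."""

    def __init__(self):
        self.suspend_requested = set()  # threads that were asked to suspend
        self.saved_context = {}         # thread id -> context written by that thread

    def get_thread_context(self, tid):
        # The server cannot read B's registers itself. If B has not yet
        # written its context, all it can do is ask and report PENDING.
        if tid in self.saved_context:
            return Status.SUCCESS, self.saved_context[tid]
        self.suspend_requested.add(tid)
        return Status.PENDING, None

    def thread_checks_in(self, tid, registers):
        # Called by thread B itself: if a suspension was requested, B saves
        # its own context as part of cooperatively suspending.
        if tid in self.suspend_requested:
            self.saved_context[tid] = dict(registers)
            self.suspend_requested.discard(tid)
            return True
        return False
```

In this model, every call to get_thread_context() keeps returning PENDING
until B itself calls thread_checks_in(); nothing the server does can force
that to happen.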

Correspondingly, the DLL code (in thread A) periodically re-sends the request
to the server. The question now is, what if B never suspends? Maybe it stopped
executing, or maybe another thread C resumed it. These would be examples of
incorrect client programs, and we could simply say that a client program which
does this is allowed to hang. But the argument was: what if A is a debugger?
Then we don't want it to hang forever in get_thread_context(), just because B
doesn't suspend as asked.

Zebediah put forth the (good) idea of having the server signal thread A,
instead of having A poll the server. But ultimately this doesn't make it any
more certain that B will eventually suspend.
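
As a rough illustration of why signalling does not remove the need for a
timeout, here is a small Python sketch. It uses threading.Event as a stand-in
for whatever mechanism the server would use to wake thread A; the function
names and the delays are assumptions of mine, not anything from Wine.

```python
import threading


def wait_for_suspend(suspended, timeout_s):
    """Thread A blocks until signalled, or gives up after timeout_s seconds.

    Returns True if the signal arrived, False if the wait timed out.
    """
    return suspended.wait(timeout=timeout_s)


def demo(b_delay_s, timeout_s):
    """Simulate B suspending after b_delay_s seconds (None means never)."""
    suspended = threading.Event()
    if b_delay_s is not None:
        # Hypothetical stand-in for "B suspended, server signals A".
        timer = threading.Timer(b_delay_s, suspended.set)
        timer.daemon = True
        timer.start()
    return wait_for_suspend(suspended, timeout_s)
```

Here demo(0.01, 1.0) returns True as soon as the signal arrives, without any
polling, while demo(None, 0.05) returns False: if B never suspends, A is only
saved by the timeout, exactly as with polling.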

As a result, if the server has no means of forcibly suspending a thread, and if
we also want native debuggers not to hang in this scenario, there must be a
timeout in get_thread_context().

The current problem is that the timeout occurs, but really thread B was just
'legitimately' taking a long time to suspend (quoted because a slow suspend
probably indicates some problem in the client code, but that's in a sense
irrelevant here). As a practical solution, julliard suggested "an exponential
backoff over a few seconds". I took that to mean that the polling of the server
starts out quick, and then slows down, to limit server contention when threads
are slow to suspend. Nevertheless, there would remain the question of what
timeout to select, which seems very arbitrary.
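
For reference, this is how I understand the "exponential backoff" idea,
sketched in Python rather than in the actual DLL code. The function name and
all the numbers (2 second timeout, 1 ms first delay, 100 ms cap) are
placeholders I picked for illustration; the real values are precisely what
remains to be decided.

```python
import time


def poll_with_backoff(poll, timeout_s=2.0, first_delay_s=0.001,
                      max_delay_s=0.1, sleep=time.sleep,
                      clock=time.monotonic):
    """Call poll() until it returns something other than None.

    The delay between attempts doubles each time, up to max_delay_s, so a
    slow-to-suspend thread does not cause constant server traffic.
    Returns the poll result, or None once timeout_s has elapsed.
    """
    deadline = clock() + timeout_s
    delay = first_delay_s
    while True:
        result = poll()
        if result is not None:
            return result
        remaining = deadline - clock()
        if remaining <= 0:
            return None  # B never suspended in time: give up
        # Never sleep past the deadline.
        sleep(min(delay, remaining))
        delay = min(delay * 2, max_delay_s)
```

With these placeholder values the first retries happen after 1, 2, 4, 8 ms
and so on, so a thread that suspends quickly is still picked up quickly, and
a thread that needs a few hundred milliseconds no longer triggers the failure.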

Before settling on that (a longer timeout), I'd like to ask two things:
1. does the server have absolutely no way of forcibly suspending a thread, and
then returning a context for it, even if this context is 'invalid'? Why not,
exactly? This could open up different solutions;
2. is it absolutely required that a native Windows debugger not hang in this
degenerate scenario, when running on wine? Could there not be a custom debugger
that targets the wine environment? It could speak directly to the wine server,
instead of going through the MS API, which in this case already does not match
the reality of the wine environment.

