[Bug 45546] Magic The Gathering Arena updater: Unity fork of Mono-runtime reports 'Fatal error in gc, GetThreadContext failed' ( suspension of Mono-managed threads sometimes exceed 100ms)

wine-bugs at winehq.org wine-bugs at winehq.org
Mon Oct 1 04:58:55 CDT 2018


https://bugs.winehq.org/show_bug.cgi?id=45546

Anastasius Focht <focht at gmx.net> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|-unknown                    |ntdll
             Status|UNCONFIRMED                 |NEW
            Summary|Magic The Gathering Arena   |Magic The Gathering Arena
                   |updater: Unity fork of      |updater: Unity fork of
                   |Mono-runtime reports 'Fatal |Mono-runtime reports 'Fatal
                   |error in gc,                |error in gc,
                   |GetThreadContext failed'    |GetThreadContext failed'
                   |                            |(suspension of Mono-managed
                   |                            |threads sometimes exceed
                   |                            |100ms)
     Ever confirmed|0                           |1

--- Comment #8 from Anastasius Focht <focht at gmx.net> ---
Hello folks,

confirming, although very hard to reproduce on my machine.
While trying to reproduce I downloaded around 200 GB (2 GB each try) in the
process .. hope they don't ban me ^^

Installer: https://mtgarena.downloads.wizards.com/Live/Windows32/updates.txt

--- snip ---
[Latest]
Name = MTGArena
ReleaseDate = 09\25\2018
ProductVersion = 0.1.845.0
FilePath = [APPDIR]MTGALauncher.exe
MSIURL =
https://mtgarena.downloads.wizards.com/Installers/Windows32/0.1.845.0/Live/MTGAInstaller.msi
URL = https://mtgarena.downloads.wizards.com/Live/Windows32/MTGAInstaller.exe
Version = 0.1.845.0
BinaryUpdater = false
Localized = false
ProductCode = {3F0CBA48-C470-429F-BDF3-5FEC0940B4EA}
--- snip ---

It seems they now use a newer, partially updated the Mono runtime (bdwgc) for
Unity.

Protection ID scan:

--- snip ---
-=[ ProtectionID v0.6.9.0 DECEMBER]=-
(c) 2003-2017 CDKiLLER & TippeX
Build 24/12/17-21:05:42
Ready...
Scanning -> C:\Program Files (x86)\Wizards of the
Coast\MTGA\MonoBleedingEdge\EmbedRuntime\mono-2.0-bdwgc.dll
File Type : 32-Bit Dll (Subsystem : Win GUI / 2), Size : 3886528 (03B4DC0h)
Byte(s) | Machine: 0x14C (I386)
Compilation TimeStamp : 0x5AF21305 -> Tue 08th May 2018 21:13:41 (GMT)
[TimeStamp] 0x5AF21305 -> Tue 08th May 2018 21:13:41 (GMT) | PE Header | - |
Offset: 0x00000120 | VA: 0x10000120 | -
[TimeStamp] 0x5AF21305 -> Tue 08th May 2018 21:13:41 (GMT) | Export | - |
Offset: 0x0038A1D4 | VA: 0x1038B9D4 | -
[TimeStamp] 0x5AF21305 -> Tue 08th May 2018 21:13:41 (GMT) | DebugDirectory | -
| Offset: 0x00388D44 | VA: 0x1038A544 | -
[TimeStamp] 0x5AF21305 -> Tue 08th May 2018 21:13:41 (GMT) | DebugDirectory | -
| Offset: 0x00388D60 | VA: 0x1038A560 | -
[TimeStamp] 0x5AF21305 -> Tue 08th May 2018 21:13:41 (GMT) | DebugDirectory | -
| Offset: 0x00388D7C | VA: 0x1038A57C | -
-> File Appears to be Digitally Signed @ Offset 03B3600h, size : 017C0h / 06080
byte(s)
[LoadConfig] Struct determined as v8 (Expected size 140 | Actual size 64)
[!] Executable uses SEH Tables (/SAFESEH) (3 calculated 3 recorded... 0 invalid
addresses) 
[LoadConfig] CodeIntegrity -> Flags 0x0 | Catalog 0x0 (0) | Catalog Offset 0x0
| Reserved 0x0
[LoadConfig] GuardAddressTakenIatEntryTable 0x1039A750 | Count 0x1038A608
(272147976)
[LoadConfig] GuardLongJumpTargetTable 0x0 | Count 0x0 (0)
[LoadConfig] HybridMetadataPointer 0x1 | DynamicValueRelocTable 0x1038A618
[LoadConfig] FailFastIndirectProc 0x1038A620 | FailFastPointer 0x0
[LoadConfig] UnknownZero1 0x1039A750
[File Heuristics] -> Flag #1 : 00000100000001001101000100000100 (0x0404D104)
[Entrypoint Section Entropy] : 6.60 (section #0) ".text   " | Size : 0x2C73E6
(2913254) byte(s)
[DllCharacteristics] -> Flag : (0x0140) -> ASLR | DEP
[SectionCount] 6 (0x6) | ImageSize 0x425000 (4345856) byte(s)
[Export] 99% of function(s) (1081 of 1084) are in file | 0 are forwarded | 1078
code | 6 data | 0 uninit data | 0 unknown | 
[ModuleReport] [IAT] Modules -> MSWSOCK.dll | WS2_32.dll | ole32.dll |
OLEAUT32.dll | PSAPI.DLL | VERSION.dll | ADVAPI32.dll | WINMM.dll |
KERNEL32.dll | USER32.dll | SHELL32.dll
[Debug Info] (record 1 of 3) (file offset 0x388D40)
Characteristics : 0x0 | TimeDateStamp : 0x5AF21305 (Tue 08th May 2018 21:13:41
(GMT)) | MajorVer : 0 / MinorVer : 0 -> (0.0)
Type : 2 (0x2) -> CodeView | Size : 0x67 (103) 
AddressOfRawData : 0x38A8C4 | PointerToRawData : 0x3890C4
CvSig : 0x53445352 | SigGuid 8B1A1B99-19B1-402D-AA9C2AD249077FA6
Age : 0x1 (1) | Pdb :
C:\buildslave\mono\build\msvc\build\bdwgc\Win32\bin\Release\mono-2.0-bdwgc.pdb
[Debug Info] (record 2 of 3) (file offset 0x388D5C)
Characteristics : 0x0 | TimeDateStamp : 0x5AF21305 (Tue 08th May 2018 21:13:41
(GMT)) | MajorVer : 0 / MinorVer : 0 -> (0.0)
Type : 12 (0xC) -> Undocumented | Size : 0x14 (20) 
AddressOfRawData : 0x38A92C | PointerToRawData : 0x38912C
[Debug Info] (record 3 of 3) (file offset 0x388D78)
Characteristics : 0x0 | TimeDateStamp : 0x5AF21305 (Tue 08th May 2018 21:13:41
(GMT)) | MajorVer : 0 / MinorVer : 0 -> (0.0)
Type : 13 (0xD) -> Undocumented | Size : 0x320 (800) 
AddressOfRawData : 0x38A940 | PointerToRawData : 0x389140
...
[CompilerDetect] -> Visual C++ 14.0 (Visual Studio 2015)
[!] File appears to have no protection or is using an unknown protection
- Scan Took : 1.806 Second(s) [0000004AAh (1194) tick(s)] [246 of 580 scan(s)
done]

Scanning -> C:\Program Files\Wizards of the Coast\MTGA\UnityPlayer.dll
File Type : 32-Bit Dll (Subsystem : Win GUI / 2), Size : 17423808 (0109DDC0h)
Byte(s) | Machine: 0x14C (I386)
Compilation TimeStamp : 0x5B23C5A7 -> Fri 15th Jun 2018 13:56:55 (GMT)
[TimeStamp] 0x5B23C5A7 -> Fri 15th Jun 2018 13:56:55 (GMT) | PE Header | - |
Offset: 0x00000178 | VA: 0x10000178 | -
[TimeStamp] 0xFFFFFFFF -> Sun 07th Feb 2106 06:28:15 (GMT) | Export | - |
Offset: 0x00FDFC74 | VA: 0x10FE0E74 | -
[TimeStamp] 0x5B23C5A7 -> Fri 15th Jun 2018 13:56:55 (GMT) | DebugDirectory | -
| Offset: 0x00FDA444 | VA: 0x10FDB644 | -
[TimeStamp] 0x5B23C5A7 -> Fri 15th Jun 2018 13:56:55 (GMT) | DebugDirectory | -
| Offset: 0x00FDA460 | VA: 0x10FDB660 | -
[TimeStamp] 0x5B23C5A7 -> Fri 15th Jun 2018 13:56:55 (GMT) | DebugDirectory | -
| Offset: 0x00FDA47C | VA: 0x10FDB67C | -
-> File Appears to be Digitally Signed @ Offset 0109C600h, size : 017C0h /
06080 byte(s)
[LoadConfig] Struct determined as v8 (Expected size 140 | Actual size 64)
[LoadConfig] CodeIntegrity -> Flags 0x0 | Catalog 0x0 (0) | Catalog Offset 0x0
| Reserved 0x0
[LoadConfig] GuardAddressTakenIatEntryTable 0x0 | Count 0x0 (0)
[LoadConfig] GuardLongJumpTargetTable 0x0 | Count 0x0 (0)
[LoadConfig] HybridMetadataPointer 0x0 | DynamicValueRelocTable 0x0
[LoadConfig] FailFastIndirectProc 0x0 | FailFastPointer 0x0
[LoadConfig] UnknownZero1 0x0
[File Heuristics] -> Flag #1 : 00000100000001001101000100000100 (0x0404D104)
[Entrypoint Section Entropy] : 6.64 (section #0) ".text   " | Size : 0xE2D9FC
(14866940) byte(s)
[DllCharacteristics] -> Flag : (0x0140) -> ASLR | DEP
[SectionCount] 7 (0x7) | ImageSize 0x115A000 (18194432) byte(s)
[Export] 100% of function(s) (1 of 1) are in file | 0 are forwarded | 1 code |
0 data | 0 uninit data | 0 unknown | 
[VersionInfo] Product Version : 2018.1.6.5753908
[VersionInfo] File Version : 2018.1.6.5753908
[ModuleReport] [IAT] Modules -> HID.DLL | KERNEL32.dll | USER32.dll |
VERSION.dll | ole32.dll | SHLWAPI.dll | SETUPAPI.dll | ADVAPI32.dll | GDI32.dll
| SHELL32.dll | OPENGL32.dll | WINMM.dll | WS2_32.dll | OLEAUT32.dll |
IMM32.dll | DNSAPI.dll | IPHLPAPI.DLL | WINHTTP.dll
[Debug Info] (record 1 of 3) (file offset 0xFDA440)
Characteristics : 0x0 | TimeDateStamp : 0x5B23C5A7 (Fri 15th Jun 2018 13:56:55
(GMT)) | MajorVer : 0 / MinorVer : 0 -> (0.0)
Type : 2 (0x2) -> CodeView | Size : 0x88 (136) 
AddressOfRawData : 0xFDD0EC | PointerToRawData : 0xFDBEEC
CvSig : 0x53445352 | SigGuid E6BD450A-B375-47BC-A34A5DC610712C8C
Age : 0x1 (1) | Pdb :
C:\buildslave\unity\build\artifacts\StandalonePlayer\win32_release_vs2017_nondev\UnityPlayer_Win32_x86_mono.pdb
[Debug Info] (record 2 of 3) (file offset 0xFDA45C)
Characteristics : 0x0 | TimeDateStamp : 0x5B23C5A7 (Fri 15th Jun 2018 13:56:55
(GMT)) | MajorVer : 0 / MinorVer : 0 -> (0.0)
Type : 12 (0xC) -> Undocumented | Size : 0x14 (20) 
AddressOfRawData : 0xFDD174 | PointerToRawData : 0xFDBF74
[Debug Info] (record 3 of 3) (file offset 0xFDA478)
Characteristics : 0x0 | TimeDateStamp : 0x5B23C5A7 (Fri 15th Jun 2018 13:56:55
(GMT)) | MajorVer : 0 / MinorVer : 0 -> (0.0)
Type : 13 (0xD) -> Undocumented | Size : 0x384 (900) 
AddressOfRawData : 0xFDD188 | PointerToRawData : 0xFDBF88
...
[c] Unity Version 2018.1.6f1_57cc34175ccf detected
[!] File appears to have no protection or is using an unknown protection
--- snip ---

Using the Unity version info I found matching debug symbols (compressed symbol
store)

https://symbolserver.unity3d.com/

https://symbolserver.unity3d.com/UnityPlayer_Win32_x86_mono.pdb/E6BD450AB37547BCA34A5DC610712C8C1/

For the Mono dll itself it wasn't possible to find matching debug symbols hence
I figured out a compatible version which works with the launcher (replaced it):

https://symbolserver.unity3d.com/mono-2.0-bdwgc.dll/5B7AB7E43de000/
https://symbolserver.unity3d.com/mono-2.0-bdwgc.pdb/3B9BBE126F9746AA9EEA4AB6ED69E2071/

The problem seems to be a rare occurrence when a Mono-managed worker thread
fails to suspend in time (at least what Wine thinks is a reasonable amount ->
100ms). During application lifetime there are many "world-stop" garbage
collection events when all registered threads are to be suspended.

Mono itself has some debugging facilities:

--- snip ---
$ export MONO_LOG_DEST=syslog
$ export MONO_LOG_MASK=gc
$ export MONO_LOG_LEVEL=debug

$ export GC_LOOP_ON_ABORT=1
$ export GC_PRINT_VERBOSE_STATS=1 
...
--- snip ---

MTGA.gc.log

--- snip ---
...
Grow heap to 256 KiB after 0 bytes allocated
Initiating full world-stop collection!
0 bytes in heap blacklisted for interior pointers
Found new system malloc AllocationBase at 00110000

--> Marking for collection #1 after 0 allocated bytes
Pushed 1 thread stacks
GC #1 freed 0 bytes, heap 256 KiB (+ 0 KiB unmapped)
World-stopped marking took 1 msecs (1 in average)
Bytes recovered before sweep - f.l. count = 0
In-use heap: 0% (0 KiB pointers + 0 KiB other)
Immediately reclaimed 0 bytes, heapsize: 262144 bytes (0 unmapped)
0 finalization entries; 0/0 short/long disappearing links alive
0 finalization-ready objects; 0/0 short/long links cleared
Finalize plus initiate sweep took 0 + 0 msecs
Complete collection took 1 msecs
Adding block map for size of 170 granules (1360 bytes)
Adding block map for size of 42 granules (336 bytes)
Adding block map for size of 1 granules (8 bytes)
Adding block map for size of 5 granules (40 bytes)
Adding block map for size of 4 granules (32 bytes)
Adding block map for size of 12 granules (96 bytes) 
...
--> Marking for collection #3131 after 12029608 allocated bytes
Pushed 18 thread stacks
GC #3131 freed 1291456 bytes, heap 52444 KiB (+ 0 KiB unmapped)
World-stopped marking took 17 msecs (2 in average)
Bytes recovered before sweep - f.l. count = -174624
In-use heap: 84% (12547 KiB pointers + 31721 KiB other)
Immediately reclaimed -166432 bytes, heapsize: 53702656 bytes (0 unmapped)
899 finalization entries; 9/5 short/long disappearing links alive
88 finalization-ready objects; 0/0 short/long links cleared
Finalize plus initiate sweep took 0 + 0 msecs
Complete collection took 17 msecs
Grow heap to 68828 KiB after 2090736 bytes allocated
Initiating full world-stop collection!
163840 bytes in heap blacklisted for interior pointers

--> Marking for collection #3132 after 2825704 allocated bytes
Pushed 18 thread stacks
GC #3132 freed 1279720 bytes, heap 68828 KiB (+ 0 KiB unmapped)
World-stopped marking took 14 msecs (2 in average)
Bytes recovered before sweep - f.l. count = -119608
In-use heap: 65% (13151 KiB pointers + 32137 KiB other)
Immediately reclaimed 30184 bytes, heapsize: 70479872 bytes (0 unmapped)
909 finalization entries; 9/5 short/long disappearing links alive
142 finalization-ready objects; 0/0 short/long links cleared
Finalize plus initiate sweep took 0 + 0 msecs
Complete collection took 14 msecs
Grew fo table to 8192 entries
Grew fo table to 16384 entries
--- snip --- 

Relevant thread callstacks gathered using winedbg and x32dbg on the rare
failing case (manually annotated):

--- snip ---
Backtracing for thread 0073 in process 0008 (C:\Program Files\Wizards of the
Coast\MTGA\MTGA.exe):
Backtrace:
=>0 0xf7f4a047 __kernel_vsyscall+0x7() in [vdso].so (0x093bea08)
  1 0xf7d178d7 __libc_read+0x66() in libpthread.so.0 (0x093bea08)
  2 0x7bc90ec6 wait_select_reply+0x1e()
[/home/focht/projects/wine/mainline-src/dlls/ntdll/server.c:359] in ntdll
(0x093bea08)
  3 0x7bc91d8a server_select+0x163()
[/home/focht/projects/wine/mainline-src/dlls/ntdll/server.c:618] in ntdll
(0x093beb18)
  4 0x7bc9c208 wait_objects+0xb6()
[/home/focht/projects/wine/mainline-src/dlls/ntdll/sync.c:1033] in ntdll
(0x093bec58)
  5 0x7bc9c246 NtWaitForMultipleObjects+0x38(wait_any=1, alertable=0)
[/home/focht/projects/wine/mainline-src/dlls/ntdll/sync.c:1044] in ntdll
(0x093bec98)
  6 0x7b47ec0d WaitForMultipleObjectsEx+0xad()
[/home/focht/projects/wine/mainline-src/dlls/kernel32/sync.c:221] in kernel32
(0x093bedf8)
  7 0xf7aa284e X11DRV_MsgWaitForMultipleObjectsEx+0xcd()
[/home/focht/projects/wine/mainline-src/dlls/winex11.drv/event.c:497] in
winex11 (0x093bee48)
  8 0x7e8dfa3a wait_message+0x21()
[/home/focht/projects/wine/mainline-src/dlls/user32/winproc.c:1164] in user32
(0x093bee98)
  9 0x7e8a2e98 wait_objects+0xe0()
[/home/focht/projects/wine/mainline-src/dlls/user32/message.c:3080] in user32
(0x093bef48)
  10 0x7e8a45d7 GetMessageW+0xe7()
[/home/focht/projects/wine/mainline-src/dlls/user32/message.c:3841] in user32
(0x093bef88)
  11 0x7e86ba0d DIALOG_DoDialogBox+0xb3()
[/home/focht/projects/wine/mainline-src/dlls/user32/dialog.c:798] in user32
(0x093befe8)
  12 0x7e86bd00 DialogBoxIndirectParamAorW+0x54()
[/home/focht/projects/wine/mainline-src/dlls/user32/dialog.c:875] in user32
(0x093bf018)
  13 0x7e86bd79 DialogBoxIndirectParamW+0x2a()
[/home/focht/projects/wine/mainline-src/dlls/user32/dialog.c:895] in user32
(0x093bf058)
  14 0x7e8a8ee9 MessageBoxIndirectW+0x115()
[/home/focht/projects/wine/mainline-src/dlls/user32/msgbox.c:541] in user32
(0x093bf0c8)
  15 0x7e8a8d73 MessageBoxIndirectA+0x100()
[/home/focht/projects/wine/mainline-src/dlls/user32/msgbox.c:503] in user32
(0x093bf148)
  16 0x7e8a8af1 MessageBoxExA+0x66(langid=0)
[/home/focht/projects/wine/mainline-src/dlls/user32/msgbox.c:419] in user32
(0x093bf1b8)
  17 0x7e8a8a48 MessageBoxA+0x27()
[/home/focht/projects/wine/mainline-src/dlls/user32/msgbox.c:387] in user32
(0x093bf1f8)
  18 0x0202f8c7 in mono-2.0-bdwgc (+0x25f8c6) (0x093bf220) ;
<mono-2.0-bdwgc.GC_default_on_abort>
  19 0x020389e2 in mono-2.0-bdwgc (+0x2689e1) (0x093bf510) ;
<mono-2.0-bdwgc.GC_push_stack_for>
  20 0x0203800d in mono-2.0-bdwgc (+0x26800c) (0x093bf53c) ;
<mono-2.0-bdwgc.GC_push_all_stacks>
  21 0x020369a5 in mono-2.0-bdwgc (+0x2669a4) (0x093bf554) ;
<mono-2.0-bdwgc.GC_mark_some_inner>
  22 0x0203686b in mono-2.0-bdwgc (+0x26686a) (0x093bf58c) ;
<mono-2.0-bdwgc.GC_mark_some>
  23 0x0203a7fe in mono-2.0-bdwgc (+0x26a7fd) (0x093bf5a8) ;
<mono-2.0-bdwgc.GC_stopped_mark>
  24 0x0203ada1 in mono-2.0-bdwgc (+0x26ada0) (0x093bf5c0) ;
<mono-2.0-bdwgc.GC_try_to_collect_inner>
  25 0x02031689 in mono-2.0-bdwgc (+0x261688) (0x093bf5d4) ;
<mono-2.0-bdwgc.GC_collect_or_expand>
  26 0x020308bf in mono-2.0-bdwgc (+0x2608be) (0x093bf5f4) ;
<mono-2.0-bdwgc.GC_allocobj>
  27 0x02033f7c in mono-2.0-bdwgc (+0x263f7b) (0x093bf610) ;
<mono-2.0-bdwgc.GC_generic_malloc_inner>
  28 0x02033a40 in mono-2.0-bdwgc (+0x263a3f) (0x093bf62c) ;
<mono-2.0-bdwgc.GC_gcj_malloc>
  29 0x01ebd012 in mono-2.0-bdwgc (+0xed011) (0x093bf648)  ;
<mono-2.0-bdwgc.mono_gc_alloc_obj>
  30 0x01e6db9b in mono-2.0-bdwgc (+0x9db9a) (0x093bf664)  ;
<mono-2.0-bdwgc.mono_object_new_alloc_specific_checked>
  31 0x01e6df7c in mono-2.0-bdwgc (+0x9df7b) (0x093bf680)  ;
<mono-2.0-bdwgc.mono_object_new_specific_checked>
  32 0x01e72083 in mono-2.0-bdwgc (+0xa2082) (0x093bf6c8)  ;
<mono-2.0-bdwgc.ves_icall_object_new_specific>
  33 0x003f1021 (0x093bf718)                               ; Mono JIT Code
  34 0x08c2fd80 (0x093bf748)                               ; Mono JIT Code
  35 0x08ed3e80 (0x093bf778)                               ; Mono JIT Code
  36 0x08ed3b38 (0x093bf808)                               ; Mono JIT Code
  37 0x08ed391b (0x093bf858)                               ; Mono JIT Code
  38 0x08ee4199 (0x093bf8f8)                               ; Mono JIT Code
  39 0x08edc030 (0x093bf958)                               ; Mono JIT Code
  40 0x08edba81 (0x093bf988)                               ; Mono JIT Code
  41 0x08edb85c (0x093bf9e8)                               ; Mono JIT Code
  42 0x08c29e4c (0x093bfa2c)                               ; Mono JIT Code
  43 0x01ed06df in mono-2.0-bdwgc (+0x1006de) (0x093bfa6c) ;
<mono-2.0-bdwgc.mono_jit_runtime_invoke>
  44 0x01e687d7 in mono-2.0-bdwgc (+0x987d6) (0x093bfa94)  ;
<mono-2.0-bdwgc.do_runtime_invoke>
  45 0x01e6fb9b in mono-2.0-bdwgc (+0x9fb9a) (0x093bfae4)  ;
<mono-2.0-bdwgc.mono_runtime_try_invoke_array>
  46 0x01e6c73f in mono-2.0-bdwgc (+0x9c73e) (0x093bfb4c)  ;
<mono-2.0-bdwgc.mono_message_invoke>
  47 0x01e71e63 in mono-2.0-bdwgc (+0xa1e62) (0x093bfba8)  ;
<mono-2.0-bdwgc.ves_icall_System_Runtime_Remoting_Messaging_AsyncResult_Invoke>
  48 0x08e6c948 (0x093bfbf8)                               ; Mono JIT Code
  49 0x08e6c89c (0x093bfc18)                               ; Mono JIT Code
  50 0x08d4a1f8 (0x093bfcb8)                               ; Mono JIT Code
  51 0x08d49c8c (0x093bfcf8)                               ; Mono JIT Code
  52 0x0711026d (0x093bfd44)                               ; Mono JIT Code
  53 0x01ed06df in mono-2.0-bdwgc (+0x1006de) (0x093bfd84) ;
<mono-2.0-bdwgc.mono_jit_runtime_invoke>
  54 0x01e687d7 in mono-2.0-bdwgc (+0x987d6) (0x093bfdac)  ;
<mono-2.0-bdwgc.do_runtime_invoke>
  55 0x01e879f1 in mono-2.0-bdwgc (+0xb79f0) (0x093bfddc)  ;
<mono-2.0-bdwgc.try_invoke_perform_wait_callback>
  56 0x01e8822c in mono-2.0-bdwgc (+0xb822b) (0x093bfe48)  ;
<mono-2.0-bdwgc.worker_callback>
  57 0x01e8a791 in mono-2.0-bdwgc (+0xba790) (0x093bfe70)  ;
<mono-2.0-bdwgc.worker_thread>
  58 0x01e85431 in mono-2.0-bdwgc (+0xb5430) (0x093bfecc)  ;
<mono-2.0-bdwgc.start_wrapper_internal>
  59 0x01e8521c in mono-2.0-bdwgc (+0xb521b) (0x093bfedc)  ;
<mono-2.0-bdwgc.start_wrapper>
  60 0x7bc97970 call_thread_func_wrapper+0xb() in ntdll (0x093bfeec)
  61 0x7bc979d6 call_thread_func+0x63()
[/home/focht/projects/wine/mainline-src/dlls/ntdll/signal_i386.c:2654] in ntdll
(0x093bffdc)
  62 0x7bc97962 call_thread_entry+0x9() in ntdll (0x093bffec) 
--- snip ---

https://github.com/mono/mono/blob/c5b88ec4f323f2bdb7c7d0a595ece28dae66579c/libgc/win32_threads.c#L424

--- snip ---
void GC_push_all_stacks()
{
  DWORD thread_id = GetCurrentThreadId();
  GC_bool found_me = FALSE;
  int i;
  int dummy;
  ptr_t sp, stack_min;
  GC_thread thread;
  LONG my_max = GC_get_max_thread_index();

  for (i = 0; i <= my_max; i++) {
    thread = thread_table + i;
    if (thread -> in_use && thread -> stack_base) {
      if (thread -> id == thread_id) {
    sp = (ptr_t) &dummy;
    found_me = TRUE;
      } else {
        CONTEXT context;
        context.ContextFlags = CONTEXT_INTEGER|CONTEXT_CONTROL;
        if (!GetThreadContext(thread_table[i].handle, &context))
ABORT("GetThreadContext failed");
...
--- snip ---

Offending managed thread with late suspension:

--- snip ---
Backtracing for thread 0056 in process 0008 (C:\Program Files\Wizards of the
Coast\MTGA\MTGA.exe):
Backtrace:
=>0 0xf7f8c049 __kernel_vsyscall+0x9() in [vdso].so (0x0a65f5d8)
  1 0xf7d599c7 __close+0x56() in libpthread.so.0 (0x0a65f5d8)
  2 0x7bc76adc close_handle+0x8d()
[/home/focht/projects/wine/mainline-src/dlls/ntdll/om.c:399] in ntdll
(0x0a65f5d8)
  3 0x7bc76bf0 NtClose+0x1c()
[/home/focht/projects/wine/mainline-src/dlls/ntdll/om.c:436] in ntdll
(0x0a65f5f8)
  4 0x7b47298c CloseHandle+0x104()
[/home/focht/projects/wine/mainline-src/dlls/kernel32/process.c:3352] in
kernel32 (0x0a65f678)
  5 0x01ebe93d in mono-2.0-bdwgc (+0xee93c) (0x0a65f694)             ;
CloseHandle
  6 0x01e1bef6 in mono-2.0-bdwgc (+0x4bef5) (0x0a65f6a8)             ;
mono-2.0-bdwgc.mono_w32file_close+1D
  7 0x071251d8 (0x0a65f6f8)                                          ;
mono-2.0-bdwgc.ves_icall_System_IO_MonoIO_Close+16
  8 0x003fcd38 (0x0a65f758)                                          ; Mono JIT
Code
  9 0x0712517c (0x0a65f778)                                          ; Mono JIT
Code
  10 0x08fe66e4 (0x0a65f798)                                         ; Mono JIT
Code
  11 0x090051bf (0x0a65f858)                                         ; Mono JIT
Code
  12 0x090045a8 (0x0a65f8f8)                                         ; Mono JIT
Code
  13 0x08ffd4c0 (0x0a65f958)                                         ; Mono JIT
Code
  14 0x08ffcfc1 (0x0a65f988)                                         ; Mono JIT
Code
  15 0x08ffcd9c (0x0a65f9e8)                                         ; Mono JIT
Code
  16 0x08839e4c (0x0a65fa2c)                                         ; Mono JIT
Code
  17 0x01ed06df in mono-2.0-bdwgc (+0x1006de) (0x0a65fa6c)           ;
mono-2.0-bdwgc.retry+9
  18 0x01e687d7 in mono-2.0-bdwgc (+0x987d6) (0x0a65fa94)            ;
mono-2.0-bdwgc.mono_jit_runtime_invoke+2FF
  19 0x01e6fb9b in mono-2.0-bdwgc (+0x9fb9a) (0x0a65fae4)            ;
mono-2.0-bdwgc.do_runtime_invoke+57
  20 0x01e6c73f in mono-2.0-bdwgc (+0x9c73e) (0x0a65fb4c)            ;
mono-2.0-bdwgc.mono_runtime_try_invoke_array+38B
  21 0x01e71e63 in mono-2.0-bdwgc (+0xa1e62) (0x0a65fba8)            ;
mono-2.0-bdwgc.mono_message_invoke+1EF
  22 0x0885d878 (0x0a65fbf8)                                         ;
mono-2.0-bdwgc.ves_icall_System_Runtime_Remoting_Messaging_AsyncResult_Invoke+B3
  23 0x0885d7cc (0x0a65fc18)                                         ; Mono JIT
Code
  24 0x0884a7a8 (0x0a65fcb8)                                         ; Mono JIT
Code
  25 0x0884a23c (0x0a65fcf8)                                         ; Mono JIT
Code
  26 0x07110685 (0x0a65fd44)                                         ; Mono JIT
Code                       
  27 0x01ed06df in mono-2.0-bdwgc (+0x1006de) (0x0a65fd84)           ;
mono-2.0-bdwgc.retry+9
  28 0x01e687d7 in mono-2.0-bdwgc (+0x987d6) (0x0a65fdac)            ;
mono-2.0-bdwgc.mono_jit_runtime_invoke+2FF
  29 0x01e879f1 in mono-2.0-bdwgc (+0xb79f0) (0x0a65fddc)            ;
mono-2.0-bdwgc.do_runtime_invoke+57
  30 0x01e8822c in mono-2.0-bdwgc (+0xb822b) (0x0a65fe48)            ;
mono-2.0-bdwgc.try_invoke_perform_wait_callback+41
  31 0x01e8a791 in mono-2.0-bdwgc (+0xba790) (0x0a65fe70)            ;
mono-2.0-bdwgc.worker_callback+31C
  32 0x01e85431 in mono-2.0-bdwgc (+0xb5430) (0x0a65fecc)            ;
mono-2.0-bdwgc.worker_thread+241
  33 0x01e8521c in mono-2.0-bdwgc (+0xb521b) (0x0a65fedc)            ;
mono-2.0-bdwgc.start_wrapper_internal+1F1
  34 0x7bc979e0 call_thread_func_wrapper+0xb() in ntdll (0x0a65feec) ;
mono-2.0-bdwgc.start_wrapper+3C
  35 0x7bc97a46 call_thread_func+0x63()
[/home/focht/projects/wine/mainline-src/dlls/ntdll/signal_i386.c:2654] in ntdll
(0x0a65ffdc)
  36 0x7bc979d2 call_thread_entry+0x9() in ntdll (0x0a65ffec) 
--- snip ---

Trace log, showing client/server side interaction, with additional diagnostics
added.

NOTE: Any kind of tracing/debugging causes interference (timing) and makes it
even harder to reproduce.

--- snip ---
$ pwd
/home/focht/.wine/drive_c/Program Files (x86)/Wizards of the Coast/MTGA

# clean download cache to force re-download
$ rm -rf MTGA_Data/Downloads/

$ WINEDEBUG=+thread,+loaddll,+process,+msgbox,+server wine ./MTGA.exe >>log.txt
2>&1
...
0055: new_thread( process=ffffffff, access=001fffff, suspend=1, request_fd=184,
objattr={} )
0055: new_thread() = 0 { tid=0056, handle=03e0 }
0055:fixme:thread:RtlCreateUserThread new thread -> tid=0x56, handle=0x3e0,
entry=0x1e851e0
0055: resume_thread( handle=03e0 )
0055: resume_thread() = 0 { count=1 }
0056: *fd* 192 <- 253
0055: close_handle( handle=03e0 )
0055: close_handle() = 0
0056: *fd* 194 <- 254
0056: init_thread( unix_pid=3804, unix_tid=3884, debug_level=1, teb=81f58000,
entry=01e851e0, reply_fd=192, wait_fd=194, cpu=x86 )
0056: init_thread() = 0 { pid=0008, tid=0056, server_start=1d458de30de85cc
(-3.5809720), info_size=0, version=572, all_cpus=00000001, suspend=0 }
0055: select( flags=2, cookie=092bf70c, timeout=infinite, prev_apc=0000,
result={}, data={WAIT,handles={03dc}} )
0055: select() = PENDING { timeout=infinite, call={APC_NONE}, apc_handle=0000 } 
...
0056: create_file( access=80100080, sharing=00000001, create=1,
options=00000060, attrs=00000080,
objattr={rootdir=0000,attributes=00000040,sd={},name=L""},
filename="/home/focht/.wine/dosdevices/c:/Program Files/Wizards of the
Coast/MTGA/MTGA_Data/Downloads/Audio/TEMP__BAT_DAR_0f5927532656771f5592ba8b19fc4e51.mtga"
)
0056: create_file() = 0 { handle=1510 }
0056: get_handle_fd( handle=1510 )
0056: *fd* 1510 -> 54
0056: get_handle_fd() = 0 { type=1, cacheable=1, access=00120089,
options=00000060 }
0056: create_file( access=c0100080, sharing=00000000, create=5,
options=00000060, attrs=00000080,
objattr={rootdir=0000,attributes=00000040,sd={},name=L""},
filename="/home/focht/.wine/dosdevices/c:/Program Files/Wizards of the
Coast/MTGA/MTGA_Data/Downloads/Audio/BAT_DAR_0f5927532656771f5592ba8b19fc4e51.mtga"
)
0056: create_file() = 0 { handle=1514 }
0056: get_handle_fd( handle=1514 )
0056: *fd* 1514 -> 55
0056: get_handle_fd() = 0 { type=1, cacheable=1, access=0012019f,
options=00000060 }
0056: select( flags=0, cookie=0a65f0dc, timeout=+5.0000000, prev_apc=0000,
result={}, data={WAIT,handles={103c}} )
0056: select() = 0 { timeout=1d458de4db5b238 (+5.0000000), call={APC_NONE},
apc_handle=0000 }
0056: release_semaphore( handle=103c, count=00000001 )
0056: release_semaphore() = 0 { prev_count=00000000 }
...
0062: get_thread_info( handle=03ec, tid_in=0000 )
0062: get_thread_info() = 0 { pid=0008, tid=0056, teb=81f58000,
entry_point=01e851e0, affinity=000000ff, exit_code=259, priority=0, last=0 }
0055: select( flags=2, cookie=81f5f16c, timeout=0, prev_apc=0000, result={},
data={} )
0055: select() = PENDING { timeout=1d458de4acd8910 (+0.0000000),
call={APC_NONE}, apc_handle=0000 }
0062: suspend_thread( handle=03ec )
0056: *sent signal* signal=10
0062: suspend_thread() = 0 { count=0 } 
...
0062: get_thread_context( handle=012c, flags=00000003, suspend=1 )
0062: get_thread_context() = 0 { self=0,
context={cpu=x86,eip=f7f8c047,esp=0033f590,ebp=0033f5f8,eflags=00200286,cs=0023,ss=002b,eax=00000003,ebx=00000007,ecx=0033f5d8,edx=00000010,esi=f7d65000,edi=00000000}
}
0062: get_thread_context( handle=0170, flags=00000003, suspend=1 )
0062: get_thread_context() = 0 { self=0,
context={cpu=x86,eip=f7f8c047,esp=0267f930,ebp=0267f998,eflags=00000286,cs=0023,ss=002b,eax=00000003,ebx=00000011,ecx=0267f978,edx=00000010,esi=f7d65000,edi=00000000}
}
0062: get_thread_context( handle=0344, flags=00000003, suspend=1 )
0062: get_thread_context() = 0 { self=0,
context={cpu=x86,eip=f7f8c047,esp=0944f9f0,ebp=0944fa58,eflags=00000286,cs=0023,ss=002b,eax=00000003,ebx=000000a7,ecx=0944fa38,edx=00000010,esi=f7d65000,edi=00000000}
}
0062: get_thread_context( handle=0398, flags=00000003, suspend=1 )
0062: get_thread_context() = 0 { self=0,
context={cpu=x86,eip=f7f8c047,esp=08e8f4c0,ebp=08e8f528,eflags=00000282,cs=0023,ss=002b,eax=00000003,ebx=000000b5,ecx=08e8f508,edx=00000010,esi=f7d65000,edi=00000000}
}
0062: get_thread_context( handle=03b8, flags=00000003, suspend=1 )
0062: get_thread_context() = 0 { self=0,
context={cpu=x86,eip=f7f8c047,esp=08f9fab0,ebp=08f9fb18,eflags=00000282,cs=0023,ss=002b,eax=00000003,ebx=000000b9,ecx=08f9faf8,edx=00000010,esi=f7d65000,edi=00000000}
}
0062: get_thread_context( handle=03d0, flags=00000003, suspend=1 )
0062: get_thread_context() = 0 { self=0,
context={cpu=x86,eip=f7f8c047,esp=092bfa60,ebp=092bfac8,eflags=00000286,cs=0023,ss=002b,eax=00000003,ebx=000000be,ecx=092bfaa8,edx=00000010,esi=f7d65000,edi=00000000}
}
0062: get_thread_context( handle=03ec, flags=00000003, suspend=1 )
0062: get_thread_context() = PENDING { self=0, context={} }
0062: get_thread_context( handle=03ec, flags=00000003, suspend=0 )
0062: get_thread_context() = PENDING { self=0, context={} }
0062:fixme:thread:get_thread_context **STATUS_PENDING** try 0, handle=0x3ec
0062: get_thread_context( handle=03ec, flags=00000003, suspend=0 )
0062: get_thread_context() = PENDING { self=0, context={} }
0062:fixme:thread:get_thread_context **STATUS_PENDING** try 1, handle=0x3ec
0062: get_thread_context( handle=03ec, flags=00000003, suspend=0 )
0062: get_thread_context() = PENDING { self=0, context={} }
0062:fixme:thread:get_thread_context **STATUS_PENDING** try 2, handle=0x3ec
0062: get_thread_context( handle=03ec, flags=00000003, suspend=0 )
0062: get_thread_context() = PENDING { self=0, context={} } 
...
0062:fixme:thread:get_thread_context **STATUS_PENDING** try 96, handle=0x3ec
0062: get_thread_context( handle=03ec, flags=00000003, suspend=0 )
0062: get_thread_context() = PENDING { self=0, context={} }
0062:fixme:thread:get_thread_context **STATUS_PENDING** try 97, handle=0x3ec
0062: get_thread_context( handle=03ec, flags=00000003, suspend=0 )
0062: get_thread_context() = PENDING { self=0, context={} }
0062:fixme:thread:get_thread_context **STATUS_PENDING** try 98, handle=0x3ec
0062: get_thread_context( handle=03ec, flags=00000003, suspend=0 )
0062: get_thread_context() = PENDING { self=0, context={} }
0062:fixme:thread:get_thread_context **STATUS_PENDING** try 99, handle=0x3ec
0062:fixme:thread:get_thread_context ***STATUS_PENDING*** exceeded,
handle=0x3ec
0062: get_desktop_window( force=0 )
0062: get_desktop_window() = 0 { top_window=00010020, msg_window=00010026 }
...
0056: set_suspend_context(
context={cpu=x86,eip=f7f8c049,esp=0a65f400,ebp=0a65f5d8,eflags=00000286,cs=0023,ss=002b,ds=002b,es=002b,fs=0063,gs=006b,eax=00000000,ebx=000000fb,ecx=00000002,edx=00000000,esi=f7d65000,edi=00000000,dr0=00000000,dr1=00000000,dr2=00000000,dr3=00000000,dr6=00000000,dr7=00000000,fp.ctrl=ffff027f,fp.status=ffff0120,fp.tag=ffffffff,fp.err_off=0884b3cb,fp.err_sel=00000023,fp.data_off=0a65fbb0,fp.data_sel=ffff002b,fp.cr0npx=00000120,fp.reg0=-128,fp.reg1=0,fp.reg2=1,fp.reg3=11,fp.reg4=0.232558,fp.reg5=1,fp.reg6=16,fp.reg7=12.5871,extended={...}}
)
0056: set_suspend_context() = 0
0056: select( flags=2, cookie=81f5b16c, timeout=0, prev_apc=0000, result={},
data={} )
0056: select() = PENDING { timeout=1d458de4ae06a4e (+0.0000000),
call={APC_NONE}, apc_handle=0000 }  
--- snip ---

The thread suspension happens - but very late. Too late for
get_thread_context() to succeed. I ran some additional timing/statistics with a
small patch applied, showing the outliers:

--- snip ---
$ WINEDEBUG=+thread,+timestamp wine ./MTGA.exe
...
177.921:005e:fixme:thread:get_thread_context **STATUS_PENDING** try 1,
handle=0x550
...
177.936:005e:fixme:thread:get_thread_context **STATUS_PENDING** try 14,
handle=0x550
...
214.062:0062:fixme:thread:get_thread_context **STATUS_PENDING** try 1,
handle=0x52c
...
215.253:0054:fixme:thread:get_thread_context **STATUS_PENDING** try 153,
handle=0x52c
...
272.157:0062:fixme:thread:get_thread_context **STATUS_PENDING** try 1,
handle=0x3d8
...
272.221:0062:fixme:thread:get_thread_context **STATUS_PENDING** try 49,
handle=0x3d8
...
324.818:0057:fixme:thread:get_thread_context **STATUS_PENDING** try 1,
handle=0x4ac
...
325.047:0057:fixme:thread:get_thread_context **STATUS_PENDING** try 208,
handle=0x4ac 
--- snip ---

There are some very rare outliers that go into range of 150ms-250ms.
Unfortunately it's not really possible to figure out what kind of managed code
is executed at that time. As soon as a native/managed debugger is active it
will always end up in final suspended state at the next interruption
checkpoint. There is no "peek at running code".
Although Mono itself has many interruption checkpoints on its own to manage
thread suspension at managed code level there are likely some code sequences
that run largely uninterruptable (no API callouts that end up in syscalls).
Wine 's thread context retrieval mechanism relies on timely SIGUSR1 delivery to
the target thread which in turn calls wait_suspend() to provide a suspend
context on the server side.

I increased the wait time for thread suspension a bit more (loop 500 times ->
500ms).

https://source.winehq.org/git/wine.git/blob/HEAD:/dlls/ntdll/thread.c#l915

--- snip ---
 915 NTSTATUS get_thread_context( HANDLE handle, context_t *context, unsigned
int flags, BOOL *self )
 916 {
 917     NTSTATUS ret;
 918     DWORD dummy, i;
 919 
 920     SERVER_START_REQ( get_thread_context )
 921     {
 922         req->handle  = wine_server_obj_handle( handle );
 923         req->flags   = flags;
 924         req->suspend = 1;
 925         wine_server_set_reply( req, context, sizeof(*context) );
 926         ret = wine_server_call( req );
 927         *self = reply->self;
 928     }
 929     SERVER_END_REQ;
 930 
 931     if (ret == STATUS_PENDING)
 932     {
 933         for (i = 0; i < 100; i++)
 934         {
 935             SERVER_START_REQ( get_thread_context )
 936             {
 937                 req->handle  = wine_server_obj_handle( handle );
 938                 req->flags   = flags;
 939                 req->suspend = 0;
 940                 wine_server_set_reply( req, context, sizeof(*context) );
 941                 ret = wine_server_call( req );
 942             }
 943             SERVER_END_REQ;
 944             if (ret == STATUS_PENDING)
 945             {
 946                 LARGE_INTEGER timeout;
 947                 timeout.QuadPart = -10000;
 948                 NtDelayExecution( FALSE, &timeout );
 949             }
 950             else break;
 951         }
 952         NtResumeThread( handle, &dummy );
 953         if (ret == STATUS_PENDING) ret = STATUS_ACCESS_DENIED;
 954     }
 955     return ret;
 956 }
--- snip ---

It helped the app, I didn't encounter a single failure anymore. I also put a
WARN hint as developer diagnostics if the status was still STATUS_PENDING out
of the loop to indicate something is fishy (thread has no interruption
checkpoints?).

Again: the root cause why thread suspension in GC "world-stop collection" case
sometimes takes very long is unknown (within Mono managed code).

$ sha1sum MTGAInstaller.exe 
bdc80950b31353e664f48154a9c22621d93b0737  MTGAInstaller.exe

$ du -sh MTGAInstaller.exe 
3.7M    MTGAInstaller.exe

$ wine --version
wine-3.17

Regards

-- 
Do not reply to this email, post in Bugzilla using the
above URL to reply.
You are receiving this mail because:
You are watching all bug changes.



More information about the wine-bugs mailing list