With GitLab and the spinning off of commit messages to wine-gitlab I
think the traffic estimates on the mailing list page need to be
updated... at least for wine-devel: it no longer has 50 messages per
day.
https://www.winehq.org/forums
(unfortunately there is no unit on the little per mailing-list graph)
--
Francois Gouget <fgouget(a)free.fr> http://fgouget.free.fr/
Theory is where you know everything but nothing works.
Practice is where everything works but nobody knows why.
Sometimes they go hand in hand: nothing works and nobody knows why.
Binary packages for various distributions will be available from:
https://www.winehq.org/download
Summary since last release
* Rebased to current wine 9.0 (505 patches are applied to wine vanilla)
Upstreamed (Either directly from staging or fixed with a similar patch).
* None
Added:
* None
Updated:
* vkd3d-latest
Where can you help
* Run Steam/Battle.net/GOG/UPlay/Epic
* Test your favorite game.
* Test your favorite applications.
* Improve staging patches and get them accepted upstream.
* Suggest patches to be included in staging.
As always, if you find a bug, please report it via
https://bugs.winehq.org
Best Regards
Alistair.
context: https://bugs.winehq.org/show_bug.cgi?id=55897
While implementing CopyFile2, I noticed that CopyFileEx does not respect the progress callback and never calls it. The tests too assume that it will not be invoked other than in the case of a file having multiple streams, which is not something that Wine supports.
There seem to have been multiple efforts to implement this functionality (in 2022 and 2013), but I can't find any explanation of why they were not successful. Would implementing this be a desirable contribution?
I don't know how to get assigned to an issue on Bugzilla. I read the Wine Developers Wiki but it doesn't seem to have the information, so I left a comment in the bug log itself.
This patch series introduces a new char misc driver, /dev/ntsync, which is used
to implement Windows NT synchronization primitives.
== Background ==
The Wine project emulates the Windows API in user space. One particular part of
that API, namely the NT synchronization primitives, has historically been
implemented via RPC to a dedicated "kernel" process. However, more recent
applications use these APIs more strenuously, and the overhead of RPC has become
a bottleneck.
The NT synchronization APIs are too complex to implement on top of existing
primitives without sacrificing correctness. Certain operations, such as
NtPulseEvent() or the "wait-for-all" mode of NtWaitForMultipleObjects(), require
direct control over the underlying wait queue, and implementing a wait queue
sufficiently robust for Wine in user space is not possible. This proposed
driver, therefore, implements the problematic interfaces directly in the Linux
kernel.
This driver was presented at Linux Plumbers Conference 2023. For those further
interested in the history of synchronization in Wine and past attempts to solve
this problem in user space, a recording of the presentation can be viewed here:
https://www.youtube.com/watch?v=NjU4nyWyhU8
== Performance ==
The gain in performance varies wildly depending on the application in question
and the user's hardware. For some games NT synchronization is not a bottleneck
and no change can be observed, but for others frame rate improvements of 50 to
150 percent are not atypical. The following table lists frame rate measurements
from a variety of games on a variety of hardware, taken by users Dmitry
Skvortsov, FuzzyQuils, OnMars, and myself:
Game                            Upstream        ntsync          improvement
===========================================================================
Anger Foot                      69              99              43%
Call of Juarez                  99.8            224.1           125%
Dirt 3                          110.6           860.7           678%
Forza Horizon 5                 108             160             48%
Lara Croft: Temple of Osiris    141             326             131%
Metro 2033                      164.4           199.2           21%
Resident Evil 2                 26              77              196%
The Crew                        26              51              96%
Tiny Tina's Wonderlands         130             360             177%
Total War Saga: Troy            109             146             34%
===========================================================================
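For readers checking the table: the improvement column appears to be the relative frame rate gain, (ntsync - upstream) / upstream, rounded to the nearest percent. A small sketch of that arithmetic follows; the function name is made up for illustration and is not part of the series.

```c
#include <assert.h>

/*
 * Relative gain of `after` over `before`, in percent, rounded to the
 * nearest integer. Hypothetical helper; assumes before > 0 and
 * after >= before, which holds for every row in the table.
 */
long improvement_percent(double before, double after)
{
    assert(before > 0.0 && after >= before);
    return (long)((after - before) / before * 100.0 + 0.5);
}
```

For example, the Dirt 3 row (110.6 to 860.7) works out to 678 and the Metro 2033 row (164.4 to 199.2) to 21.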
== Patches ==
The semantics of the patches are broadly intended to match those of the
corresponding Windows functions. For those not already familiar with the Windows
functions (or their undocumented behaviour), patch 29/29 provides a detailed
specification, and individual patches also include a brief description of the
API they are implementing.
The patches making use of this driver in Wine can be retrieved or browsed here:
https://repo.or.cz/wine/zf.git/shortlog/refs/heads/ntsync5
== Implementation ==
Some aspects of the implementation may deserve particular comment:
* In the interest of performance, each object is governed only by a single
spinlock. However, NTSYNC_IOC_WAIT_ALL requires that the state of multiple
objects be changed as a single atomic operation. In order to achieve this, we
first take a device-wide lock ("wait_all_lock") any time we are going to lock
more than one object at a time.
The maximum number of objects that can be used in a vectored wait, and
therefore the maximum that can be locked simultaneously, is 64. This number is
NT's own limit.
The acquisition of multiple spinlocks will degrade performance. This is a
conscious choice, however. Wait-for-all is known to be a very rare operation
in practice, especially with counts that approach the maximum, and it is the
intent of the ntsync driver to optimize wait-for-any at the expense of
wait-for-all as much as possible.
* NT mutexes are tied to their threads on an OS level, and the kernel includes
builtin support for "robust" mutexes. In order to keep the ntsync driver
self-contained and avoid touching more code than necessary, it does not hook
into task exit nor use pids.
Instead, the user space emulator is expected to manage thread IDs and pass
them as an argument to any relevant functions; this is the "owner" field of
ntsync_wait_args and ntsync_mutex_args.
When the emulator detects that a thread dies, it should therefore call
NTSYNC_IOC_KILL_OWNER, which will mark mutexes owned by that thread (if any)
as abandoned.
* This implementation uses a misc device mostly because it seemed like the
simplest and least obtrusive option.
Besides simplicity of implementation, the only particularly interesting
advantage is the ability to create an arbitrary number of "contexts"
(corresponding to Windows virtual machines) which are self-contained and
shareable across multiple processes; this maps nicely to file descriptions
(i.e. struct file). This is not impossible with syscalls of course but would
require an extra argument.
On the other hand, there is no reason to forbid using ntsync by default from
user-mode processes, and (as far as I understand) to do so with a char device
requires explicit configuration by e.g. udev or init. Since this is done with
e.g. fuse, I assume this is the model to follow, but I may have chosen
something deprecated.
* ntsync is module-capable mostly because there was nothing preventing it, and
because it aided development. It is not a hard requirement, though.
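The locking rule from the first point above can be modelled in user space roughly as follows. This is only a sketch under stated assumptions, not the driver's code: the struct, the helper names, and the use of C11 atomic_flag as a spinlock are all mine, and the objects passed to the wait-for-all path are assumed to be distinct.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_WAIT_OBJECTS 64  /* NT's limit on objects in one vectored wait */

/* Hypothetical semaphore-like object: signaled while count > 0. */
struct sem_obj {
    atomic_flag lock;        /* per-object spinlock */
    unsigned int count;
};

/* Stand-in for the device-wide "wait_all_lock". */
static atomic_flag wait_all_lock = ATOMIC_FLAG_INIT;

static void spin_lock(atomic_flag *l)
{
    while (atomic_flag_test_and_set_explicit(l, memory_order_acquire))
        ;  /* busy-wait until the flag was previously clear */
}

static void spin_unlock(atomic_flag *l)
{
    atomic_flag_clear_explicit(l, memory_order_release);
}

/* Single-object path: only the object's own lock is taken. */
bool try_acquire_one(struct sem_obj *s)
{
    bool ok;

    spin_lock(&s->lock);
    ok = s->count > 0;
    if (ok)
        s->count--;
    spin_unlock(&s->lock);
    return ok;
}

/*
 * Wait-for-all path: take the device-wide lock before locking more than
 * one object, then consume every object or none of them.
 */
bool try_acquire_all(struct sem_obj **objs, size_t n)
{
    bool ok = true;
    size_t i;

    assert(n >= 1 && n <= MAX_WAIT_OBJECTS);

    spin_lock(&wait_all_lock);
    for (i = 0; i < n; i++)
        spin_lock(&objs[i]->lock);

    for (i = 0; i < n; i++)
        if (objs[i]->count == 0)
            ok = false;
    if (ok)
        for (i = 0; i < n; i++)
            objs[i]->count--;

    for (i = n; i-- > 0;)
        spin_unlock(&objs[i]->lock);
    spin_unlock(&wait_all_lock);
    return ok;
}
```

Because every caller that holds more than one object lock also holds the device-wide lock, two wait-for-all operations cannot deadlock against each other regardless of the order in which they list their objects, while the single-object path never touches the device-wide lock.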
== Previous versions ==
Changes in v2:
* Send the whole series instead of just the first few patches.
* Try to add more description to each patch, as a short documentation of the
functions to be implemented. A more complete documentation of all aspects of
the driver is provided in the contents of the last patch.
* Objects are now files rather than indices into a table. This prevents a
process from changing the state of an object which it should not have access
to. Suggested by Andy Lutomirski.
* Because the device no longer inherently has a table of all objects, marking a
thread's owned mutexes as abandoned is now done through an ioctl on the mutex.
* Change the names of a couple ioctls to be a bit less odd (PUT_SEM -> SEM_POST,
PUT_MUTEX -> MUTEX_UNLOCK), and to reflect that they are ioctls on an object
rather than on the device.
* Pass the timeout for wait functions as a bare u64 (in ns), per Arnd Bergmann,
with U64_MAX used to indicate no timeout. I originally indicated that I would
change the timeout to be relative, but on reflection ended up keeping it as
absolute, as this results in the least number of calls to get the current time
(i.e. one).
* Use compat_ptr_ioctl(), per Arnd Bergmann.
* Remove the fixed minor number and module alias, per Greg Kroah-Hartman.
* Allocate the fds array on stack in setup_wait(). This array takes up 260
bytes.
* Link to v1: https://lore.kernel.org/lkml/[email protected]/
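As an illustration of the timeout convention described above (a bare u64 in nanoseconds, absolute, with U64_MAX meaning no timeout), a caller might compute the deadline along these lines. This is a sketch only: the helper names and the NO_TIMEOUT macro are hypothetical, and the clock is assumed to be CLOCK_MONOTONIC.

```c
#include <assert.h>
#include <stdint.h>
#include <time.h>

#define NSEC_PER_SEC 1000000000ull
#define NO_TIMEOUT   UINT64_MAX   /* stand-in for the U64_MAX sentinel */

/* Flatten a (non-negative) timespec into nanoseconds. */
uint64_t timespec_to_ns(const struct timespec *ts)
{
    assert(ts->tv_sec >= 0 && ts->tv_nsec >= 0);
    return (uint64_t)ts->tv_sec * NSEC_PER_SEC + (uint64_t)ts->tv_nsec;
}

/*
 * Absolute deadline `rel_ns` nanoseconds after `now`. The result
 * saturates just below the sentinel so that a very long finite timeout
 * is never mistaken for "no timeout".
 */
uint64_t deadline_ns(const struct timespec *now, uint64_t rel_ns)
{
    uint64_t now_ns = timespec_to_ns(now);

    if (rel_ns >= NO_TIMEOUT - now_ns)
        return NO_TIMEOUT - 1;
    return now_ns + rel_ns;
}
```

A caller would fill `now` once, for example from clock_gettime(CLOCK_MONOTONIC, ...), and could then reuse the same deadline if the wait is interrupted and restarted, which is where the single call to get the current time comes from.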
Elizabeth Figura (29):
ntsync: Introduce the ntsync driver and character device.
ntsync: Introduce NTSYNC_IOC_CREATE_SEM.
ntsync: Introduce NTSYNC_IOC_SEM_POST.
ntsync: Introduce NTSYNC_IOC_WAIT_ANY.
ntsync: Introduce NTSYNC_IOC_WAIT_ALL.
ntsync: Introduce NTSYNC_IOC_CREATE_MUTEX.
ntsync: Introduce NTSYNC_IOC_MUTEX_UNLOCK.
ntsync: Introduce NTSYNC_IOC_MUTEX_KILL.
ntsync: Introduce NTSYNC_IOC_CREATE_EVENT.
ntsync: Introduce NTSYNC_IOC_EVENT_SET.
ntsync: Introduce NTSYNC_IOC_EVENT_RESET.
ntsync: Introduce NTSYNC_IOC_EVENT_PULSE.
ntsync: Introduce NTSYNC_IOC_SEM_READ.
ntsync: Introduce NTSYNC_IOC_MUTEX_READ.
ntsync: Introduce NTSYNC_IOC_EVENT_READ.
ntsync: Introduce alertable waits.
selftests: ntsync: Add some tests for semaphore state.
selftests: ntsync: Add some tests for mutex state.
selftests: ntsync: Add some tests for NTSYNC_IOC_WAIT_ANY.
selftests: ntsync: Add some tests for NTSYNC_IOC_WAIT_ALL.
selftests: ntsync: Add some tests for wakeup signaling with
WINESYNC_IOC_WAIT_ANY.
selftests: ntsync: Add some tests for wakeup signaling with
WINESYNC_IOC_WAIT_ALL.
selftests: ntsync: Add some tests for manual-reset event state.
selftests: ntsync: Add some tests for auto-reset event state.
selftests: ntsync: Add some tests for wakeup signaling with events.
selftests: ntsync: Add tests for alertable waits.
selftests: ntsync: Add some tests for wakeup signaling via alerts.
maintainers: Add an entry for ntsync.
docs: ntsync: Add documentation for the ntsync uAPI.
Documentation/userspace-api/index.rst | 1 +
.../userspace-api/ioctl/ioctl-number.rst | 2 +
Documentation/userspace-api/ntsync.rst | 390 +++++
MAINTAINERS | 9 +
drivers/misc/Kconfig | 9 +
drivers/misc/Makefile | 1 +
drivers/misc/ntsync.c | 1132 ++++++++++++++
include/uapi/linux/ntsync.h | 58 +
tools/testing/selftests/Makefile | 1 +
.../testing/selftests/drivers/ntsync/Makefile | 8 +
tools/testing/selftests/drivers/ntsync/config | 1 +
.../testing/selftests/drivers/ntsync/ntsync.c | 1300 +++++++++++++++++
12 files changed, 2912 insertions(+)
create mode 100644 Documentation/userspace-api/ntsync.rst
create mode 100644 drivers/misc/ntsync.c
create mode 100644 include/uapi/linux/ntsync.h
create mode 100644 tools/testing/selftests/drivers/ntsync/Makefile
create mode 100644 tools/testing/selftests/drivers/ntsync/config
create mode 100644 tools/testing/selftests/drivers/ntsync/ntsync.c
--
2.43.0
Folks,
As you may have noticed, we haven't been making any plans for WineConf
this year. Jeremy has been busy preparing his retirement, and I haven't
been pushing it either, mostly because I'm not convinced that we want to
continue the traditional WineConf model.
Even though we skipped a few years because of the pandemic, attendance
at last year's WineConf wasn't great. We also suggested meeting at
FOSDEM in February, like we did in previous years, but essentially no
one showed up.
So I'm wondering whether there is still enough interest for a
traditional WineConf, or whether we should try a different approach, to
maybe capture some of the recent excitement around gaming and downstream
uses of Wine in general.
I'd like to hear your thoughts. Should we do a Proton conference, or
join some kind of gaming-related event? Do people even want to travel
to conferences anymore? What kind of event would you be interested in,
particularly those of you who don't show up to the traditional WineConf?
--
Alexandre Julliard
julliard(a)winehq.org
Binary packages for various distributions will be available from:
https://www.winehq.org/download
Summary since last release
* Rebased to current wine 9.1 (508 patches are applied to wine vanilla)
Upstreamed (Either directly from staging or fixed with a similar patch).
* include: Add more D3D_FEATURE_LEVEL_ defines.
* oleaut32: Do not reimplement OleLoadPicture in OleLoadPicturePath.
* oleaut32: Factor out stream creation from OleLoadPicturePath.
* oleaut32: Implement OleLoadPictureFile. (v2)
* user32/tests: Add tests for clicking through layered window.
* user32/tests: Add tests for window region of layered windows.
Added:
* [50148] msi: Process cabinet files only when one is supplied.
* [51965] msxml3: IMXWrite::output to support DOMDocument.
* [52128] scrrun: Implement IFileSystem3 MoveFolder.
Updated:
* vkd3d-latest
* user32-Mouse_Message_Hwnd (Rebased and enabled)
* wined3d-bindless-texture (Rebased and enabled)
* ddraw-version-check (Rebased and enabled)
* vcomp_for_dynamic_init_i8
Where can you help
* Run Steam/Battle.net/GOG/UPlay/Epic
* Test your favorite game.
* Test your favorite applications.
* Improve staging patches and get them accepted upstream.
* Suggest patches to be included in staging.
As always, if you find a bug, please report it via
https://bugs.winehq.org
Best Regards
Alistair.
Hello,
I just updated the vkd3d-latest staging patchset with the latest vkd3d
commit (attached below).
Hopefully someone can push this to wine-staging.
Thanks,
Aida/Echo (DodoGTA)
This patch series introduces a new char misc driver, /dev/ntsync, which is used
to implement Windows NT synchronization primitives.
== Background ==
The Wine project emulates the Windows API in user space. One particular part of
that API, namely the NT synchronization primitives, has historically been
implemented via RPC to a dedicated "kernel" process. However, more recent
applications use these APIs more strenuously, and the overhead of RPC has become
a bottleneck.
The NT synchronization APIs are too complex to implement on top of existing
primitives without sacrificing correctness. Certain operations, such as
NtPulseEvent() or the "wait-for-all" mode of NtWaitForMultipleObjects(), require
direct control over the underlying wait queue, and implementing a wait queue
sufficiently robust for Wine in user space is not possible. This proposed
driver, therefore, implements the problematic interfaces directly in the Linux
kernel.
This driver was presented at Linux Plumbers Conference 2023. For those further
interested in the history of synchronization in Wine and past attempts to solve
this problem in user space, a recording of the presentation can be viewed here:
https://www.youtube.com/watch?v=NjU4nyWyhU8
== Performance ==
The gain in performance varies wildly depending on the application in question
and the user's hardware. For some games NT synchronization is not a bottleneck
and no change can be observed, but for others frame rate improvements of 50 to
150 percent are not atypical. The following table lists frame rate measurements
from a variety of games on a variety of hardware, taken by users Dmitry
Skvortsov, FuzzyQuills, OnMars, and myself:
Game                            Upstream        ntsync          improvement
===========================================================================
Anger Foot                      69              99              43%
Call of Juarez                  99.8            224.1           125%
Dirt 3                          110.6           860.7           678%
Forza Horizon 5                 108             160             48%
Lara Croft: Temple of Osiris    141             326             131%
Metro 2033                      164.4           199.2           21%
Resident Evil 2                 26              77              196%
The Crew                        26              51              96%
Tiny Tina's Wonderlands         130             360             177%
Total War Saga: Troy            109             146             34%
===========================================================================
== Patches ==
This is the first part of a 32-patch series. The series comprises 17 patches
which contain the actual implementation, 13 which provide self-tests, 1 to
update the MAINTAINERS file, and 1 to add API documentation.
The semantics of the patches are broadly intended to match those of the
corresponding Windows functions. Since I do not expect familiarity with Windows
syscalls, however, and especially not with some of the more subtle or
unspecified behaviour that they provide, the documentation patch included in the
series also describes the intended behaviour in detail, and can be used as a
specification for the rest of the series.
The entire series can be retrieved or browsed here:
https://repo.or.cz/linux/zf.git/shortlog/refs/heads/ntsync4
The patches making use of this driver in Wine can be retrieved or browsed here:
https://repo.or.cz/wine/zf.git/shortlog/refs/heads/ntsync4
== Implementation ==
Some aspects of the implementation may deserve particular comment:
* In the interest of performance, each object is governed only by a single
spinlock. However, NTSYNC_IOC_WAIT_ALL requires that the state of multiple
objects be changed as a single atomic operation. In order to achieve this, we
first take a device-wide lock ("wait_all_lock") any time we are going to lock
more than one object at a time.
The maximum number of objects that can be used in a vectored wait, and
therefore the maximum that can be locked simultaneously, is 64. This number is
NT's own limit.
The acquisition of multiple spinlocks will degrade performance. This is a
conscious choice, however. Wait-for-all is known to be a very rare operation
in practice, especially with counts that approach the maximum, and it is the
intent of the ntsync driver to optimize the wait-for-any pattern at the
expense of the wait-for-all pattern as much as possible.
* NT mutexes are tied to their threads on an OS level, and the kernel includes
builtin support for "robust" mutexes. In order to keep the ntsync driver
self-contained and avoid touching more code than necessary, it does not hook
into task exit nor use pids.
Instead, the user space emulator is expected to manage thread IDs and pass
them as an argument to any relevant functions; this is the "owner" field of
ntsync_wait_args and ntsync_mutex_args.
When the emulator detects that a thread dies, it should therefore call
NTSYNC_IOC_KILL_OWNER, which will mark mutexes owned by that thread (if any)
as abandoned.
* This implementation uses a misc device mostly because it seemed like the
simplest and least obtrusive option.
Besides simplicity of implementation, the only particularly interesting
advantage is the ability to create an arbitrary number of "contexts"
(corresponding to Windows virtual machines) which are self-contained and
shareable across multiple processes; this maps nicely to file descriptions
(i.e. struct file). This is not impossible with syscalls of course but would
require an extra argument.
On the other hand, there is no reason to forbid using ntsync by default from
user-mode processes, and (as far as I understand) to do so with a char device
requires explicit configuration by e.g. udev or init. Since this is done with
e.g. fuse, I assume this is the model to follow, but I may have chosen
something deprecated.
* ntsync is module-capable mostly because there was nothing preventing it, and
because it aided development. I am not aware of any reason why being a module
is required, though.
* The misc minor number has not been reserved with LANANA. I am not sure at what
point in the process this makes the most sense, but since this is still only
an RFC I've abstained from doing so yet.
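The ownership model from the second point above (the emulator supplies thread IDs, and a dead thread's mutexes become abandoned) can be sketched as a plain user-space state machine. This is a model of the semantics as I read them, not the driver's data structures: the struct, its fields, and the function names are invented for illustration, and thread ID 0 is assumed to be reserved for "unowned".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of an NT-style mutex. */
struct mutex_model {
    uint32_t owner;     /* emulator-assigned thread ID; 0 means unowned */
    uint32_t count;     /* recursion count held by the owner */
    bool abandoned;     /* previous owner died while holding the mutex */
};

/*
 * Try to acquire on behalf of thread `tid`. NT mutexes are recursive,
 * so the current owner may acquire again. On success, *was_abandoned
 * tells the new owner whether the previous owner died holding it.
 */
bool mutex_try_acquire(struct mutex_model *m, uint32_t tid,
                       bool *was_abandoned)
{
    assert(tid != 0);
    if (m->owner != 0 && m->owner != tid)
        return false;
    m->owner = tid;
    m->count++;
    *was_abandoned = m->abandoned;
    m->abandoned = false;
    return true;
}

/* What the emulator would request when it sees thread `tid` die. */
void mutex_kill_owner(struct mutex_model *m, uint32_t tid)
{
    assert(tid != 0);
    if (m->owner != tid)
        return;
    m->owner = 0;
    m->count = 0;
    m->abandoned = true;
}
```

The point of the design is visible here: nothing in the model needs to know about real tasks or pids, only about the IDs the emulator chooses to pass in.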
Elizabeth Figura (9):
ntsync: Introduce the ntsync driver and character device.
ntsync: Reserve a minor device number and ioctl range.
ntsync: Introduce NTSYNC_IOC_CREATE_SEM and NTSYNC_IOC_DELETE.
ntsync: Introduce NTSYNC_IOC_PUT_SEM.
ntsync: Introduce NTSYNC_IOC_WAIT_ANY.
ntsync: Introduce NTSYNC_IOC_WAIT_ALL.
ntsync: Introduce NTSYNC_IOC_CREATE_MUTEX.
ntsync: Introduce NTSYNC_IOC_PUT_MUTEX.
ntsync: Introduce NTSYNC_IOC_KILL_OWNER.
Documentation/admin-guide/devices.txt | 3 +-
.../userspace-api/ioctl/ioctl-number.rst | 2 +
drivers/misc/Kconfig | 9 +
drivers/misc/Makefile | 1 +
drivers/misc/ntsync.c | 916 ++++++++++++++++++
include/linux/miscdevice.h | 1 +
include/uapi/linux/ntsync.h | 53 +
7 files changed, 984 insertions(+), 1 deletion(-)
create mode 100644 drivers/misc/ntsync.c
create mode 100644 include/uapi/linux/ntsync.h
base-commit: 6613476e225e090cc9aad49be7fa504e290dd33d
--
2.43.0
On Thursday, 25 January 2024 11:02:26 CST Arnd Bergmann wrote:
> On Wed, Jan 24, 2024, at 23:28, Elizabeth Figura wrote:
> > On Wednesday, 24 January 2024 13:52:52 CST Arnd Bergmann wrote:
> >> On Wed, Jan 24, 2024, at 19:02, Elizabeth Figura wrote:
> >> > That'd be nicer in general. I think there was some documentation that
> >> > advised using timespec64 for new ioctl interfaces but it may have been
> >> > outdated or misread.
> >>
> >> It's probably something I wrote. It depends a bit on
> >> whether you have an absolute or relative timeout. If
> >> the timeout is relative to the current time as I understand
> >> it is here, a 64-bit number seems more logical to me.
> >>
> >> For absolute times, I would usually use a __kernel_timespec,
> >> especially if it's CLOCK_REALTIME. In this case you would
> >> also need to specify the time domain.
> >
> > Currently the interface does pass it as an absolute time, with the
> > domain implicitly being MONOTONIC. This particular choice comes from
> > process/botching-up-ioctls.rst, which is admittedly focused around GPU
> > ioctls, but the rationale of having easily restartable ioctls applies
> > here too.
>
> Ok, I was thinking of Documentation/driver-api/ioctl.rst, which
> has similar recommendations.
>
> > (E.g. Wine does play games with signals, so we do want to be able to
> > interrupt arbitrary waits with EINTR. The "usual" fast path for ntsync
> > waits won't hit that, but we want to have it work.)
> >
> > On the other hand, if we can pass the timeout as relative, and write it
> > back on exit like ppoll() does [assuming that's not proscribed], that
> > would presumably be slightly better for performance.
>
> I've seen arguments go either way between absolute and relative
> times, just pick whatever works best for you here.
>
> > When writing the patch I just picked the recommended option, and didn't
> > bother doing any micro-optimizations afterward.
> >
> > What's the rationale for using timespec for absolute or written-back
> > timeouts, instead of dealing in ns directly? I'm afraid it's not
> > obvious to me.
>
> There is no hard rule either way, I mainly didn't like the
> indirect pointer to the timespec that you have here. For
> traditional unix-style interfaces, a timespec with CLOCK_REALTIME
> times usually makes sense since that is what user space is
> already using elsewhere, but you probably don't need to
> worry about that. In theory, the single u64 CLOCK_REALTIME
> nanoseconds have the problem of no longer working after year
> 2262, but with CLOCK_MONOTONIC that is not a concern anyway.
>
> Between embedding a __u64 nanosecond value and embedding
> a __kernel_timespec, I would pick whichever avoids converting
> a __u64 back into a timespec, as that is an expensive
> operation at least on 32-bit code.
Makes sense. I'll probably switch to using a relative and written-back u64
then, thanks!
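For anyone following the arithmetic in the quoted discussion, here is a small sketch. The helper names are mine. The 2262 figure matches a signed 64-bit nanosecond count since 1970 (the kernel's ktime_t is signed), using an average Gregorian year of 31556952 seconds; that reading is my assumption about what was meant.

```c
#include <stdint.h>
#include <time.h>

#define NSEC_PER_SEC 1000000000ull

/*
 * Split a nanosecond count into seconds and nanoseconds. The 64-bit
 * division and remainder are the costly part on 32-bit CPUs, which is
 * why converting a u64 back into a timespec is best avoided there.
 */
struct timespec ns_to_timespec(uint64_t ns)
{
    struct timespec ts;

    ts.tv_sec = (time_t)(ns / NSEC_PER_SEC);
    ts.tv_nsec = (long)(ns % NSEC_PER_SEC);
    return ts;
}

/* Calendar year in which a signed 64-bit ns count since 1970 runs out. */
int s64_ns_overflow_year(void)
{
    uint64_t max_sec = (uint64_t)INT64_MAX / NSEC_PER_SEC;

    return 1970 + (int)(max_sec / 31556952ull);
}
```

With CLOCK_MONOTONIC the count starts near zero at boot rather than in 1970, so the roughly 292 years of range is not a practical limit.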
On Thursday, 25 January 2024 10:47:49 CST Arnd Bergmann wrote:
> On Thu, Jan 25, 2024, at 04:42, Elizabeth Figura wrote:
> > On Wednesday, 24 January 2024 16:56:23 CST Elizabeth Figura wrote:
> >> On Wednesday, 24 January 2024 15:26:15 CST Andy Lutomirski wrote:
> >> > On Tue, Jan 23, 2024 at 4:59 PM Elizabeth Figura
> >> > <zfigura(a)codeweavers.com> wrote:
> >> [There is also a potential problem where some broken applications
> >> create a million (literally) sync objects. Making these into files runs
> >> into NOFILE. We did specifically push distributions and systemd to
> >> increase those limits because an older solution *did* use eventfds and
> >> *did* run into those limits. Since that push was successful I don't
> >> know if this is *actually* a concern anymore, but avoiding files is
> >> probably not a bad thing either.]
> >
> > Of course, looking at it from a kernel maintainer's perspective, it
> > wouldn't be insane to do this anyway. If we at some point do start to
> > care about cross-process isolation in this way, or if another NT
> > emulator wants to use this interface and does care about cross-process
> > isolation, it'll be necessary. At least it'd make sense to make them
> > separate files even if we don't implement granular permission handling
> > just yet.
>
> I can think of a few other possible benefits of going with
> per-mutex file descriptors:
>
> - being able to use poll() for waiting on them individually in
> combination with other file descriptor based events (socket,
> signalfd, pidfd, ...)
I can say for sure this isn't going to be useful for Wine, at least not with
the current design.
It also doesn't really mesh well with the NT design in the first place.
NTSYNC_IOC_WAIT_ANY differs from poll() in two major ways: it consumes state
of most object types, and (as coded here) it needs the owner thread ID to be
specifically passed for mutexes.
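To make the first difference concrete, here is a toy model using auto-reset events. It is an illustration only: the struct and function names are hypothetical, and nothing here blocks or talks to the driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical auto-reset event: a successful wait clears it. */
struct auto_event {
    bool signaled;
};

/* poll()-like check: reports readiness and leaves the state untouched. */
bool event_is_ready(const struct auto_event *e)
{
    return e->signaled;
}

/*
 * NT-like wait-any without the blocking part: returns the index of the
 * first signaled event and consumes its state, or -1 if none is
 * signaled.
 */
int wait_any_consume(struct auto_event *ev, size_t n)
{
    size_t i;

    assert(ev != NULL || n == 0);
    for (i = 0; i < n; i++) {
        if (ev[i].signaled) {
            ev[i].signaled = false;
            return (int)i;
        }
    }
    return -1;
}
```

Checking readiness twice gives the same answer both times, whereas a second wait on the same event no longer finds it signaled. That consumed state is what a plain poll() on a per-object file descriptor would not express.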
Anyway, as Alexandre has informed me I clearly have misunderstood our
requirements, so I'm going to try to put together something using files
instead.