WWN Issue 41

This is the 41st issue of Wine's Kernel Cousin publication. Its main goal is to report widely on what's going on around Wine (the Un*x Windows emulator).

Wine 20000430 has been released. Main changes are:

This week, 161 posts consumed 444 K. There were 29 different contributors. 20 (68%) posted more than once. 11 (37%) posted last week too.

The top 5 posters of the week were:

28 posts in 150K by Patrik Stridvall
21 posts in 13K by Dimitrie O. Paun
13 posts in 28K by Alexandre Julliard
13 posts in 28K by Uwe Bonnes
13 posts in 33K by gerard patel

Improving wrc

Bertho Stultiens, while preparing a new version of wrc (the Wine resource compiler), had some still unanswered questions: According to what I found on the web are resources always little-endian because MS does not support/wrote OSes for big-endian processors. There are a couple of questions that go with this:

Is it true that MS only has little-endian version?
Should I support big-endian at all in wrc? Currently, wrc generates the native endianness of the platform, but it does _not_ convert binary resources (such as bitmaps). It is actually extremely difficult to mix endianness in resources because everything has to be examined and _cannot_ be guaranteed to be correct (such as RCDATA).
Should wine only use little-endian in the resources? In my opinion, yes. Let the resources be the same all the time and let the resource-loader take care of conversion.
There is a comment in a header about byte-swapping and wrc. I really would prefer to have byte-swapping in wine rather than wrc. Mainly because wine already requires to do the analysis of resource-contents, whereas wrc only packs data (without contextual/semantical knowledge).

Bertho asked for feedback, and also for experiences of running Wine natively on a big-endian CPU.

Both Alexandre Julliard and Ulrich Weigand answered that all current NT versions run only on little-endian systems, so this question doesn't seem to have been addressed (it still remains open for Windows CE). Alexandre added, with some sarcasm: The Windows headers contain a few #ifdef _MAC that attempt to add big-endian support (apparently using a generic #ifdef BIG_ENDIAN was a concept a bit too abstract for Microsoft)

Ulrich went a bit further: I agree, resources should always be treated little-endian.

At the most, we might think about making a distinction between the resource data itself and the 'meta data' surrounding it (resource directory, PE header links ...); it might be easier to have the latter in native byte ordering, especially in the case of the dummy PE headers created for Wine modules (these structures are completely internal between wrc and the Wine loader, so we can use whatever is easiest here, of course).

Every 'external' format, be it .RES file or cursors/icons/etc. imported by or included in RC files, should IMO always be little-endian. The same applies to the raw resource data exchanged between app and Wine, e.g. when using a Create...Indirect routine.

Ulrich also reported on his successful attempt to run 'hello3' on Solaris (32-bit, big-endian), although he never sent the patches because he never finished cleaning them up: I decided to have resource contents in little-endian, and meta data (resource directory) in native big-endian format, as this seemed to be the solution requiring the fewest Wine changes. The changes described in the following achieved this result.

Major changes include reading and writing the meta data in wrc (swapping bytes when needed), as well as modifying the way Wine reads resources (same type of swapping). Ulrich also pointed out some less obvious modifications to be made: another problem is in the handling of Unicode strings: wide characters are also endianness-sensitive, of course, so a simple lstrcpyWtoA doesn't do the right thing... and pe_resource.c routines don't work, as they rely on various bit-field structures to break out the 'resource name is string' and 'resource data is directory' bits. This doesn't work, as on Sparc bit-fields are allocated starting from the MSB down, not LSB up as on Intel :-/
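The two pitfalls Ulrich mentions can be illustrated with a small sketch. It is not Ulrich's patch: the helper names (read_le32, read_le16, name_is_string, name_offset, copy_le16_string) are made up for this example, and it assumes the field being read is stored little-endian, as in a PE file on disk. In Ulrich's scheme the resource directory is kept in native order, so only the mask-based test would apply there. The flag value is the one the Windows headers call IMAGE_RESOURCE_NAME_IS_STRING.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* High bit of a directory entry's name field: "resource name is a string".
 * Same value as IMAGE_RESOURCE_NAME_IS_STRING in the Windows headers. */
#define NAME_IS_STRING 0x80000000u

/* Read a 32-bit little-endian value one byte at a time.
 * The result is the same on little-endian and big-endian hosts. */
static uint32_t read_le32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Read one little-endian UTF-16 code unit. */
static uint16_t read_le16(const unsigned char *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Test the flag with a mask instead of a bit-field: the position of a
 * bit-field inside its word depends on the compiler and platform,
 * the position of a masked bit does not. */
static int name_is_string(uint32_t name_field)
{
    return (name_field & NAME_IS_STRING) != 0;
}

/* The remaining 31 bits (offset of the name string). */
static uint32_t name_offset(uint32_t name_field)
{
    return name_field & ~NAME_IS_STRING;
}

/* Copy n little-endian UTF-16 code units into a host-order array. */
static void copy_le16_string(uint16_t *dst, const unsigned char *src, size_t n)
{
    assert(dst != NULL && src != NULL);
    for (size_t i = 0; i < n; i++)
        dst[i] = read_le16(src + 2 * i);
}
```

With the bytes 10 00 00 80, read_le32 yields 0x80000010 on any host, so name_is_string reports a string name at offset 0x10.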

Finally, Bertho announced he shall be sending a new wrc version later this week.

Wine's license
After the previous episodes (see "shall we change?" and "vote for a change!"), Alexandre Julliard changed the Wine license to the X11 one. Here are the terms of the new license:

Copyright (c) 1993-2000 the Wine project authors (see the file AUTHORS for a complete list)

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

Ansi and Unicode


Dimitrie Paun was rather unhappy with Wine's current string support. As you may already know, most Win32 APIs come in two flavors: ANSI and Unicode. APIs suffixed with 'A' are ANSI, and the ones suffixed with 'W' are Unicode. Being ANSI (resp. Unicode) determines how the function must handle any string input or output parameter. So the same function, say CreateWindow, comes in two flavors: CreateWindowA and CreateWindowW.

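Microsoft's headers use the same convention: a #define UNICODE triggers the Unicode mode at compile time, so that the generic name (CreateWindow) maps to the W variant, and to the A variant otherwise.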
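The compile-time switch can be sketched as follows. This is only an illustration of the naming convention, not code from the Windows or Wine headers: demo_lenA, demo_lenW, demo_len and DEMO_TEXT are hypothetical names, and wchar_t stands in for the 16-bit Windows WCHAR (on most Unix systems wchar_t is 32 bits wide).

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

/* ANSI flavor: one char per character. */
size_t demo_lenA(const char *s)
{
    assert(s != NULL);
    size_t n = 0;
    while (s[n] != '\0')
        n++;
    return n;
}

/* Unicode flavor: wide characters. */
size_t demo_lenW(const wchar_t *s)
{
    assert(s != NULL);
    size_t n = 0;
    while (s[n] != L'\0')
        n++;
    return n;
}

/* The generic name resolves to one flavor or the other at compile time,
 * and the literal macro produces a string of the matching type. */
#ifdef UNICODE
#define demo_len     demo_lenW
#define DEMO_TEXT(s) L##s
#else
#define demo_len     demo_lenA
#define DEMO_TEXT(s) s
#endif
```

Application code then calls demo_len(DEMO_TEXT("Wine")) and gets the A or W entry point depending on whether UNICODE was defined when it was compiled.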

ANSI means a one-byte-per-character coding, whereas Unicode implies several bytes (at least two, but some values are escapes to longer sequences). Even if Unicode consumes more memory, it allows storing strings in many more languages: most languages not written in the Latin alphabet (Japanese, at least in Kanji, Chinese, most Cyrillic alphabets such as Russian...), but also some other European languages with specific diacritics.

Ove Kåven gave an overview of the different encodings:

Note: All the ISO Latin 1,2.... follow this scheme

In the rest of this article, W will refer to UTF-16 strings or functions, and U to UTF-8 strings or functions.

Currently, as Dimitrie points out, most of the Wine code is poorly written with regard to Unicode: most of the W functions convert the string into an ANSI one, and then call the A function, implying a loss of information, and some potential bugs.
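The loss of information is easy to see in a small sketch. The function below is a deliberately naive narrowing routine written for this example (demo_narrow_lossy is a made-up name, and this is not Wine's actual conversion code): it assumes the target is a Latin-1 style single-byte encoding and replaces every code unit that does not fit with '?'.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Narrow a NUL-terminated UTF-16 string into a single-byte buffer.
 * Code units below 0x100 are kept, everything else becomes '?'.
 * Returns the number of characters that could not be represented. */
size_t demo_narrow_lossy(char *dst, const uint16_t *src, size_t dst_size)
{
    assert(dst != NULL && src != NULL && dst_size > 0);
    size_t lost = 0;
    size_t i = 0;
    for (; src[i] != 0 && i + 1 < dst_size; i++) {
        if (src[i] < 0x100) {
            dst[i] = (char)src[i];
        } else {
            dst[i] = '?';   /* the original character is gone for good */
            lost++;
        }
    }
    dst[i] = '\0';
    return lost;
}
```

A W function that narrows its arguments this way and then calls the A function can never hand the original characters to the code that does the real work.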

Dimitrie proposed to change Wine's coding style by providing a single function (let's say suffixed by 'X') which would be the workhorse for both the A and W functions.
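One possible reading of this proposal, as a minimal sketch: the X function receives the string together with a tag naming its encoding, and the A and W entry points become one-line wrappers. All names here (demo_encoding, demo_sizeX, demo_sizeA, demo_sizeW) are hypothetical, and the thread does not say how the encoding would actually be passed around.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Tag carried along with every string handed to an X function. */
typedef enum { DEMO_ENC_A, DEMO_ENC_W } demo_encoding;

/* Workhorse: length in code units, whatever the encoding. */
size_t demo_sizeX(const void *str, demo_encoding enc)
{
    assert(str != NULL);
    size_t n = 0;
    if (enc == DEMO_ENC_A) {
        const char *s = str;
        while (s[n] != '\0')
            n++;
    } else {
        const uint16_t *s = str;
        while (s[n] != 0)
            n++;
    }
    return n;
}

/* The two public entry points only say which encoding they received. */
size_t demo_sizeA(const char *s)     { return demo_sizeX(s, DEMO_ENC_A); }
size_t demo_sizeW(const uint16_t *s) { return demo_sizeX(s, DEMO_ENC_W); }
```

No conversion happens on either path, but every X function has to branch on the tag wherever the encoding matters.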

Dmitry Timoshkov didn't like this proposal, and suggested instead: Wine should have only one functional implementation indeed. I think, it should be implemented like in NT: all actual work does Unicode version, ANSI version simply converts ANSI to Unicode and then calls Unicode workhorse. But this transition will consume a lot of time and efforts.
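The NT-style layout Dmitry describes could look like the sketch below. The function names and the fixed buffer size are invented for the example, and the widening step assumes 7-bit ASCII or Latin-1 input so that each byte maps to the code unit of the same value. Real code would have to go through the active code page.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_BUF_LEN 128   /* arbitrary limit for this sketch */

/* Unicode workhorse: all the real work happens on UTF-16 code units. */
size_t demo_count_spacesW(const uint16_t *s)
{
    assert(s != NULL);
    size_t count = 0;
    for (; *s != 0; s++)
        if (*s == 0x0020)
            count++;
    return count;
}

/* ANSI entry point: widen the argument, then delegate to the W version. */
size_t demo_count_spacesA(const char *s)
{
    assert(s != NULL);
    uint16_t wide[DEMO_BUF_LEN];
    size_t i = 0;
    /* Assumption: ASCII/Latin-1 input; longer input is truncated. */
    for (; s[i] != '\0' && i + 1 < DEMO_BUF_LEN; i++)
        wide[i] = (unsigned char)s[i];
    wide[i] = 0;
    return demo_count_spacesW(wide);
}
```

The logic exists only once, in the W function, and the A path pays for one conversion per call.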

Dimitrie Paun went further with: Somehow, I don't think working with W is the right thing to do in Unix. We have the following situation: we receive strings as arguments; their encoding is not explicit with every string, but rather is implicit by the entry point. Now, we can do two things:

Anyway, I like 2 better than 1. Not committing to an encoding early in the game is good -- sometimes we need UTF8 (file systems, X), in other cases we need UTF16 (pure Win stuff). Moreover, the thing is scalable -- if another encoding comes along, we could easily support it. And, on top of it all, it should be more efficient.

After much discussion and contributions from many people, the following table was built:

	Description	Pros	Cons
1	W->A conversion, work internally with A	best option for debugging; fast for A (common case today); use of std. Win API	we do NOT support Unicode, we just pretend we do (1); a lot of work, a lot of clutter, close to no gain; inefficient for the W case
2	A->W conversion, work internally with W	full Unicode support; fast for W; use of std. Win API; part of Wine is already written this way	a lot of clutter; very inefficient in the A case (A->W->U usually) (2)
3	A, W call onto an X function which carries the encoding around	full Unicode support; as fast as 1 for A, and as 2 for W (for common code paths like display); support for new encodings is trivial; not much worse than 2 for debugging; maybe a bit less clutter than in 1 or 2 (debatable); easy transition from what we have to this	use of non std. Win API: this doesn't work across DLLs (would require new APIs); it is not used in Wine currently; test coverage of all possible paths can be huge
4	Write all functions independent of the encoding and recompile to get all encodings (same .c file would generate .Ansi.o, .w.o object files)	fastest option for A, W; easy to support future encodings; use of std. API; less clutter (in theory)	huge bloat; it is not used in Wine currently; (maybe) difficult transition path

Notes:

(2) Converting A->W->U for file I/O may seem wasteful, but it isn't really, since we need to support code pages; you can only do A->U directly for 7-bit ASCII, which is not enough. And supporting code pages without the Unicode step means N^2 conversion tables instead of 2*N (where N is the number of code pages).

Since Alexandre's preferred approach is #2, it was the one chosen. However, lots of arguments, mainly between Dimitrie Paun and Patrik Stridvall, flooded wine-devel to such an extent that some readers thought they were reading the linux-kernel mailing list.

Patrik also proposed to automate some of the A->W or W->A conversions, so that stubs for some functions could be generated from the .spec file. This didn't work out, because there are different options to take care of:

The semantics seemed too complex to really provide a robust framework. In conclusion, Wine's internal string encoding shall (slowly) shift from ANSI to Unicode (UTF-16).

World Wine News