Wine and locales

Shachar Shemesh wine-devel at shemesh.biz
Wed Aug 4 03:20:57 CDT 2004


David Lee Lambert wrote:

>On Wed, Jul 28, 2004 at 10:13:15PM -0700, Alexandre Julliard wrote:
>  
>
>>"Dmitry Timoshkov" <dmitry at baikal.ru> writes:
>>
>>    
>>
>>>I like the idea of moving that setting to the config file. We can't
>>>use existing unix locale settings except LC_ALL and LANG because
>>>every user's system might have (and does have) very different locale
>>>settings, we can't assume that everyone out there configures locale
>>>in the same way.
>>>      
>>>
>>I don't see how the settings would be different, surely LC_CTYPE is
>>always going to control the ASCII->Unicode mapping on Unix, so why
>>shouldn't it do that on Wine?  If the issue is that users change their
>>setup without understanding the results, then surely adding even more
>>config parameters that they need to get right is not going to improve
>>the situation.
>>    
>>
>
>Actually,  there are a number of different locale-related things that Wine needs to 
>keep track of:
>
>1.  ANSI->Unicode translation for programs that use the ANSI calls,  as has been 
>discussed in this thread.
>  
>
Ok.

>2.  Unicode->codepage translation on standard output, and codepage->Unicode 
>translation on standard input.  Note that I could set LANG to 'en_US.UTF-16' on my 
>Linux system, and programs SHOULD accept this.  Most don't, however.
>  
>
Why should we set it differently from (1)? In any case, I am not aware of 
UTF-16 being a compilable locale setting. Thus, it is not required that 
anyone support it.

>3.  Unicode->codepage->Unicode translation on Linux kernels before 2.4, whereafter 
>filenames are SUPPOSED TO be in UTF-8,  and kernel modules do translation for 
>filesystems where filenames are stored in some other charset. (OPTIONAL, as 
>filenames are not a big deal and the newer kernel fixes it--however,  there has to 
>be a conversion from the short-per-character format to UTF-8).
>  
>
Name one such filesystem, please. EXT and reiser never cared, as far as 
I know. VFAT has to translate names stored in UTF-16. Are you saying that 
kernels before 2.4 didn't have the "iocharset" option?

>4.  Selection of appropriate language for strings in programs that use such 
>selection,
>
Discussed in this thread under the "GetDefaultUILanguage" API.

> as well as time, numeric, and string formats.
>
Also discussed in this thread.

>  This is all through 
>GetLocaleInfo(), whose first argument is an LCID returned by either 
>GetUserDefaultLocale or GetSystemDefaultLocale.
>  
>
You can also pass "LOCALE_SYSTEM_DEFAULT" instead, but that doesn't 
matter. In any case, there are "user overrides" here, which we may, one 
day, want to implement. Everything is laid out in the table that started 
this thread.
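
For readers who have not looked at how an LCID is put together, here is a 
minimal sketch of the arithmetic behind the Windows SDK macros MAKELANGID 
and MAKELCID. It is written in Python purely for illustration and is not 
Wine code; the function names and the EXAMPLES table are made up for this 
sketch, while the numeric language and sublanguage constants are the 
documented Windows values.

```python
# Illustrative re-implementation of the MAKELANGID / MAKELCID arithmetic.
# The helper names are hypothetical; only the bit layout follows the SDK.
SORT_DEFAULT = 0x0


def make_langid(primary: int, sub: int) -> int:
    """Low 10 bits hold the primary language, the next 6 the sublanguage."""
    return (sub << 10) | primary


def make_lcid(langid: int, sort_id: int = SORT_DEFAULT) -> int:
    """Low 16 bits hold the LANGID, the next 4 the sort order."""
    return (sort_id << 16) | langid


def primary_lang(langid: int) -> int:
    """Extract the primary language from a LANGID (like PRIMARYLANGID)."""
    return langid & 0x3FF


# (primary language, sublanguage) pairs for three of the locales mentioned.
EXAMPLES = {
    "en_US": (0x09, 0x01),  # LANG_ENGLISH, SUBLANG_ENGLISH_US
    "es_MX": (0x0A, 0x02),  # LANG_SPANISH, SUBLANG_SPANISH_MEXICAN
    "ar_SA": (0x01, 0x01),  # LANG_ARABIC,  SUBLANG_ARABIC_SAUDI_ARABIA
}

if __name__ == "__main__":
    for name, (primary, sub) in EXAMPLES.items():
        langid = make_langid(primary, sub)
        print(f"{name}: LANGID=0x{langid:04x} LCID=0x{make_lcid(langid):08x}")
```

With the default sort order the LCID and the LANGID have the same numeric 
value, which is why the two are easy to confuse.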

>5.  The MultiByteToWideChar() and WideCharToMultiByte() functions,  which allow a 
>program to do its own conversion to and from Unicode with a specified codepage.
>  
>
What do we need to do with these? They get an explicit codepage to 
convert to/from. Funny though it may sound, these functions are not 
affected by locale.
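
To illustrate what "explicit codepage" means here, a rough analogy in 
Python follows. It does not call the Windows API; multibyte_to_wide and 
wide_to_multibyte are hypothetical stand-ins that take the codepage as an 
argument, the way the real functions take it as their first parameter. The 
same byte decodes to a different character depending only on that 
argument, never on LANG or LC_CTYPE.

```python
# Analogy only: Python codecs stand in for Windows codepages.
def multibyte_to_wide(data: bytes, codepage: str) -> str:
    """Decode bytes using the codepage the caller names explicitly."""
    return data.decode(codepage)


def wide_to_multibyte(text: str, codepage: str) -> bytes:
    """Encode text using the codepage the caller names explicitly."""
    return text.encode(codepage)


if __name__ == "__main__":
    raw = b"\xe9"
    # One byte, three codepages, three different characters.
    for codepage in ("cp1252", "cp1251", "cp1255"):
        char = multibyte_to_wide(raw, codepage)
        print(f"{codepage}: U+{ord(char):04X}")
        # The round trip restores the original byte.
        assert wide_to_multibyte(char, codepage) == raw
```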

>I think (1) should be specified on a per-program basis in the config file, with a 
>system default there, and, as final default, raw translation for ANSI-to-Unicode and 
>something reasonable the other way.  I said in another message that codepages are 
>deprecated;  I meant that the ANSI calls (as opposed to (5)) are deprecated for 
>internationalized applications.
>  
>
I don't agree. Mixing default codepages across simultaneously running 
programs is not possible on Windows, and sounds horribly difficult to 
implement. Clipboard handling and files shared between programs are two 
examples of things that are likely to go horribly wrong if we tried.

Having one setting applicable to all running processes sounds good 
enough. I don't object to a config setting overriding what LC_CTYPE 
says, but I don't see a use for it either.
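
A minimal sketch of what "one setting, optionally overridden" could look 
like, again in Python for illustration only. The lookup order LC_ALL, then 
LC_CTYPE, then LANG is the standard POSIX precedence for the LC_CTYPE 
category. The config_override parameter is an assumption made for this 
sketch; no such Wine config key is implied.

```python
from typing import Mapping, Optional


def effective_ctype(env: Mapping[str, str],
                    config_override: Optional[str] = None) -> str:
    """Return the locale name that governs character classification.

    config_override is a hypothetical process-wide config value; when it
    is absent the usual POSIX precedence applies.
    """
    if config_override:
        return config_override
    for name in ("LC_ALL", "LC_CTYPE", "LANG"):
        value = env.get(name)
        if value:  # unset and empty variables are both skipped
            return value
    return "C"  # nothing set: fall back to the portable default


if __name__ == "__main__":
    import os
    print(effective_ctype(os.environ))
```

The point of the sketch is that the result is computed once for the whole 
session, not per program.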

>The '.codepage' suffix of LANG and LC_CTYPE should both be searched for the answer 
>to (2).  As for graphical output to X,  it doesn't seem like that should be 
>restricted by setting LANG.
>  
>
Again - why should it be different from (1)?
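
For reference, the suffix in question is the codeset part of the usual 
language[_territory][.codeset][@modifier] locale name. A small Python 
sketch of pulling it apart is below; parse_locale_name and LocaleName are 
names invented for this example, and the sketch does no validation of the 
individual fields.

```python
from typing import NamedTuple, Optional


class LocaleName(NamedTuple):
    language: str
    territory: Optional[str]
    codeset: Optional[str]
    modifier: Optional[str]


def parse_locale_name(name: str) -> LocaleName:
    """Split language[_territory][.codeset][@modifier] into its parts."""
    rest, _, modifier = name.partition("@")
    rest, _, codeset = rest.partition(".")
    language, _, territory = rest.partition("_")
    return LocaleName(language, territory or None,
                      codeset or None, modifier or None)


if __name__ == "__main__":
    for sample in ("en_US", "he_IL.UTF-8", "sr_RS.UTF-8@latin", "C"):
        print(sample, "->", parse_locale_name(sample))
```

Whichever variable the name is read from, the codeset extracted this way 
is the same piece of information that (1) already needs.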

>For (3) there should be an option in the config file like "filesystem_codepage", but 
>it should default to utf8.
>  
>
We should probably not bother, though. This "problem" is shared by every 
other Unix program running on the system, and solved the same way there 
- they use LC_CTYPE.
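
A short Python illustration of why this is the same problem every other 
program has: the filesystem stores bytes, and the bytes a program writes 
depend on the charset it assumes. The function name filename_bytes and the 
sample filename are invented for this sketch.

```python
def filename_bytes(name: str, codeset: str) -> bytes:
    """Encode a filename the way a program assuming `codeset` would."""
    return name.encode(codeset)


if __name__ == "__main__":
    name = "caf\u00e9.txt"
    utf8 = filename_bytes(name, "utf-8")
    latin1 = filename_bytes(name, "iso8859-1")
    print(utf8)    # two bytes for the accented letter
    print(latin1)  # one byte for the accented letter
    # Reading the UTF-8 bytes under a Latin-1 assumption garbles the name.
    print(utf8.decode("iso8859-1"))
```

A program that reads the name under the same charset it was written with 
gets the original string back, which is all LC_CTYPE needs to guarantee.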

>For (4),  Wine should select an appropriate LangID and LCID based on the la_CC tag 
>and return them, respectively, in response to Get*DefaultLangID and Get*LCID.  In 
>wine, at present, there is not really a separate 'system' level.
>
>Furthermore,  wine could respond to different groups of GetLocaleInfo() constants 
>according to LC_MESSAGES, LC_NUMERIC, etc., but this is an unusual feature that 
>probably isn't needed at first.  It seems that using the config-file to define 
>codepage translation and the suffix for IO charset translation gets rid of the 
>typical user's need to have other variables besides LANG set.
>
>Consider locales I might use:
>
>LANG    LCID    LangID
>----    ----    ------
>en_US   1       9
>es_MX   52      10
>es_US   1       9
>ar_SA   966     1
>
>Let's say I have a program that prints "Hello, World" in the current language, using 
>wide calls.  When I run it in Linux,  it should print that string out using the 
>current language and codepage.  Suppose I also have a database program that was 
>written in outer burgoslavia and keeps its data files in the encoding for outer 
>burgoslavian,  which is supported only by Windows 95 for Burgoslavia and Windows 
>Server 2003.  I don't want to change Linux to support Burgoslavian,  but if 
>Burgoslavian is encoded in some Unicode font I can add a section to [AppDefaults]
>and let that particular program think it is running on an all-Burgoslavian system.
>
>For (5), the functions act the same no matter what locale the user is in.
>
>  
>


-- 
Shachar Shemesh
Lingnu Open Source Consulting ltd.
http://www.lingnu.com/



