locales, unicode and ansi with msvcrt (bug 8022)
Ann & Jason Edmeades
us at edmeades.me.uk
Wed Apr 18 15:45:51 CDT 2007
Firstly, fantastic work, and it has also explained to me something Eric said
which I didn't grasp...
>While preparing tests I found the 'user' backend of wineconsole works
>correctly with WriteConsole/WriteFile (see test1 and test2), so I used it as
>the reference.
>I figured out the main issue with cmd:
>It passes strings to WriteFile in ANSI, but should pass them in OEM.
>I think the main locale issue is found. CMD and XCOPY violate this.
>So it is not an MSVCRT bug. Moreover, [w]fprintf MUST NOT perform any
>conversions (as test3 and test4 show).
>I made and attached a quick hack to demonstrate that cmd was buggy. Attached
>screenshots show the difference. The patch heals almost everything, even
>localized filenames. So I'm CC'ing mail to wine-devel, console/cmd gurus
>must know much more ;-)
>I found some issues with wprintf, see README for test3 and test4.
>The attached tests are made with your patch applied.
Another interesting URL is here, which confirms what you are saying.
So, another round of discussion but things have moved forward a lot...
1. The underlying problem appears to be that the output from both programs
(and any other command line program) should not be done through msvcrt's i/o
functions or WriteConsoleA/WriteFile on the ANSI string as it stands. It needs
to be converted to OEM first and then printf/WriteFile/WriteConsole'd.
Ideally we would hold a Unicode string instead, which should be WriteConsoleW'd,
or WriteFile'd if that fails (e.g. if output is redirected to a file).
=> The only problem I can see with this is that it would result in
redirected output containing Unicode, which is wrong. However, advice I found
on a URL on the web said this:
>>Tips and considerations:
>>* use WriteConsole to output Unicode strings. Note that this API works
>>only on console handles and can not be used for a redirection to a disk file.
>>* If the output is being redirected to a disk file, use WriteFile with
>>the current console code page that can be retrieved by
>>GetConsoleOutputCP (the console code page might be different from the
>>currently selected OEM code page!).
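So I believe the output function in cmd.exe should end up (when fully
converted to Unicode) doing the following:
  WriteConsoleW the Unicode string
  if this fails
    Convert from wide to multibyte using the console output code page
    (GetConsoleOutputCP)
    WriteFile the result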
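
To make the steps above concrete, here is a rough sketch in C. It is only an
illustration under my own assumptions: output_wide is a hypothetical helper
name, not a function that exists in cmd.exe, and the error handling is
minimal. The Win32 calls themselves (WriteConsoleW, GetConsoleOutputCP,
WideCharToMultiByte, WriteFile) are the documented ones.

```c
#ifdef _WIN32
#include <windows.h>
#include <stdlib.h>

/* Hypothetical helper (not actual cmd.exe code): write a wide string to h,
 * trying the console first and falling back to a code page conversion. */
BOOL output_wide(HANDLE h, const WCHAR *str, DWORD len)
{
    DWORD written = 0;
    UINT cp;
    int bytes;
    char *buf;
    BOOL ok;

    if (len == 0)
        return TRUE;                 /* nothing to write */

    /* Succeeds only when h is a real console handle. */
    if (WriteConsoleW(h, str, len, &written, NULL))
        return TRUE;

    /* Probably redirected: convert using the console output code page,
     * which may differ from the system OEM code page. */
    cp = GetConsoleOutputCP();
    bytes = WideCharToMultiByte(cp, 0, str, (int)len, NULL, 0, NULL, NULL);
    if (bytes <= 0)
        return FALSE;

    buf = malloc((size_t)bytes);
    if (!buf)
        return FALSE;

    bytes = WideCharToMultiByte(cp, 0, str, (int)len, buf, bytes, NULL, NULL);
    ok = bytes > 0 && WriteFile(h, buf, (DWORD)bytes, &written, NULL);
    free(buf);
    return ok;
}
#endif
```

A caller would pass GetStdHandle(STD_OUTPUT_HANDLE) and the string length in
characters, so the same call works whether or not output is redirected.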
Temporarily, since in cmd.exe we have an ANSI string in our hands, use
something like the mechanism you have coded using CharToOem. Out of interest,
since the string is in ANSI, MSDN says CharToOemA can convert in place, so
if you put CharToOem(message, message); just before the WriteFile calls
(and remove the const qualifier), does this work?
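
To show what such an in-place ANSI to OEM conversion does to the bytes, here
is a small portable sketch. It assumes a Russian system where ANSI is CP1251
and OEM is CP866 (my assumption, the code pages are not named in this thread)
and it only handles the Cyrillic letters; every other byte is passed through
unchanged, which the real CharToOemA does not promise. The function names are
made up for illustration.

```c
/* Map one CP1251 (ANSI) byte to CP866 (OEM). Only Cyrillic letters are
 * remapped; all other bytes pass through unchanged in this sketch. */
static unsigned char cp1251_to_cp866_byte(unsigned char c)
{
    if (c >= 0xC0 && c <= 0xEF)          /* capital A..YA and small a..pe */
        return (unsigned char)(c - 0x40);
    if (c >= 0xF0)                       /* small er..ya */
        return (unsigned char)(c - 0x10);
    if (c == 0xA8)                       /* capital IO */
        return 0xF0;
    if (c == 0xB8)                       /* small io */
        return 0xF1;
    return c;
}

/* Convert a NUL-terminated CP1251 string to CP866 in place, in the same
 * spirit as CharToOem(message, message). */
void ansi_to_oem_inplace(char *s)
{
    for (; *s; ++s)
        *s = (char)cp1251_to_cp866_byte((unsigned char)*s);
}
```

For example, the CP1251 bytes CF F0 E8 E2 E5 F2 ("Привет") become
8F E0 A8 A2 A5 E2 in CP866, which is why ANSI bytes written unconverted show
up as the wrong characters in an OEM console.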
2. The test cases I had previously confirm that msvcrt's functions are also
malfunctioning, and I strongly believe my current solution to those is
correct, i.e. for applications using msvcrt's wprintf functionality it needs
to take into account the mode the file was opened in. (I don't know which
other msvcrt routines are similarly affected, but if this is accepted I'll try
to take a look.)
=> Unless I get any negative comments soon I will tidy the tests up and
submit that as a patch
3. The right solution is that cmd.exe works in Unicode, which is an exercise
I plan to do as soon as I have finished work on the few remaining issues I
want to address (I want to look at attrib, for and copy, plus a few bugs I
have written on scraps of paper...)
4. xcopy needs a similar fix. If you are happy to do some tests (especially
xcopying files with Russian names, plus copying directories created with
Russian names), I'll contact you directly with a patch to test for me.
5. Once all the above is done, I'd like to check on your test3/4 cases to
see if there are any residual problems.
Again, thanks for your excellent work. I never thought I'd be so pleased to
see Russian characters on a screen...!