locales, unicode and ansi with msvcrt (bug 8022)

Fri Apr 13 16:50:16 CDT 2007

> > >What is your test app doing? It probably needs a test under Windows 
> > >to see in which encoding (ANSI/OEM) a non-Unicode app should 
> > >receive input via a pipe.
> > 
> I meant things like 'dir >lst.txt', 'dir | sort > lst.txt'. 'dir' and 
> 'sort' could be replaced by some external .exes that get input and 
> produce output.
> 

Hiya,

I wrote an app which calls ReadConsoleW and then traces out the hex value of
the first character read in. I used ALT+157 (byte 0x9d) to supply a
character which differs between the codepages I was playing with:

(All of the following was under Windows XP)
Code:
    ReadConsoleW(GetStdHandle(STD_INPUT_HANDLE), buf,
                 sizeof(buf)/sizeof(WCHAR), &nChars, NULL);
    printf("Character at position 0 is %x\n", buf[0]);

Results:
Active code page: 437 - Character at position 0 is a5
Active code page: 850 - Character at position 0 is d8
Active code page: 1252 - Character at position 0 is 9d

So I think it's converting the input from the console codepage to Unicode,
if I interpret that correctly.

I then modified it to write out (**) Unicode character 0xa5 to see whether
the conversion back is to OEM or ANSI. Although it's hard to prove beyond
doubt (*), it appears to me I am getting the reverse of the above, i.e. it's
converted to the console codepage before being output.
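
As a sketch of the two directions described above, the lookups below hard-code what byte 0x9d decodes to, and what U+00A5 encodes to, in code pages 437, 850 and 1252. The values come from the published code page tables rather than from calling the Windows conversion API, and the function names are made up for illustration, so treat this as a cross-check of the traced results, not as what the console actually executes.

```c
#include <assert.h>

/* Hard-coded from the published code page tables; this sketch does not
 * call the Windows conversion API. All names here are hypothetical. */

/* Input direction: byte 0x9D (typed as ALT+157) -> UTF-16 code unit,
 * or -1 for a code page this sketch does not cover. */
int alt157_to_unicode(unsigned int codepage)
{
    switch (codepage) {
    case 437:  return 0x00A5; /* YEN SIGN */
    case 850:  return 0x00D8; /* LATIN CAPITAL LETTER O WITH STROKE */
    case 1252: return 0x009D; /* undefined in 1252, passed through as a C1 control */
    default:   return -1;
    }
}

/* Output direction: U+00A5 -> byte in the given code page, or -1 for a
 * code page this sketch does not cover. */
int yen_to_byte(unsigned int codepage)
{
    switch (codepage) {
    case 437:  return 0x9D;
    case 850:  return 0xBE;
    case 1252: return 0xA5;
    default:   return -1;
    }
}

/* Checks the input table against the three values traced in the post. */
void check_against_traced_results(void)
{
    assert(alt157_to_unicode(437)  == 0xa5);
    assert(alt157_to_unicode(850)  == 0xd8);
    assert(alt157_to_unicode(1252) == 0x9d);
}
```

If output really is converted to the console codepage, U+00A5 would leave as a different byte under each chcp setting (0x9d, 0xbe, 0xa5), so what is drawn would depend on which glyph set the active font uses for that byte.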

(*) In cmd.exe, if it's not full screen, the font does not change when chcp
is executed, so for 437 and 850 I get a 0-type char and a yen respectively.
If I do it full screen, both give me a yen, so I would conclude from that
that the byte written changes with the codepage and what is displayed
depends on the font.

(**) Because I want to test this with WriteConsoleW, the output does not get
redirected to a file, so I can't see the raw codepoints...

Is there anything else I can test, or am I OK to put file tests into the
msvcrt test buckets and let the msvcrt Unicode printf and friends convert to
non-Unicode using the console codepage before the data is written to the
file handle?

Suggested tests are welcome, but I was planning on using the Unicode wide
file I/O functions, then reopening the file and confirming the bytes are as
expected (if I stick to a-z, 0-9 we will know if it's come out with extra
0's).
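
A minimal sketch of that file test, using only standard C: it writes a wide string through a text-mode stream with fputws, reopens the file in binary mode, and checks that each character came out as a single byte with no interleaved zeros. The helper names and the scratch file name are hypothetical, and it assumes the runtime converts wide output to narrow bytes on a text-mode stream, which is exactly the behaviour under test.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <wchar.h>

/* Writes `text` with the wide stdio API, then reopens the file in binary
 * mode and copies up to `cap` raw bytes into `out`. The scratch file is
 * removed afterwards. Returns the byte count, or -1 on failure. */
long write_wide_read_bytes(const char *path, const wchar_t *text,
                           unsigned char *out, size_t cap)
{
    FILE *f;
    size_t n;

    assert(path != NULL && text != NULL && out != NULL);

    f = fopen(path, "w");              /* text mode, as printf and friends use */
    if (f == NULL)
        return -1;
    if (fputws(text, f) < 0) {         /* wide write; the runtime converts */
        fclose(f);
        return -1;
    }
    if (fclose(f) != 0)
        return -1;

    f = fopen(path, "rb");             /* binary mode to see the raw bytes */
    if (f == NULL)
        return -1;
    n = fread(out, 1, cap, f);
    fclose(f);
    remove(path);
    return (long)n;
}

/* Returns 1 if the buffer holds exactly the narrow form of `expected`
 * (one byte per character, no extra 0x00 bytes), else 0. */
int bytes_match_narrow(const unsigned char *buf, long n, const char *expected)
{
    size_t len = strlen(expected);
    return n == (long)len && memcmp(buf, expected, len) == 0;
}
```

Sticking to a-z and 0-9 keeps the expected bytes identical in every code page, so a failure here would point at the wide-to-narrow conversion step rather than at a code page difference.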

Regards and thanks for your time,
Jason