[PATCH 1/4] kernel32: Support UTF-7 in MultiByteToWideChar.

Wed Dec 5 02:52:29 CST 2012

On Tue, Dec 04, 2012 at 08:30:55PM -0700, Alex Henrie wrote:
> 2012/12/4 Frédéric Delanoy <frederic.delanoy at gmail.com>:
> > The above MSDN comment indicates pre-Vista versions are buggy, so it's
> > probably not a good idea to match that behaviour.
> 
> I think encoding and decoding in UTF-7 arbitrary binary data was
> considered a "feature" in Windows XP. As MSDN said, "Code written in
> earlier versions of Windows that rely on this behavior to encode
> random non-text binary data might run into problems." So I'm sure
> there's at least one application that depends on the data not being
> Unicode-normalized. Whoever adds normalization will have to make sure
> it's turned off in Windows XP (or older) mode.

Actually UTF-8 is a PITA - a program has to know whether every
individual C string (or file) is UTF-8 or 8-bit ASCII (well, 8859-x).
Assuming UTF-8 doesn't work unless the program can process all
arbitrary byte sequences (and write them back) - which the standard
doesn't allow for.

In the US it probably isn't often an issue, but in Europe there are
many files that have occasional characters with the top bit set.
In the UK we only see 0xA3 (pound sterling) - but it can crop up
anywhere - and it causes my mail program (which, for some reason I
don't understand, assumes UTF-8) to drop core when responding to mails!
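
To make the 0xA3 case concrete, here is a minimal sketch in Python (an
illustrative choice of language, not anything from the patch). The sample
string is hypothetical. It shows that a lone 0xA3 is rejected by a strict
UTF-8 decoder, decodes fine as ISO 8859-1, and can only be written back
unchanged if the decoder uses an escape mechanism outside the standard,
here Python's "surrogateescape" error handler.

```python
# Hypothetical sample: "cost: £5" saved as ISO 8859-1, so the pound
# sign is the single byte 0xA3.
raw = b"cost: \xa35"

# Strict UTF-8 decoding rejects it: 0xA3 is a continuation byte with
# no lead byte in front of it.
try:
    raw.decode("utf-8")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

# Decoding as ISO 8859-1 never fails, because every byte value maps
# to a code point.
as_latin1 = raw.decode("iso-8859-1")

# "surrogateescape" carries each undecodable byte through as a lone
# surrogate (0xA3 becomes U+DCA3) so it can be written back unchanged.
escaped = raw.decode("utf-8", errors="surrogateescape")
round_tripped = escaped.encode("utf-8", errors="surrogateescape")

print(strict_ok, as_latin1, round_tripped == raw)
```

The escaped string is not valid Unicode text, which is the point: the
round trip works only by stepping outside what the standard allows.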

	David

-- 
David Laight: david at l8s.co.uk