Is W really UTF-16?

Ove Kaaven ovehk at ping.uio.no
Wed Jan 9 15:07:14 CST 2002


On Wed, 9 Jan 2002, Bill Medland wrote:

> While I was working on the DrawText functions over the past many months I
> started wondering about when it would fail.  (I'm pedantic and such things
> fascinate me!).  The main concern I have is how to walk a W string
> correctly.  For example while "ellipsifying" text we will need to "move the
> pointer to the previous character" which is currently done by decrementing
> the pointer by 1.  But from what I currently understand that won't work if
> there are surrogate pairs.

If you're concerned about that, surrogate pairs are the least of your
worries. You should also be concerned about Unicode combining (or
composite) characters. I think they might be identified with ctypes
C3_NONSPACING and C3_DIACRITIC and that kind of stuff...
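
To make the combining-character concern concrete, here is a minimal sketch of stepping back over a base character together with any combining marks that follow it. It does not call GetStringTypeW. As a loudly stated simplification it treats only the Combining Diacritical Marks block (U+0300..U+036F) as combining. The names WCHAR16, is_combining_mark and prev_base_char are hypothetical and are not taken from Wine.

```c
#include <stddef.h>

/* Hypothetical stand-in for a 16-bit WCHAR. */
typedef unsigned short WCHAR16;

/* Assumption: only U+0300..U+036F counts as combining. Real code would
 * need a proper character-type lookup instead of this single range. */
static int is_combining_mark(WCHAR16 c)
{
    return c >= 0x0300 && c <= 0x036F;
}

/* Move back from p to the base character of the preceding character
 * sequence, skipping any combining marks attached to it. Never moves
 * before start. */
static const WCHAR16 *prev_base_char(const WCHAR16 *start, const WCHAR16 *p)
{
    if (p <= start) return start;
    p--;
    while (p > start && is_combining_mark(*p)) p--;
    return p;
}
```

With the string 'e', U+0301, 'x', stepping back from 'x' lands on 'e' rather than on the accent, which is presumably what an ellipsis routine would want.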

> 1. Does anyone know under what circumstances CharNextW isn't +1 (apart from
> when pointing at the terminating 0)

Have you tried a high surrogate followed by a low surrogate, on a Microsoft OS
recent enough that Microsoft *might* have thought about preparing it for
dealing with surrogates?

> 2. Is e.g. XP really using UTF-16 or is it actually still UCS-2?

I don't know. But it probably ought to be UTF-16.

> 3. Have we thought about how we should handle walking along a W string (in a
> fashion that doesn't reduce the speed to a crawl).  I guess that in the
> short term I am expecting some sort of macro or inline.

With p++, perhaps? There aren't very many circumstances where that is
going to be a problem (where Unicode composite characters are not also),
are there?
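
For comparison, a surrogate-aware inline along the lines the question asks for might look like the sketch below. It assumes a 0-terminated string of 16-bit units. The names WCHAR16, wstr_next and wstr_prev are hypothetical, not existing Wine helpers, and it does nothing about combining characters.

```c
#include <stddef.h>

/* Hypothetical stand-in for a 16-bit WCHAR. */
typedef unsigned short WCHAR16;

static inline int is_high_surrogate(WCHAR16 c) { return c >= 0xD800 && c <= 0xDBFF; }
static inline int is_low_surrogate(WCHAR16 c)  { return c >= 0xDC00 && c <= 0xDFFF; }

/* Advance by one code point. p must not point at the terminating 0.
 * Reading p[1] is safe because a high surrogate is never the terminator. */
static inline const WCHAR16 *wstr_next(const WCHAR16 *p)
{
    if (is_high_surrogate(p[0]) && is_low_surrogate(p[1])) return p + 2;
    return p + 1;
}

/* Step back by one code point. Never moves before start. */
static inline const WCHAR16 *wstr_prev(const WCHAR16 *start, const WCHAR16 *p)
{
    if (p <= start) return start;
    p--;
    if (p > start && is_low_surrogate(p[0]) && is_high_surrogate(p[-1])) p--;
    return p;
}
```

For text without surrogates this costs one extra range comparison per step over plain p++ or p--, and an unpaired surrogate is simply treated as a single unit.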




