[PATCH v3 1/2] kernelbase/locale: Implement comparison on top of official unicode weight tables

Alexandre Julliard julliard at winehq.org
Wed Mar 4 15:40:57 CST 2020


Fabian Maurer <dark.shadow4 at web.de> writes:

> Hello Alexandre,
>
>> Multi-language support, Japanese, Korean, multi-char sequences,
>> surrogates, linguistic mappings, etc.
>>
>> There are a million things that need to be supported for proper
>> sorting. You don't have to implement them all, but it should be clear
>> from your approach that they can be added. Which in practice means you
>> need to at least prototype most of them.
>
> Well, they can be added, it's just that I left them out for the initial
> versions...
> Short breakdown:
>
> - Multi-language: The character is looked up in the current language; as a
> fallback the default is used. Currently, only the default is implemented.

I don't see any language support; there's just one big sortkey
table. Yes, that's what the current code is doing too, but if we are
rewriting it, we should get the architecture right.
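
As a rough illustration of what per-language support could look like, here
is a minimal sketch in which each locale carries a small table of
exceptions and lookups fall back to a shared default table. All names
(struct sort_locale, get_weight, the "x-toy" locale) and all weights are
made up for the example; this is not the Windows table layout.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical types, not the Windows or Wine data layout. */
struct weight_entry
{
    unsigned int ch;      /* code point */
    unsigned int weight;  /* made-up primary weight */
};

struct sort_locale
{
    const char *name;
    const struct weight_entry *exceptions;  /* locale overrides, sorted by ch */
    size_t count;
};

/* Shared default table, sorted by code point. */
static const struct weight_entry default_table[] =
{
    { 'a', 10 }, { 'b', 20 }, { 'y', 250 }, { 'z', 260 },
};

/* Toy locale that sorts 'y' right after 'a'. */
static const struct weight_entry toy_exceptions[] = { { 'y', 15 } };
static const struct sort_locale toy_locale = { "x-toy", toy_exceptions, 1 };

/* Binary search in a table sorted by code point; NULL if not found. */
static const struct weight_entry *find_entry( const struct weight_entry *table,
                                              size_t count, unsigned int ch )
{
    size_t lo = 0, hi = count;

    assert( table || !count );
    while (lo < hi)
    {
        size_t mid = lo + (hi - lo) / 2;
        if (table[mid].ch == ch) return &table[mid];
        if (table[mid].ch < ch) lo = mid + 1;
        else hi = mid;
    }
    return NULL;
}

/* Look in the locale exceptions first, then in the default table.
 * A NULL locale means "default only"; unknown characters get weight 0. */
static unsigned int get_weight( const struct sort_locale *locale, unsigned int ch )
{
    const struct weight_entry *entry = NULL;

    if (locale) entry = find_entry( locale->exceptions, locale->count, ch );
    if (!entry) entry = find_entry( default_table,
                                    sizeof(default_table) / sizeof(default_table[0]), ch );
    return entry ? entry->weight : 0;
}
```

The point of the split is that adding a language only means adding an
exception table; the comparison code itself never needs to know which
locale is in use.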

> - Multi-char sequences: You mean when a single codepoint is encoded as more
> than one WCHAR? That is supported; Windows seems to treat each WCHAR separately.

I mean when multiple chars map to one sortkey. The COMPRESSION sections
in the Microsoft table.
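
To make the multi-char case concrete, here is a small sketch of a
comparison loop that tries the longest matching sequence at each position
before falling back to per-character weights. It uses the Czech digraph
"ch", which sorts as a single letter after "h". The table, the weights and
the function names are assumptions for illustration and say nothing about
how the COMPRESSION data is actually laid out.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

/* Hypothetical contraction table: a sequence that maps to one weight. */
struct contraction
{
    const wchar_t *seq;
    unsigned int weight;
};

/* Made-up weights: a=10, b=20, ... h=80, i=90, so "ch"=85 lands after h. */
static const struct contraction contractions[] = { { L"ch", 85 } };

static unsigned int char_weight( wchar_t ch )
{
    return (ch >= 'a' && ch <= 'z') ? (unsigned int)(ch - 'a' + 1) * 10 : 0;
}

/* Return the weight of the next collation element and how many chars it used. */
static unsigned int next_weight( const wchar_t *str, size_t len, size_t *consumed )
{
    size_t i, best_len = 0;
    unsigned int weight = 0;

    assert( len > 0 );
    for (i = 0; i < sizeof(contractions) / sizeof(contractions[0]); i++)
    {
        size_t seq_len = wcslen( contractions[i].seq );
        if (seq_len <= len && seq_len > best_len &&
            !wmemcmp( str, contractions[i].seq, seq_len ))
        {
            best_len = seq_len;
            weight = contractions[i].weight;
        }
    }
    if (!best_len)  /* no contraction matched: single character */
    {
        best_len = 1;
        weight = char_weight( str[0] );
    }
    *consumed = best_len;
    return weight;
}

/* Primary-weight-only comparison; returns -1, 0 or 1. */
static int toy_compare( const wchar_t *a, const wchar_t *b )
{
    size_t len_a = wcslen( a ), len_b = wcslen( b ), pos_a = 0, pos_b = 0;

    while (pos_a < len_a && pos_b < len_b)
    {
        size_t used_a, used_b;
        unsigned int wa = next_weight( a + pos_a, len_a - pos_a, &used_a );
        unsigned int wb = next_weight( b + pos_b, len_b - pos_b, &used_b );
        if (wa != wb) return wa < wb ? -1 : 1;
        pos_a += used_a;
        pos_b += used_b;
    }
    if (pos_a < len_a) return 1;
    if (pos_b < len_b) return -1;
    return 0;
}
```

With this table, toy_compare( L"chata", L"hrad" ) puts "chata" after
"hrad", whereas a purely per-WCHAR lookup would put it before.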

> - Linguistic mappings: Not sure what you mean, sorry

NORM_LINGUISTIC_CASING and the like.

> Question: How should I prove it works? I can't possibly add all of that in the
> first draft.

The usual way is to add a bunch of tests with todo_wine, and then send a
patch series with each patch removing the corresponding todos.

>> We only have tests for a very small number of strings, that's clearly
>> not proper coverage. Some way of systematically generating test strings
>> should be considered.
>
> Like, random strings from a known seed? I intentionally didn't do that,
> because of performance concerns.

Not necessarily random, but some interesting data. For instance the
normalization tests can run the entire test suite from unicode.org, you
may be able to find something similar. Or build your own somehow.
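
As a sketch of how such data could be fed into a test, the helper below
turns one line of a data file into a UTF-16 string. It assumes the line
starts with space-separated hex code points and that anything after them
(a ';' or '#') can be ignored, which is the convention the unicode.org
data files use. The function name and the error handling are illustrative
only.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdlib.h>

/* Parse e.g. "0041 0301 1F600; # comment" into UTF-16 code units.
 * Parsing stops at the first character that is not a hex digit or blank.
 * Returns the number of units written, or 0 on overflow / invalid value. */
static size_t parse_codepoints( const char *line, unsigned short *out, size_t max )
{
    size_t n = 0;

    assert( line && out );
    for (;;)
    {
        char *end;
        unsigned long cp;

        while (*line == ' ' || *line == '\t') line++;
        if (!isxdigit( (unsigned char)*line )) break;
        cp = strtoul( line, &end, 16 );
        line = end;

        if (cp > 0x10ffff) return 0;
        if (cp >= 0x10000)
        {
            /* encode as a surrogate pair */
            if (n + 2 > max) return 0;
            cp -= 0x10000;
            out[n++] = (unsigned short)(0xd800 | (cp >> 10));
            out[n++] = (unsigned short)(0xdc00 | (cp & 0x3ff));
        }
        else
        {
            if (n + 1 > max) return 0;
            out[n++] = (unsigned short)cp;
        }
    }
    return n;
}
```

Each parsed string can then be compared against its neighbours in the
file, so the expected ordering comes from the data rather than from a
hand-written list.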

>> Also testing sort keys directly, like you did in
>> the first try (but without depending on the exact values).
>
> I have that planned, yes. Do you want that in the first version already?

The tests should come before the code, or at the same time.
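
One way to test sort keys without depending on their exact values is to
check only that comparing two keys bytewise gives the same sign as
comparing the two strings. The sketch below shows that property check with
toy stand-ins (toy_sortkey and toy_compare, both case-insensitive ASCII) so
that it is self-contained; in a real test the stand-ins would presumably
be replaced by LCMapStringW with LCMAP_SORTKEY and CompareStringW.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for sort key generation: lowercased bytes. */
static size_t toy_sortkey( const char *str, unsigned char *key, size_t max )
{
    size_t i, len = strlen( str );

    assert( len <= max );
    for (i = 0; i < len; i++) key[i] = (unsigned char)tolower( (unsigned char)str[i] );
    return len;
}

/* Toy stand-in for string comparison, written independently of the keys. */
static int toy_compare( const char *a, const char *b )
{
    for (;; a++, b++)
    {
        int ca = tolower( (unsigned char)*a ), cb = tolower( (unsigned char)*b );
        if (ca != cb) return ca < cb ? -1 : 1;
        if (!ca) return 0;
    }
}

/* Bytewise key comparison, normalized to -1, 0 or 1. */
static int compare_keys( const unsigned char *ka, size_t la,
                         const unsigned char *kb, size_t lb )
{
    int res = memcmp( ka, kb, la < lb ? la : lb );

    if (res) return (res > 0) - (res < 0);
    return (la > lb) - (la < lb);
}

/* The property under test: key order must match string order.
 * No key byte is ever compared against a hardcoded value. */
static int keys_agree( const char *a, const char *b )
{
    unsigned char ka[64], kb[64];
    size_t la = toy_sortkey( a, ka, sizeof(ka) );
    size_t lb = toy_sortkey( b, kb, sizeof(kb) );

    return compare_keys( ka, la, kb, lb ) == toy_compare( a, b );
}
```

Running keys_agree over every pair of a generated string list gives broad
coverage while staying independent of the key encoding.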

>> Note that we most likely want to use a Windows-compatible NLS file, like
>> we are now using for codepage or normalization tables. I can work on
>> that part.
>
> I have to admit, I don't know what you mean by that. I don't know about NLS
> files.

This is new stuff. Look at the nls directory, and at the make_unicode
script.

-- 
Alexandre Julliard
julliard at winehq.org


