[PATCH 3/3] msxml3: Don't force UTF-8 when saving XML document.

Nikolay Sivov nsivov at codeweavers.com
Mon Mar 22 08:52:04 CDT 2021


On 3/22/21 4:23 PM, Dmitry Timoshkov wrote:
> Nikolay Sivov <nsivov at codeweavers.com> wrote:
>
>> On 3/19/21 6:25 PM, Dmitry Timoshkov wrote:
>>> This is the only place where xmlSaveToIO() is forced to use UTF-8 for an
>>> output document, other places specify NULL for the default encoding.
>> It's because get_xml() and save() are different. UTF-8 is used together
>> with bstr_from_xmlChar().
> Since that change doesn't break the current tests, I'd guess that the current
> behaviour is based on some guesswork. Could you please add tests that show
> the difference and that would fail with my patch?
The difference is that save() respects the encoding, while get_xml() always
returns UTF-16 with no encoding attribute. Your patch does not fix that.
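
To illustrate why UTF-8 goes together with bstr_from_xmlChar(): a minimal
sketch, not Wine code, assuming the serialized bytes are later decoded as
UTF-8 when they are turned into WCHARs. utf8_to_utf16() is a hypothetical
stand-in for that conversion step. If the serializer emitted cp1251 bytes
instead, the same decoder would reject or misread them.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of Wine: decode UTF-8 bytes into UTF-16
 * code units. Returns the number of units written, or -1 on malformed
 * input or insufficient output space. Simplified on purpose: overlong
 * forms and encoded surrogates are not rejected. */
static int utf8_to_utf16(const unsigned char *src, size_t len,
                         uint16_t *dst, size_t dst_len)
{
    size_t i = 0, out = 0;

    while (i < len)
    {
        uint32_t cp;
        size_t extra;
        unsigned char b = src[i++];

        if (b < 0x80)                { cp = b;        extra = 0; }
        else if ((b & 0xe0) == 0xc0) { cp = b & 0x1f; extra = 1; }
        else if ((b & 0xf0) == 0xe0) { cp = b & 0x0f; extra = 2; }
        else if ((b & 0xf8) == 0xf0) { cp = b & 0x07; extra = 3; }
        else return -1; /* stray continuation byte or invalid lead byte */

        if (extra > len - i) return -1; /* truncated sequence */
        while (extra--)
        {
            unsigned char c = src[i++];
            if ((c & 0xc0) != 0x80) return -1; /* not a continuation byte */
            cp = (cp << 6) | (c & 0x3f);
        }

        if (cp > 0x10ffff) return -1;
        if (cp >= 0x10000)
        {
            /* Outside the BMP: emit a surrogate pair. */
            if (dst_len - out < 2) return -1;
            cp -= 0x10000;
            dst[out++] = (uint16_t)(0xd800 | (cp >> 10));
            dst[out++] = (uint16_t)(0xdc00 | (cp & 0x3ff));
        }
        else
        {
            if (out >= dst_len) return -1;
            dst[out++] = (uint16_t)cp;
        }
    }
    return (int)out;
}
```

With this sketch, the UTF-8 bytes 0xD0 0xAF decode to the single WCHAR
U+042F, while the cp1251 byte for the same letter, 0xDF, is rejected as a
truncated sequence.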
>
>>> This doesn't completely fix the saved XML contents, but at least the XML
>>> document has the proper encoding now.
>> What is the proper encoding if output is always in WCHARs?
> I have an application that expects to get such XML in the encoding
> specified in the ProcessingInstruction, as the test in 1/3 does. In fact,
> the application asks for an encoding that matches the current ANSI codepage,
> and doesn't expect to get UTF-8 instead of cp1251, which is completely
> different. As you can probably see, with the current code my application is
> utterly broken.
What is broken? What does it expect in the returned BSTR? There are no test
changes associated with 3/3, so why have it?

Correct output won't depend on the specified encoding once the document is
loaded.
>
>>> diff --git a/dlls/msxml3/domdoc.c b/dlls/msxml3/domdoc.c
>>> index a81ef5f16cb..49596999d16 100644
>>> --- a/dlls/msxml3/domdoc.c
>>> +++ b/dlls/msxml3/domdoc.c
>>> @@ -1405,7 +1405,7 @@ static HRESULT WINAPI domdoc_get_xml(
>>>          return E_OUTOFMEMORY;
>>>  
>>>      options = XML_SAVE_FORMAT | XML_SAVE_NO_DECL;
>>> -    ctxt = xmlSaveToIO(domdoc_get_xml_writecallback, NULL, buf, "UTF-8", options);
>>> +    ctxt = xmlSaveToIO(domdoc_get_xml_writecallback, NULL, buf, NULL, options);
>>>  
>>>      if(!ctxt)
>>>      {
>> The correct way to fix formatting and encoding issues is to reimplement
>> node dumping functionality in msxml itself.
> I guess that's a large undertaking, are you planning to work on this? If not,
> then I think that the proposed fix might be a compromise solution.



