Actually, libxml2 is supposed to take care of that. Relevant code from xmlSwitchInputEncodingInt():<br><br>---<br> /*<br> * Specific handling of the Byte Order Mark for <br> * UTF-16<br> */<br>
if ((handler->name != NULL) &&<br> (!strcmp(handler->name, "UTF-16LE") ||<br> !strcmp(handler->name, "UTF-16")) &&<br> (input->cur[0] == 0xFF) && (input->cur[1] == 0xFE)) {<br>
input->cur += 2;<br> }<br> if ((handler->name != NULL) &&<br> (!strcmp(handler->name, "UTF-16BE")) &&<br> (input->cur[0] == 0xFE) && (input->cur[1] == 0xFF)) {<br>
 input->cur += 2;<br> }<br>---<br><br>So we need to figure out what's broken, and where, before skipping the BOM on our side.<br><div class="gmail_extra"><br><br><div class="gmail_quote">On Fri, Dec 14, 2012 at 1:34 AM, Marcus Meissner <span dir="ltr"><<a href="mailto:marcus@jet.franken.de" target="_blank">marcus@jet.franken.de</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Hi,<br>
<br>
Issue encountered in Corel X4 installer. (see bug 11747),<br>
it gets a UTF16 document with BOM in front.<br>
<br>
trace:msxml:domdoc_load (0x131d10)->({VT_BSTR: L"C:\\users\\marcus\\Temp\\{180A87F9-82B9-42A6-8F0E-3AC30A58AC8C}\\{5A51C731-A877-4C6E-8D51-CD16CDC46C9B}\\CorelDRAW Graphics Suite X4 Setup Files\\Setup.xml"})<br>
trace:msxml:create_moniker_from_url L"C:\\users\\marcus\\Temp\\{180A87F9-82B9-42A6-8F0E-3AC30A58AC8C}\\{5A51C731-A877-4C6E-8D51-CD16CDC46C9B}\\CorelDRAW Graphics Suite X4 Setup Files\\Setup.xml"<br>
trace:msxml:bind_url 0x131570<br>
trace:msxml:bsc_AddRef (0x138da0) ref=2<br>
trace:msxml:bsc_QueryInterface interface {6d5140c1-7436-11ce-8034-00aa006009fa} not implemented<br>
trace:msxml:bsc_QueryInterface interface {aaa74ef9-8ee7-4659-88d9-f8c504da73cc} not implemented<br>
trace:msxml:bsc_OnStartBinding (0x138da0)->(ff 0x13a238)<br>
trace:msxml:bsc_QueryInterface interface {79eac9e4-baf9-11ce-8c82-00aa004ba90b} not implemented<br>
trace:msxml:bsc_OnDataAvailable (0x138da0)->(5 36572 0x33e908 0x33e8fc)<br>
trace:msxml:bsc_OnStopBinding (0x138da0)->(00000000 (null))<br>
fixme:msxml:doparse start of xml is ffffffff fffffffe 3c 00 49 00 43<br>
<br>
Bug is only fixed partially by this.<br>
<br>
Ciao, Marcus<br>
---<br>
dlls/msxml3/domdoc.c | 10 ++++++++++<br>
1 file changed, 10 insertions(+)<br>
<br>
diff --git a/dlls/msxml3/domdoc.c b/dlls/msxml3/domdoc.c<br>
index 49e6168..3504204 100644<br>
--- a/dlls/msxml3/domdoc.c<br>
+++ b/dlls/msxml3/domdoc.c<br>
@@ -486,6 +486,16 @@ static xmlDocPtr doparse(domdoc* This, char const* ptr, int len, xmlCharEncoding<br>
sax_serror /* serror */<br>
};<br>
<br>
+ /* UTF-16 BOM at start of data */<br>
+ if ((len > 2) && (ptr[0] == (char)0xff) && (ptr[1] == (char)0xfe)) {<br>
+ ptr += 2;<br>
+ len -= 2;<br>
+ if (encoding == XML_CHAR_ENCODING_NONE)<br>
+ encoding = XML_CHAR_ENCODING_UTF16LE;<br>
+ else<br>
+ FIXME("Not changing xml encoding type from %d to XML_CHAR_ENCODING_UTF16LE.\n", encoding);<br>
+ }<br>
+<br>
pctx = xmlCreateMemoryParserCtxt(ptr, len);<br>
if (!pctx)<br>
{<br>
<span class="HOEnZb"><font color="#888888">--<br>
1.7.10.4<br>
<br>
<br>
<br>
</font></span></blockquote></div><br></div>
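<br>
For reference while debugging, here is a minimal standalone sketch of the BOM check discussed above. It is not libxml2 or Wine code: detect_utf16_bom() and the bom_kind enum are hypothetical names made up for illustration. It casts to unsigned char before comparing, which is what makes the comparison against 0xFF/0xFE in the libxml2 snippet work (input->cur is an xmlChar pointer, i.e. unsigned char). The "ffffffff fffffffe" in the fixme output above looks like plain char bytes being sign-extended when printed, so the data itself is probably a normal FF FE BOM.<br>

```c
#include <stddef.h>

/* Hypothetical result type for this sketch; not a libxml2 or Wine type. */
enum bom_kind { BOM_NONE, BOM_UTF16LE, BOM_UTF16BE };

/*
 * Hypothetical helper: report which UTF-16 byte order mark, if any,
 * starts the buffer. The bytes are read through an unsigned char
 * pointer so that 0xFF and 0xFE compare as expected even on
 * platforms where plain char is signed.
 */
static enum bom_kind detect_utf16_bom(const char *buf, size_t len)
{
    const unsigned char *p = (const unsigned char *)buf;

    if (buf == NULL || len < 2)
        return BOM_NONE;
    if (p[0] == 0xFF && p[1] == 0xFE)
        return BOM_UTF16LE;
    if (p[0] == 0xFE && p[1] == 0xFF)
        return BOM_UTF16BE;
    return BOM_NONE;
}
```

Feeding it the first bytes from the trace (ff fe 3c 00 49 00 43) should classify the document as little-endian UTF-16; a caller could then step over the two BOM bytes, but only once we know why libxml2 does not already do that for us.<br>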