[Bug 7150] Implement Arabic shaping

wine-bugs at winehq.org wine-bugs at winehq.org
Mon Oct 26 15:28:11 CDT 2009


http://bugs.winehq.org/show_bug.cgi?id=7150





--- Comment #16 from Shachar Shemesh <shachar at shemesh.biz>  2009-10-26 15:28:11 ---
(In reply to comment #14)
> 
> Shachar! why haven't you implemented your docs ?
> 
Perhaps when I have time to even finish them....

> according to me the proper BiDi handling is :-)
> - Bidi level calculations
> - Line breaking
> - Reordering
> - Shaping
> 

The main problem with this order of doing things is that shaping may change the
width of the character, which affects the line breaking algorithm.

For an (extreme) example, try writing a long sequence of U+0644 (Lam), U+0627
(Alef), and U+0020 (space) ("No no no no no...."). The length (in pixels) of
each word is dramatically different before and after shaping. As such, a line
breaking algorithm will make drastically different decisions about where to
break the line (how many "La" fit in a single line) depending on whether it
sees the string before or after shaping.

The only sane way of handling this is to perform the line breaking after the
shaping, when the final width of each character is known.
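
A minimal sketch of the effect, in Python for brevity. The pixel widths and the
50-pixel line are invented for illustration (real advances come from the font),
and word_width and break_lines are hypothetical helpers, not Wine functions:

```python
# Hypothetical advance widths in pixels; real values depend on the font.
UNSHAPED = {"\u0644": 6, "\u0627": 3, " ": 4}  # Lam, Alef, space
LAM_ALEF_LIGATURE = 5  # assumed width of the shaped Lam-Alef glyph


def word_width(word, shaped):
    """Width of one word, with or without the Lam-Alef ligature applied."""
    if shaped:
        ligatures = word.count("\u0644\u0627")
        rest = word.replace("\u0644\u0627", "")
        return ligatures * LAM_ALEF_LIGATURE + sum(UNSHAPED[c] for c in rest)
    return sum(UNSHAPED[c] for c in word)


def break_lines(text, line_width, shaped):
    """Greedy line breaking; returns the number of words placed on each line."""
    lines = []
    count, used = 0, 0
    for word in text.split(" "):
        w = word_width(word, shaped)
        extra = w if count == 0 else UNSHAPED[" "] + w
        if count > 0 and used + extra > line_width:
            lines.append(count)       # current line is full, start a new one
            count, used = 0, 0
            extra = w
        count += 1
        used += extra
    if count:
        lines.append(count)
    return lines


text = " ".join(["\u0644\u0627"] * 12)        # "La La La ..." twelve times
before = break_lines(text, 50, shaped=False)  # breaks chosen on unshaped widths
after = break_lines(text, 50, shaped=True)    # breaks chosen on shaped widths
print(before, after)
```

With these made-up numbers the same paragraph takes three lines when measured
unshaped and two when measured shaped, so the break positions cannot be decided
before shaping.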

Yes, I know, it's a drastic change to the algorithm.

> my code is very optimized, so if you just want to move things for optimization
> purpose just forget it, you won't get a better performance, that would be near
> impossible.

I want to move things to make things more correct, not more optimized.

> 
> why did I make it the last step of the BiDi process?
> because shaping changes the characters, which would interfere with the
> reordering process
> 
> for example a reordering routine might know that U+0639 is RTL but it may/may
> not know that U+FECC is RTL too (many reordering routines in many projects
> don't handle this case, I don't know about wine)
>
That is precisely why we calculate the BiDi levels first. Actual reordering
takes place based on the BiDi levels. The actual character database is never
consulted again.
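
To illustrate, here is a rough Python sketch of reordering driven purely by
levels, along the lines of rule L2 of the Unicode Bidirectional Algorithm
(UAX #9). It is not Wine's code; reorder_by_levels is a hypothetical name, and
the levels and the shaped string in the example are written by hand:

```python
def reorder_by_levels(levels):
    """Return the visual order as a list of logical indices.

    From the highest level down to the lowest odd level, reverse every
    maximal run of positions whose level is at least that level.
    Only the levels are consulted, never the characters themselves.
    """
    order = list(range(len(levels)))
    odd_levels = [lvl for lvl in levels if lvl % 2 == 1]
    if not odd_levels:
        return order  # purely left-to-right text, nothing to reverse
    for level in range(max(levels), min(odd_levels) - 1, -1):
        i = 0
        while i < len(order):
            if levels[order[i]] >= level:
                j = i
                while j < len(order) and levels[order[j]] >= level:
                    j += 1
                order[i:j] = order[i:j][::-1]  # reverse this run in place
                i = j
            else:
                i += 1
    return order


# Logical text: two Latin letters, three Ain, one Latin letter.
unshaped = "ab\u0639\u0639\u0639c"
# The same text after a hand-written 1-1 substitution with the initial,
# medial and final presentation forms of Ain.
shaped = "ab\uFECB\uFECC\uFECAc"
levels = [0, 0, 1, 1, 1, 0]  # assumed to be computed before shaping

order = reorder_by_levels(levels)
visual_unshaped = "".join(unshaped[i] for i in order)
visual_shaped = "".join(shaped[i] for i in order)
```

The same order applies to both strings, so a 1-1 shaping step does not disturb
reordering as long as the levels were computed from the original characters.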

> 
> and since I don't want to change any other function, I made it after reordering.

Which is why I said this does not mean it cannot go in, just that it will,
eventually, have to be replaced.

> 
> the case where shaping should be done before reordering is when the char
> length of U+0639 is less than that of the shaped U+FECC
> but this is not the case in wine (UTF-16) so the positions calculated by your
> reordering are guaranteed to be valid as my function is 1-1.

Like I said above, this is more about pixel width of the characters than it is
about the byte length of their representation.
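
For reference, a small Python check of the storage side of this, using only the
standard library (utf16_units is a hypothetical helper name). It confirms that
both code points take one UTF-16 code unit and share a BiDi class; it says
nothing about their pixel widths, which depend on the font:

```python
import unicodedata


def utf16_units(s):
    """Number of UTF-16 code units needed to store the string."""
    return len(s.encode("utf-16-le")) // 2


ain = "\u0639"         # ARABIC LETTER AIN
ain_medial = "\uFECC"  # ARABIC LETTER AIN MEDIAL FORM

# Same storage length, so string positions stay valid after a 1-1 substitution.
same_length = utf16_units(ain) == utf16_units(ain_medial)

# Same BiDi class in the Unicode character database.
same_bidi_class = (
    unicodedata.bidirectional(ain) == unicodedata.bidirectional(ain_medial)
)
print(same_length, same_bidi_class)
```
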
> 
> the right thing to do is to use HarfBuzz, which is on its way to becoming a
> standalone library.
> http://fedoraproject.org/wiki/Features/Harfbuzz
> 
> but since this is a decision to be made in winehq, this is the only thing I can
> do.

The right thing to do is to move the BiDi implementation to Uniscribe, and
split it the way Windows does. Until we do, our implementation is going to be
partial no matter what algorithm or library we use.

Shachar
