- From: Chris Lilley <chris@w3.org>
- Date: Fri, 10 Dec 2004 04:25:41 +0100
- To: "Addison Phillips [wM]" <aphillips@webmethods.com>
- Cc: www-svg@w3.org, mark.davis@jtcsv.com, w3c-i18n-ig@w3.org
On Friday, December 10, 2004, 12:19:28 AM, Addison wrote:

APw> Dear SVG WG:

APw> This email is sent on behalf of the Internationalization Working Group.

APw> In the message below and as discussed with a few of your WG
APw> members at the recent AC meeting, we feel that the line breaking
APw> algorithm in section 4.12 of SVG 1.2 Full is problematic. In the
APw> email below (which is slightly edited to correct some errors from
APw> the one originally sent to the I18N-IG list, please note), I have
APw> attempted to describe the problem we discussed in that meeting and
APw> possible solutions that we have considered in I18N subsequently.

Many thanks for that.

APw> We would like to figure out the best method of working with
APw> you to resolve this problem. Would cross-posting email or forming a
APw> Task Force make the most sense to you?

Either of those would work, depending on the anticipated scope and
duration.

APw> Or do you have other
APw> preferences? We would be happy to send representatives to discuss
APw> working out the details with you, if that makes sense.

I must admit that I was very surprised to hear that this was not already
well documented, and am pleased that an existing UAX covers this.

>> -----Original Message-----
>> From: Addison Phillips [wM] [mailto:aphillips@webmethods.com]
>> Sent: 2004-12-07 14:46
>> To: w3c-i18n-ig@w3.org
>> Cc: mark.davis@jtcsv.com
>> Subject: SVG 1.2 and line breaking...
>>
>>
>> At the recent W3C AC meeting, Richard Ishida and I took the time
>> to have a meeting with Chris Lilley and others from the SVG WG.
>> We discussed the problems with SVG 1.2 and the way in which it
>> implements line breaking and line wrapping. Richard and I gave
>> the impression that multiline bidi layout was imperfectly
>> documented, but that appears not to be the case. UAX#9 seems to
>> document the really tough bidi stuff.
>>
>> The conclusion we came to is that I18N WG needs to submit a
>> formal comment (or set of comments) on this topic. The basic idea
>> that we discussed in our meeting is:
>>
>> 1. SVG will provide two line breaking modes.
>>
>> a. The default will be 'auto', which MAY be implementation
>> defined and SHOULD be conformant with UAX#9 and UAX#14 (i.e. the
>> idea is that it should be more, rather than less, capable to use
>> 'auto').

I agree that 'more rather than less' is key here. I would like to see
wording that, if nothing better can be provided, then it's the same as
the 'other' (reproducible graphics) mode. I don't want it to be used as
a loophole for doing less.

I agree, after discussions, that both modes have their use cases.

>> Auto mode will not guarantee consistent line breaking
>> across implementations or within differently configured
>> implementations. But it may provide a higher level of language
>> awareness, etc.
>>
>> b. The "other" mode (which needs a name) will be closely
>> described by SVG 1.2. There must be an option that allows for
>> strict UAX#14/UAX#9 based breaking that will be consistent in
>> layout result across implementations given SVG fonts. This mode
>> should also offer language specific tailoring and/or options. For
>> example, for Korean text one might choose space or character
>> based breaking. We note that UAX#14 leaves some leeway for
>> certain operations to the implementation.
>>
>> 2. The wrapping algorithm currently in 4.12 must be scrapped,
>> since it proceeds from (numerous, fatal) false assumptions about
>> the layout of text. I have included below a prototype for a new
>> algorithm, which must be substantially fleshed out. Comments are
>> very welcome. Vertical layouts have issues left undiscussed here.
>> See for example
APw> http://fantasai.inkedblade.net/style/discuss/vertical-text/#css3-text
APw> for just how much fun we are in for.

That sounds pretty suboptimal. However,

>>> CSS3 Text maps vertical scripts' character directionality based on
>>> the paragraph's block progression.

SVG has included vertical as well as horizontal text from the beginning
and thus has been modelled on XSL property values with before/after and
start/end. It does not, in consequence, have legacy 'left means down
except when it means up' type issues.

APw> 3. Richard suggested, in fact, that our on-going discussion
APw> with the CSS WG concerning CSS3 (most notably a thread with
APw> "fantasai", the author of the above link) form a basis for SVG's
APw> design. See
APw> http://lists.w3.org/Archives/Member/w3c-i18n-wg/2004Oct/thread.html
APw> whose first message is:
APw> http://lists.w3.org/Archives/Member/w3c-i18n-wg/2004Oct/0002.html
APw> (but follow the thread).

Yes, that is helpful discussion.

APw> -- [[ A Rough-and-Ready Prototype]]--

APw> 1. Each paragraph is processed according to the Unicode
APw> Bidirectional Algorithm in Unicode Standard Annex #9 [UAX#9] in
APw> order to determine directionality and embedding levels for each
APw> character. Base directionality may be defined by the containing
APw> document.

APw> 2. Each paragraph is then processed in logical order to
APw> determine line breaking opportunities between characters, according
APw> to Unicode Standard Annex #14 [UAX#14]. The specific options for
APw> the paragraph's script and language are applied here as
APw> appropriate. This results in "break segments", which consist of
APw> character strings [see CharMod Part1: Fundamentals, section 6.1]
APw> that are bounded on both ends by a line breaking opportunity (or
APw> the start or end of the paragraph).

APw> 3. The "starting position", "next pointer" and "current
APw> pointer" are each set to the (logical) start of the next paragraph
APw> in the text.

APw> 4. The "next pointer" is set to the character that represents
APw> the next break opportunity following the "current pointer's"
APw> position.

APw> 5. Text layout is performed on a single line of all of
APw> the text between the "starting position" and the "next pointer".

APw> 6. If the text in (5) does not exceed the size of the current
APw> strip and text remains in the paragraph, set the current pointer =
APw> next pointer and go to (4).

APw> 7. Otherwise place the rendered text into the strip, set
APw> "starting position" = "current pointer" and "next pointer" =
APw> "current pointer" and increment the strip.

APw> 8. If text remains in the paragraph, go to (4).

That sounds good. I will forward it to a developer who is implementing
this; hopefully we can have running code to test it out.

APw> --
APw> Special considerations:

APw> 1. If soft hyphens are used to form breaks, then implementers
APw> should specifically consider UAX#14 section 5.2 "Use of soft
APw> hyphen". In particular, breaking on a soft hyphen may result in
APw> spelling or form changes in certain languages and scripts.

In the 'auto' mode, or in both modes? (This is about ß line breaking to
s s, for example?)

APw> 2. Reshaping in Unicode does not cross directional
APw> boundaries, so this can be used to optimize performance in some
APw> cases.

Yes, we already have this notion in SVG 1.1 text chunks.

APw> 3. Some characters in Unicode take their shape from their
APw> current directionality. For example, opening and closing
APw> parentheses change the direction in which they point based on their
APw> context. See TUS 4.0 section 4.7 for a discussion of mirroring.
APw> Note that mirroring can produce different advance widths or heights
APw> as a result.

APw> 4. Text at the end of the line renders differently than text
APw> in the middle of a line. For example, spaces are generally not
APw> rendered at the end of a line. Implementations should be careful of
APw> "optimizations" that do not lay out the entire line again and just
APw> concatenate segments of glyphs. (Note that shaping of characters
APw> may be affected in some scripts when the text doesn't occur at the
APw> end).

APw> 5. "Emergency breaking" may be required if some line of text
APw> is too long to fit any of the remaining strips. The form this takes
APw> is ????? procrustean????

APw> 6. When a word is added the line height may increase, it can
APw> never decrease from the first glyph rendered. An increase in the
APw> line height can only reduce the space available for text placement
APw> in the span. In the algorithm described above, the line height must
APw> be calculated on the text actually inserted (i.e. between starting
APw> and current position) and *not* be based on the line height of the
APw> last layout pass in step 5.

APw> 7. In (5) note that rendering is done on a line oriented to
APw> the current and base directionality. For example, vertical
APw> rendering is done on a vertical line.

APw> 8. Note that in (3) spans of text may be labeled with a
APw> different language or use scripts to which different breaking
APw> options may apply. Options selected should be applied as
APw> appropriate for each span of text.

Thanks, these concrete and specific suggestions are very helpful.

APw> --
APw> Addison P. Phillips
APw> Director, Globalization Architecture
APw> http://www.webMethods.com

APw> Chair, W3C Internationalization Working Group
APw> http://www.w3.org/International

APw> Internationalization is an architecture.
APw> It is not a feature.

--
 Chris Lilley                    mailto:chris@w3.org
 Chair, W3C SVG Working Group
 Member, W3C Technical Architecture Group
Received on Friday, 10 December 2004 03:25:41 UTC
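
As an illustration only, steps 3 to 8 of the quoted "Rough-and-Ready Prototype" might be sketched as below. This is not code from the thread or from any SVG implementation. The names `break_opportunities`, `measure` and `fill_strips` are hypothetical; the break finder is a crude stand-in that only breaks after spaces (it does not implement UAX#14 or UAX#9); `measure` counts one unit per character instead of doing real glyph layout; and the overflow handling for a segment that fits no strip is just one guess at the "emergency breaking" left open in the message.

```python
from typing import List, Sequence


def break_opportunities(text: str) -> List[int]:
    """Hypothetical stand-in for UAX#14: allow a break after each run of
    spaces, plus a mandatory break at the end of the text."""
    ops = [i for i in range(1, len(text))
           if text[i - 1] == " " and text[i] != " "]
    ops.append(len(text))
    return ops


def measure(line: str) -> float:
    """Hypothetical stand-in for text layout: one unit per character.
    Trailing spaces are not counted, since spaces at the end of a line
    are generally not rendered."""
    return float(len(line.rstrip(" ")))


def fill_strips(text: str, strip_widths: Sequence[float]) -> List[str]:
    """Greedy fill of successive strips, following steps 3 to 8.
    Text left over when the strips run out is simply not placed."""
    breaks = break_opportunities(text)
    lines: List[str] = []
    start = current = 0   # "starting position" and "current pointer"
    strip = 0             # index of the current strip
    i = 0                 # breaks[i] is the "next pointer" (step 4)
    while start < len(text) and strip < len(strip_widths):
        nxt = breaks[i]
        # Step 5: lay out the whole candidate line again, not just the
        # newly added segment.
        if measure(text[start:nxt]) <= strip_widths[strip]:
            current = nxt
            i += 1
            if nxt < len(text):
                continue  # step 6: text remains, try to extend the line
        elif current == start:
            # Assumed emergency rule: a lone segment that does not fit
            # is placed anyway and overflows the strip.
            current = nxt
            i += 1
        # Step 7: place the line, then move on to the next strip.
        lines.append(text[start:current].rstrip(" "))
        start = current
        strip += 1
    return lines


if __name__ == "__main__":
    for line in fill_strips("the quick brown fox jumps", [10, 10, 10]):
        print(line)
```

Each candidate line is measured from the "starting position" every time, which is slower than appending segment widths but respects special consideration 4 above: text at the end of a line may render differently from the same text mid-line.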