- From: Glenn Adams <glenn@skynav.com>
- Date: Mon, 16 May 2011 23:05:42 -0600
- To: Cameron McCormack <cam@mcc.id.au>
- Cc: Vincent Hardy <vhardy@adobe.com>, "public-svg-wg@w3.org" <public-svg-wg@w3.org>
- Message-ID: <BANLkTi=jOFFWhKYW63HuSWbFgFjHp_u9fg@mail.gmail.com>
Cameron,

Perhaps I wasn't clear, but in this example (see Example #1 in [1]), 12 Devanagari Unicode characters map to 4 glyphs in a Devanagari font, where:

- the 1st glyph, a base glyph denoting a vowel, is derived from the char at index 10
- the 2nd glyph, a base glyph denoting the half form of a nukta-ized consonant, is derived from chars at index 2, 3, and 4
- the 3rd glyph, a base glyph denoting a ligature of three consonants, is derived from chars at index 5, 6, 7, 8, and 9
- the 4th glyph, a combining glyph denoting a ligature of two combining marks, is derived from chars at index 0, 1, and 11

At a minimum, an author should be able to perform the character to glyph mapping (and origin assignment) at authoring time, and use SVG to display the four glyphs at their desired locations. Since the glyph identifiers (glyph codes) used to refer to these glyphs are not in Unicode (because these are glyphs and not chars), some means other than Unicode characters should be available to serve this function. That is my understanding of the purpose of <glyphRef/>.

Further, the authoring tool may wish to output these glyphs in an order that is distinct from both the original logical character order and the visual order. For example, it is most convenient to interchange the 3rd and 4th glyphs, so that the combining mark can be assigned an advancement of 0 and be followed by the base glyph, whose origin is derived from the advancement and origin of the prior (2nd) base glyph. Otherwise, the origin of the combining glyph would need to take into account the advancement of the base glyph to which it is to be attached.

Now, it appears that <text/>, etc., are defined to operate in the character domain, and not the glyph domain. However, properties such as x, y, dx, dy, rotation, are clearly applicable to the glyph domain, and not the character domain.
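
The glyph reordering described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the glyph labels and advance widths are hypothetical values chosen for the example (they are not taken from any font), and `emission_order` and `origins` are names invented here. The character indices are the ones listed above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Glyph:
    name: str             # hypothetical label, not a real glyph code
    chars: tuple          # indices of the generating characters
    advance: float        # hypothetical advance, in arbitrary units
    combining: bool = False


# The four glyphs in visual order, with the char indices from the example.
VISUAL = [
    Glyph("vowel", (10,), 300.0),
    Glyph("half-form", (2, 3, 4), 400.0),
    Glyph("conjunct", (5, 6, 7, 8, 9), 700.0),
    Glyph("marks", (0, 1, 11), 0.0, combining=True),
]


def emission_order(glyphs):
    """Emit each combining glyph just before the base glyph it follows."""
    out = []
    for g in glyphs:
        if g.combining and out:
            out.insert(len(out) - 1, g)
        else:
            out.append(g)
    return out


def origins(glyphs, start=0.0):
    """Assign each glyph the current pen position, then advance the pen."""
    pen = start
    placed = []
    for g in glyphs:
        placed.append((g.name, pen))
        pen += g.advance
    return placed


ORDERED = emission_order(VISUAL)
PLACED = origins(ORDERED)
```

With the zero-advance mark emitted ahead of the third base glyph, both receive the pen position left by the second base glyph, so the mark's origin does not depend on the advancement of the glyph it attaches to.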

As long as one works with simple writing systems that have a general 1:1 character to glyph mapping, this doesn't appear to be a problem. However, in complex scripts, such as Arabic and the family of Indic scripts, this is a rather serious problem of mixing apples and oranges, since a 1:1 mapping is the exception, not the rule.

However, perhaps I am missing something here, so the problem may merely be my lack of full understanding of the defined mechanisms.

One reason I am particularly interested in this at the moment is that I am in the process of adding Complex Script Support [2] to the Apache FOP Project [3], an implementation of XSL-FO, and in due course, I need to output glyphs with x and y origin offsets, and x and y advancements, on a per glyph basis. I already have this process working for PDF output, but SVG is on my list for output support, so I shall presently have to deal with SVG.

Whether SVG can natively support both the necessary bidi processing and the complex character to glyph mapping (via OpenType advanced typography tables or the equivalent in TrueType) remains a question. But that is the problem you have if you want SVG to handle character to glyph mapping even at line level.

Note that in general, bidi processing operates first in the character domain on an entire paragraph, then secondly in the glyph domain (after line breaking) when reordering bidi embedding level segments. Note also that line breaking generally operates in the glyph domain, since characters have no geometry (only glyphs do).
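
To make the second bidi phase concrete, here is a minimal Python sketch of the reordering step (rule L2 of the Unicode Bidirectional Algorithm, UAX #9) applied to one line. It assumes the embedding level of each item on the line has already been resolved; the function name `reorder_line` is invented for this sketch, and the items could equally be glyphs rather than characters.

```python
def reorder_line(levels):
    """Return item indices in visual order for one line (UAX #9 rule L2).

    levels[i] is the resolved embedding level of the i-th item in
    logical order.
    """
    order = list(range(len(levels)))
    if not levels:
        return order
    highest = max(levels)
    # Lowest odd level on the line; if there is none, nothing is reversed.
    lowest_odd = min((lv for lv in levels if lv % 2), default=highest + 1)
    for level in range(highest, lowest_odd - 1, -1):
        i = 0
        while i < len(order):
            if levels[order[i]] >= level:
                # Find the maximal run at this level or higher, reverse it.
                j = i
                while j < len(order) and levels[order[j]] >= level:
                    j += 1
                order[i:j] = reversed(order[i:j])
                i = j
            else:
                i += 1
    return order
```

For example, `reorder_line([0, 0, 1, 1, 1, 0])` reverses only the right-to-left run in the middle and yields `[0, 1, 4, 3, 2, 5]`.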

Regards,
Glenn

[1] http://www.microsoft.com/typography/otfntdev/devanot/features.aspx
[2] http://skynav.trac.cvsdude.com/fop/wiki/ComplexScripts
[3] http://xmlgraphics.apache.org/fop/

On Mon, May 16, 2011 at 10:01 PM, Cameron McCormack <cam@mcc.id.au> wrote:

> Glenn Adams:
> > I probably shouldn't throw this in, but I wonder how these semantics would
> > handle situations like Example 1 under "Examples [of] Devanagari syllables"
> > in [1], where a sequence of 12 Unicode characters maps to a sequence of 4
> > glyphs, and where the inverse association from glyph to generating character
> > indices are as follows:
> >
> > Glyph    Char
> > Index    Indices
> >
> > 0     <- {10}
> > 1     <- {2,3,4}
> > 2     <- {5,6,7,8,9}
> > 3     <- {0,1,11}
> >
> > The semantics of associating x offsets (and similar properties) with
> > characters as opposed to glyphs seems rather disconnected to me.
>
> What I would expect from the above example is that if you have
>
> <text x="10 20 30 40 50 60 70 80 90 100 110 120 130 140">
> azzzzzzzzzzzzb
> </text>
>
> where the zs are the 12 characters mapping to the single glyph that you
> mention, then you would get the “a” glyph at x = 10, the Devanagari
> glyph at x = 20 and the b glyph at x = 140. That’s what the rules in
> http://www.w3.org/TR/SVG/text.html#TSpanElement say to do (the third
> bullet beneath “The following additional rules apply …”).
>
> > One can certainly talk about associating x offsets with the output
> > glyphs, but attempting to associate such properties with input
> > characters that may be subjected to a complex, non-continuous,
> > disjoint mapping to glyphs seems questionable, except in the special
> > case of 1:1 continuous mappings.
>
> Yes, the x="" attribute there shouldn’t break up the complex glyph.
>
> --
> Cameron McCormack ≝ http://mcc.id.au/
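
The behaviour Cameron describes in the quoted message, where each glyph takes the x value of the first character that maps to it and the values for the remaining characters are skipped, can be sketched as follows. This is only an illustration of that one rule, not of the full SVG text layout algorithm; `glyph_x_positions` and its `cluster_sizes` parameter are names invented for this sketch, and it assumes each glyph covers a contiguous run of characters.

```python
def glyph_x_positions(cluster_sizes, xs):
    """Pick the x value for each glyph from a per-character x list.

    cluster_sizes[i] is the number of consecutive characters that map to
    the i-th glyph. Each glyph uses the x value of its first character;
    the values for its remaining characters are ignored. A glyph whose
    first character has no x value gets None.
    """
    positions = []
    index = 0  # index of the first character of the current glyph
    for size in cluster_sizes:
        positions.append(xs[index] if index < len(xs) else None)
        index += size
    return positions


# "a", then 12 characters forming one glyph, then "b", as in the quoted example.
EXAMPLE = glyph_x_positions([1, 12, 1], list(range(10, 150, 10)))
```

Here `EXAMPLE` is `[10, 20, 140]`, matching the positions given in the quoted message.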
Received on Tuesday, 17 May 2011 05:06:30 UTC