Re: ligature formation across text chunks

I probably shouldn't throw this in, but I wonder how these semantics would
handle situations like Example 1 under "Examples [of] Devanagari syllables"
in [1], where a sequence of 12 Unicode characters maps to a sequence of 4
glyphs, and where the inverse association from glyph to generating character
indices is as follows:

Glyph  Char
Index  Indices

0   <- {10}
1   <- {2,3,4}
2   <- {5,6,7,8,9}
3   <- {0,1,11}

The semantics of associating x offsets (and similar properties) with
characters as opposed to glyphs seems rather disconnected to me. One can
certainly talk about associating x offsets with the output glyphs, but
attempting to associate such properties with input characters that may be
subjected to a complex, non-continuous, disjoint mapping to glyphs seems
questionable, except in the special case of 1:1 continuous mappings.
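
To make the concern concrete, here is a minimal Python sketch of the table
above. It is illustrative only: the helper names (is_contiguous,
is_order_preserving) are made up for this message and come from no spec or
library. It checks which glyphs draw on a non-contiguous set of characters and
whether glyph order follows character order.

```python
# Glyph index -> set of character indices that generated it
# (values copied from the table above).
glyph_to_chars = {
    0: {10},
    1: {2, 3, 4},
    2: {5, 6, 7, 8, 9},
    3: {0, 1, 11},
}


def is_contiguous(indices):
    """Return True if the indices form one unbroken run, e.g. {2, 3, 4}."""
    return max(indices) - min(indices) + 1 == len(indices)


def is_order_preserving(mapping):
    """Return True if glyph order follows character order.

    Hypothetical check: compares the lowest character index of each glyph,
    taken in glyph order, against the same list sorted.
    """
    firsts = [min(mapping[g]) for g in sorted(mapping)]
    return firsts == sorted(firsts)


# Invert the table: character index -> the glyph it contributes to.
char_to_glyph = {c: g for g, chars in glyph_to_chars.items() for c in chars}

# Glyphs whose generating characters are not one contiguous run.
non_contiguous = [g for g, chars in glyph_to_chars.items()
                  if not is_contiguous(chars)]

print("non-contiguous glyphs:", non_contiguous)
print("order preserving:", is_order_preserving(glyph_to_chars))
print("glyph for character 11:", char_to_glyph[11])
```

With this mapping, characters 0, 1 and 11 all contribute to glyph 3, so x
offsets given separately for those three characters have no single obvious
glyph position to apply to.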

Regards,
Glenn Adams

[1] http://www.microsoft.com/typography/otfntdev/devanot/features.aspx

On Thu, May 12, 2011 at 11:16 PM, Cameron McCormack <cam@mcc.id.au> wrote:

> Vincent Hardy:
> > Yes, I see. I think what I did at the time was (again, I may be off):
> >
> > 1. break first along the element boundaries (i.e., text and tspan)
> >    (this was for SVG tiny, so I did not have tref). This produces text
> >    chunks.
> > 2. do glyph matching
> > 3. position the glyphs according to their x/y/dx/dy values and/or
> >    glyph advances. This follows the rules you quoted before on x/y/dx/
> >    dy/rotate processing.
> >
> > I think you are right: this is not what the letter of the spec says.
> > But I think this is what it meant (i.e., do glyph matching on
> > character chunks) because otherwise, there would be no way to combine
> > character positioning and ligatures, as you said.
>
> Yes, and I think this is a good point.  Outputting exact, pre-laid-out
> x/y glyph positions from some script is going to prevent ligatures from
> forming.  It seems like text chunks in the spec are like CSS
> position:absolute elements.  This is sort of inconsistent with doing
> white space compression first.  You might also wonder what
> text-transform:capitalize should do – if the text chunks are
> independent enough not to do bidi resolution and ligature formation
> across, then you probably don’t consider the characters as being part of
> the same word; that makes <text x="10 20…" text-transform="capitalize">
> useless, though.
>
> > And if it works as I describe, you can still forbid a ligature if
> > needed:
> >
> > <text><tspan>f</tspan>i</text>
>
> (Or with an appropriate CSS3 font-variant property.)
>
> I don’t know that you always want to make a new chunk at a tspan.  At
> least, that would be inconsistent with HTML/CSS, where for example bidi
> resolution happens across the element boundary:
>
>
> data:text/html;charset=utf-8;base64,PCFET0NUWVBFIGh0bWw%2BDQo8c3R5bGU%2BDQpwIHsgY29sb3I6IGJsdWUgfQ0Kc3BhbiB7IGNvbG9yOiByZWQgfQ0KPC9zdHlsZT4NCjxwPmhlbDxzcGFuPmxvINeQ15U8L3NwYW4%2B15vXnCB0aGVyZTwvcD4NCg%3D%3D
>
> Glyph positioning with x & y seems to exist to serve two use cases: 1)
> exact positioning of glyphs because you have already laid them out in
> whatever tool is generating the document, and 2) for creating distinct
> lines or runs of text.  It seems for the former you don’t want
> properties like text-transform, bidi resolution or ligatures to stop
> working.  For the latter, you probably do.
>
> I wonder if, to handle use case #2, we shouldn’t make authors stick
> their separate runs of text in separate <text> elements.  One argument
> the spec makes for doing multi-line text with positioned <tspan>s within
> the one <text> is so that they can all be selected contiguously.  We
> have already discussed dropping the requirement that you cannot select
> across multiple <text> elements in a document, however.
>
> --
> Cameron McCormack ≝ http://mcc.id.au/

Received on Friday, 13 May 2011 05:45:04 UTC