Re: ligature formation across text chunks

Vincent Hardy:
> Yes, I see. I think what I did at the time was (again, I may be off):
>
> 1. break first along the element boundaries (i.e., text and tspan)
>    (this was for SVG tiny, so I did not have tref). This produces text
>    chunks.
> 2. do glyph matching
> 3. position the glyphs according to their x/y/dx/dy values and/or
>    glyph advances. This follows the rules you quoted before on x/y/dx/
>    dy/rotate processing.
> 
> I think you are right: this is not what the letter of the spec.
> says. But I think this is what it meant (i.e., do glyph matching on
> character chunks) because otherwise, there would be no way to combine
> character positioning and ligatures, as you said.

Yes, and I think this is a good point.  Outputting exact, pre-laid-out
x/y glyph positions from some script is going to prevent ligatures from
forming.  It seems like text chunks in the spec are like CSS
position:absolute elements.  This is sort of inconsistent with doing
white space compression first.  You might also wonder what
text-transform:capitalize should do – if the text chunks are
independent enough not to do bidi resolution and ligature formation
across, then you probably don’t consider the characters as being part of
the same word; that makes <text x="10 20…" text-transform="capitalize">
useless, though.
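
To illustrate the text-transform point, here is a minimal Python sketch.
It assumes one reading of the above: that each absolutely positioned
glyph becomes its own chunk and that capitalize is applied to each chunk
independently.  The function names are made up for the illustration and
do not come from the spec or from any implementation.

```python
def capitalize(text):
    """Uppercase the first letter of each space-separated word."""
    return " ".join(word[:1].upper() + word[1:] for word in text.split(" "))


def capitalize_per_chunk(chunks):
    """Apply capitalize to each chunk on its own, then rejoin the chunks."""
    return "".join(capitalize(chunk) for chunk in chunks)


# Capitalizing the whole string treats "hello" and "world" as words.
whole = capitalize("hello world")

# With one x value per character, every character is its own chunk
# (an assumption of this sketch), so every letter starts a "word".
per_glyph = capitalize_per_chunk(list("hello world"))

print(whole)
print(per_glyph)
```

Under these assumptions the per-glyph result has every letter
uppercased, which is no different from text-transform:uppercase.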

> And if it works as I describe, you can still forbid a ligature if
> needed:
>
> <text><tspan>f</tspan>i</text>

(Or with an appropriate CSS3 font-variant property.)
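
The chunk-then-match model under discussion can be sketched in Python as
follows.  This is only an illustration: the ligature table, the tuple
layout and the break_at_elements flag are hypothetical, and real glyph
matching against a font is far more involved.

```python
# Hypothetical ligature table for an imaginary font.
LIGATURES = {"ffi": "ffi_lig", "fi": "fi_lig", "fl": "fl_lig"}


def split_into_chunks(chars, break_at_elements=True):
    """Split characters into text chunks.

    chars is a list of (character, element_index, has_absolute_position).
    A new chunk starts at an absolutely positioned character and, if
    break_at_elements is set, wherever the containing element changes.
    """
    chunks = []
    prev_element = None
    for ch, element, positioned in chars:
        new_chunk = (
            not chunks
            or positioned
            or (break_at_elements and element != prev_element)
        )
        if new_chunk:
            chunks.append(ch)
        else:
            chunks[-1] += ch
        prev_element = element
    return chunks


def match_glyphs(chunk):
    """Greedy longest-match ligature selection within a single chunk."""
    glyphs = []
    i = 0
    while i < len(chunk):
        for seq in sorted(LIGATURES, key=len, reverse=True):
            if chunk.startswith(seq, i):
                glyphs.append(LIGATURES[seq])
                i += len(seq)
                break
        else:
            # No ligature starts here; emit the plain character.
            glyphs.append(chunk[i])
            i += 1
    return glyphs


def layout(chars, break_at_elements=True):
    """Chunk first, then match glyphs inside each chunk only."""
    chunks = split_into_chunks(chars, break_at_elements)
    return [glyph for chunk in chunks for glyph in match_glyphs(chunk)]


plain = [("f", 0, False), ("i", 0, False)]      # <text>fi</text>
tspan = [("f", 1, False), ("i", 0, False)]      # <text><tspan>f</tspan>i</text>
positioned = [("f", 0, True), ("i", 0, True)]   # <text x="10 20">fi</text>

print(layout(plain))
print(layout(tspan))
print(layout(positioned))
print(layout(tspan, break_at_elements=False))
```

In this sketch the plain case forms the ligature, while the tspan and
per-glyph x cases do not, because the f and the i end up in different
chunks.  Turning break_at_elements off lets the ligature form across
the tspan boundary again.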

I don’t know that you always want to make a new chunk at a tspan.  At
least, that would be inconsistent with HTML/CSS, where for example bidi
resolution happens across the element boundary:

data:text/html;charset=utf-8;base64,PCFET0NUWVBFIGh0bWw%2BDQo8c3R5bGU%2BDQpwIHsgY29sb3I6IGJsdWUgfQ0Kc3BhbiB7IGNvbG9yOiByZWQgfQ0KPC9zdHlsZT4NCjxwPmhlbDxzcGFuPmxvINeQ15U8L3NwYW4%2B15vXnCB0aGVyZTwvcD4NCg%3D%3D
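
For anyone who would rather read the test case than paste it into a
browser, a short Python sketch like the following decodes the data: URL
above.  The decode_data_url helper is a name made up for this example,
and it assumes a base64 payload with a UTF-8 charset, as the URL header
states.

```python
import base64
from urllib.parse import unquote

DATA_URL = (
    "data:text/html;charset=utf-8;base64,"
    "PCFET0NUWVBFIGh0bWw%2BDQo8c3R5bGU%2BDQpwIHsgY29sb3I6IGJsdWUgfQ0K"
    "c3BhbiB7IGNvbG9yOiByZWQgfQ0KPC9zdHlsZT4NCjxwPmhlbDxzcGFuPmxvINeQ"
    "15U8L3NwYW4%2B15vXnCB0aGVyZTwvcD4NCg%3D%3D"
)


def decode_data_url(url):
    """Return the text of a base64-encoded, UTF-8 data: URL."""
    _header, payload = url.split(",", 1)
    # The payload is percent-encoded base64 (%2B is "+", %3D is "=").
    return base64.b64decode(unquote(payload)).decode("utf-8")


html = decode_data_url(DATA_URL)
print(html)
```

The decoded page is a single <p> in which a red <span> ends in the
middle of a run of Hebrew letters, so the right-to-left run straddles
the element boundary.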

Glyph positioning with x & y seems to exist to serve two use cases: 1)
exact positioning of glyphs because you have already laid them out in
whatever tool is generating the document, and 2) creating distinct
lines or runs of text.  It seems for the former you don’t want
features like text-transform, bidi resolution or ligatures to stop
working at each positioned glyph.  For the latter, you probably do.

I wonder if, to handle use case #2, we shouldn’t make authors stick
their separate runs of text in separate <text> elements.  One argument
the spec makes for doing multi-line text with positioned <tspan>s within
the one <text> is so that they can all be selected contiguously.  We
have already discussed dropping the restriction that you cannot select
across multiple <text> elements in a document, however.

-- 
Cameron McCormack ≝ http://mcc.id.au/

Received on Friday, 13 May 2011 05:17:19 UTC