Re: ligature formation across text chunks

From: Vincent Hardy <vhardy@adobe.com>
Date: Fri, 13 May 2011 04:59:54 -0700
To: Cameron McCormack <cam@mcc.id.au>
CC: "public-svg-wg@w3.org" <public-svg-wg@w3.org>
Message-ID: <C62C904C-E13C-4DBA-9616-20A5F72A18B4@adobe.com>

Hi Cameron,

On May 12, 2011, at 10:16 PM, Cameron McCormack wrote:

> Vincent Hardy:
>> Yes, I see. I think what I did at the time was (again, I may be off):
>> 1. break first along the element boundaries (i.e., text and tspan)
>>   (this was for SVG tiny, so I did not have tref). This produces text
>>   chunks.
>> 2. do glyph matching
>> 3. position the glyphs according to their x/y/dx/dy values and/or
>>   glyph advances. This follows the rules you quoted before on x/y/dx/
>>   dy/rotate processing.
>> I think you are right: this is not what the letter of the spec
>> says. But I think this is what it meant (i.e., do glyph matching on
>> character chunks) because otherwise, there would be no way to combine
>> character positioning and ligatures, as you said.
> Yes, and I think this is a good point.  Outputting exact, pre-laid-out
> x/y glyph positions from some script is going to prevent ligatures from
> forming.  It seems like text chunks in the spec are like CSS
> position:absolute elements.  This is sort of inconsistent with doing
> white space compression first.  You might also wonder what
> text-transform:capitalize should do – if the text chunks are
> independent enough not to do bidi resolution and ligature formation
> across, then you probably don’t consider the characters as being part of
> the same word; that makes <text x="10 20…" text-transform="capitalize">
> useless, though.
>> And if it works as I describe, you can still forbid a ligature if
>> needed:
>> <text><tspan>f</tspan>i</text>
> (Or with an appropriate CSS3 font-variant property.)

> I don’t know that you always want to make a new chunk at a tspan.  

I meant to add an x attribute on the tspan. The point was that you could force a new text chunk if x/y did not already do it (i.e., if glyph matching were done before x/y processing but after breaking into chunks). Anyway, I think I am on the same page as you are.
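
As a rough illustration of that ordering (a sketch only, not code from any implementation or from the spec): break the content into chunks first, then do glyph matching inside each chunk. The ligature table, the (characters, x) run representation and the function names below are all assumptions made up for this example, and the positioning step is left out.

```python
# Sketch only: chunk first, then match glyphs within each chunk.
# LIGATURES and the (characters, x) run format are hypothetical.

LIGATURES = {"ffi": "f_f_i", "fi": "f_i", "fl": "f_l"}  # toy ligature table


def match_glyphs(chunk):
    """Greedy longest-match ligature substitution within one chunk."""
    keys = sorted(LIGATURES, key=len, reverse=True)
    glyphs = []
    i = 0
    while i < len(chunk):
        for key in keys:
            if chunk.startswith(key, i):
                glyphs.append(LIGATURES[key])
                i += len(key)
                break
        else:
            # No ligature starts here: emit the plain character glyph.
            glyphs.append(chunk[i])
            i += 1
    return glyphs


def layout(runs, break_at_elements):
    """Split runs into chunks, then do glyph matching per chunk.

    runs: list of (characters, x) pairs, one per text/tspan run;
          x is None when the run carries no x attribute.
    break_at_elements: if True, every element boundary starts a chunk;
          if False, only a run with an explicit x starts a chunk.
    Returns one list of glyph names per chunk.
    """
    chunks = []
    for characters, x in runs:
        if break_at_elements or x is not None or not chunks:
            chunks.append(characters)
        else:
            chunks[-1] += characters
    return [match_glyphs(chunk) for chunk in chunks]
```

With chunks broken at element boundaries, "f" and "i" in separate runs never meet in the same chunk, so no ligature forms. Without that rule, giving the second run an explicit x has the same effect, since the x starts a new chunk before glyph matching happens.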
> At
> least, that would be inconsistent with HTML/CSS, where for example bidi
> resolution happens across the element boundary:
> data:text/html;charset=utf-8;base64,PCFET0NUWVBFIGh0bWw%2BDQo8c3R5bGU%2BDQpwIHsgY29sb3I6IGJsdWUgfQ0Kc3BhbiB7IGNvbG9yOiByZWQgfQ0KPC9zdHlsZT4NCjxwPmhlbDxzcGFuPmxvINeQ15U8L3NwYW4%2B15vXnCB0aGVyZTwvcD4NCg%3D%3D
> Glyph positioning with x & y seems to exist to serve two use cases: 1)
> exact positioning of glyphs because you have already laid them out in
> whatever tool is generating the document, and 2) for creating distinct
> lines or runs of text.  It seems for the former you don’t want
> properties like text-transform, bidi resolution or ligatures to stop
> working.  For the latter, you probably do.
> I wonder if, to handle use case #2, we shouldn’t make authors stick
> their separate runs of text in separate <text> elements.  One argument
> the spec makes for doing multi-line text with positioned <tspan>s within
> the one <text> is so that they can all be selected contiguously.  We
> have already discussed dropping the requirement that you cannot select
> across multiple <text> elements in a document, however.

I have mixed feelings about that because it would change the semantics a bit. Currently, a <text> element can represent a paragraph (your use case #2); that would no longer be the case, since each <text> element would always be a single line.

> -- 
> Cameron McCormack ≝ http://mcc.id.au/

