Re: [css3-text] 'First letter' delimitation from Asmus Freytag on 2010-10-13 (www-international@w3.org from October to December 2010)

From: Asmus Freytag <asmusf@ix.netcom.com>
Date: Wed, 13 Oct 2010 16:14:58 -0700
To: John Hudson <tiro@tiro.com>
CC: "Phillips, Addison" <addison@lab126.com>, Behdad Esfahbod <behdad.esfahbod@gmail.com>, Simon Montagu <smontagu@smontagu.org>, "Tab Atkins Jr." <jackalmage@gmail.com>, Andrew Cunningham <lang.support@gmail.com>, style <www-style@w3.org>, wwwintl <www-international@w3.org>, intlcore <public-i18n-core@w3.org>, indic <public-i18n-indic@w3.org>, Richard Ishida <ishida@w3.org>
Message-ID: <4CB63D72.2090502@ix.netcom.com>

On 10/13/2010 9:21 AM, John Hudson wrote:
> Phillips, Addison wrote:
>
>> I think that's what I'm saying? That is, for example the CSS rule 
>> 'first-letter' should be applied to the first grapheme cluster, not 
>> to the first Unicode code point. Page authors should not have to do 
>> anything grapheme specific in their markup in order to get at the 
>> graphemes with e.g. CSS rules or JavaScript. They might need to 
>> include @lang to help the user-agent. But they shouldn't need to find 
>> and mark-up the grapheme cluster themselves.
>
> I agree that they shouldn't need to, but I'm interested in having a 
> standard way to do so if automated grapheme identification fails e.g. 
> because the software has insufficient or inaccurate information about 
> the language in question. Also, I expect there to be variation in 
> typographic preference among language users who, for instance, include 
> digraphs (trigraphs, etc.) as letters in their alphabets.
>
>

This is the typical situation where an "automatic" algorithm can't be 
guaranteed to get it right for all users, so you do want an override. 
However, the automatic solution is going to be fine for a very large 
class of users/documents, so you don't want to require markup.

So far, I agree with both sides of this discussion.

But in that case, you have the problem of knowing *when* to use the override. If 
the author can't predict the algorithm that's actually used to make the 
cluster determination, then s/he will need to apply markup for a wide 
array of cases, just to be safe.

Therefore, in these "manually assisted" cases you need to tighten down 
what the automatic rules are, so authors can predict whether their texts 
(languages) are going to be covered.

That would seem then to boil down to firmly prescribing UAX#29 or a 
tailoring thereof.
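
To make the three layers concrete (a fixed default rule, a language
tailoring, and an author override), here is a minimal sketch in Python.
It is an illustration under stated assumptions, not a UAX#29
implementation: the default rule is approximated as "one base code point
plus any following combining marks", and the TAILORED_UNITS table and the
function names are hypothetical, invented for this example. It also
ignores the leading punctuation that 'first-letter' may include.

```python
import unicodedata

# Hypothetical tailoring table: multi-letter units that some orthographies
# treat as single letters. Illustrative only, not taken from UAX #29 or CLDR.
# Longer units are listed first so that "dzs" is tried before "dz".
TAILORED_UNITS = {
    "nl": ("ij",),
    "cs": ("ch",),
    "hu": ("dzs", "dz"),
}


def extend_over_marks(text, end):
    """Advance end past combining marks (general category Mn, Mc or Me)."""
    while end < len(text) and unicodedata.category(text[end]).startswith("M"):
        end += 1
    return end


def default_cluster(text):
    """Approximate the first default grapheme cluster of text.

    Simplification: one base code point plus following combining marks.
    The real UAX #29 rules also cover Hangul, ZWJ sequences, and more.
    """
    if not text:
        return ""
    return text[:extend_over_marks(text, 1)]


def first_letter(text, lang=None, override=None):
    """Return the span a 'first-letter' rule would style in this sketch.

    override: number of code points the author marked explicitly; it wins.
    lang: selects a tailoring; without one, the default cluster is used.
    """
    if override is not None:
        return text[:override]
    for unit in TAILORED_UNITS.get(lang, ()):
        if text[:len(unit)].lower() == unit:
            return text[:extend_over_marks(text, len(unit))]
    return default_cluster(text)
```

With these assumptions, first_letter("IJsselmeer", lang="nl") selects
"IJ", while the same call without a language selects only "I". An author
who knows the default rule can see in advance that a case such as Welsh
"Ll" is not covered and pass override=2 for it, rather than marking up
every initial just to be safe.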
A./

Received on Wednesday, 13 October 2010 23:15:50 UTC