RE: [css3-text] New Working Draft


Hi.
I forgot to send the attachments: (1) an image showing hyphenation, with the hyphenated fragment heaped, in a Qur'an, and (2) an HTML page showing how browsers handle aleph with tanween-al-fatah, for those at Unicode who recall the discussion:

www.unicode.org/mail-arch/unicode-ml/y2010-m03/0065.html
 
It's also discussed at an online forum:
 
www.ahlalhdeeth.com/vb/archive/index.php/t-127442.html
 
For IE8 at least, whether educated speakers still type the tanween-al-fatah before typing the aleph seat is a moot issue.  In IE8 (I don't have IE9 yet and may not get it anytime soon), Microsoft has decided to represent the missing seat with a dotted circle if tanween-al-fatah is typed before the aleph at word's end.
 
(Oh well. So computer people will have to type it after the aleph if they use IE8.  As for Arabic hyphenation, I do think that was discussed at Unicode too, or somewhere, but I cannot find that discussion.)
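
For anyone who wants to compare the two typing orders outside a browser, here is a minimal Python sketch. It uses only the standard unicodedata module; the sample word and the helper names are my own illustrative choices, not something from the earlier discussion. It only shows that the two orders are distinct code point sequences that NFC normalization does not unify; it says nothing about how a given browser renders them.

```python
import unicodedata

ALEF = "\u0627"      # ARABIC LETTER ALEF
FATHATAN = "\u064B"  # ARABIC FATHATAN (tanween-al-fatah)

# Illustrative stem only (kaf, teh, alef, beh); any stem would do.
STEM = "\u0643\u062A\u0627\u0628"

mark_first = STEM + FATHATAN + ALEF  # tanween typed before the final aleph
alef_first = STEM + ALEF + FATHATAN  # tanween typed after the final aleph


def describe(text):
    """Return the Unicode character names in a string, in storage order."""
    return [unicodedata.name(ch) for ch in text]


def same_after_nfc(a, b):
    """True if the two strings are identical after NFC normalization."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


if __name__ == "__main__":
    print(describe(mark_first)[-2:])
    print(describe(alef_first)[-2:])
    # The aleph is a base letter, so normalization does not reorder the
    # mark across it; the two sequences stay distinct.
    print(same_after_nfc(mark_first, alef_first))
```

Since the two sequences stay distinct after normalization, which order a user types is what the renderer actually receives.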
 
Best,
 
--C. E. Whitehead
cewcathar@hotmail.com 

 


From: cewcathar@hotmail.com
To: fantasai.lists@inkedblade.net; www-international@w3.org; public-i18n-core@w3.org; public-i18n-indic@w3.org; public-i18n-cjk@w3.org
CC: sillixn--mlform-iua@xn--mlform-iua.no
Date: Tue, 19 Apr 2011 15:49:52 -0400
Subject: RE: [css3-text] New Working Draft




Hi.

Following is my remaining feedback on the draft:
http://dev.w3.org/csswg/css3-text/

 
The proofreading stuff I mention should be fixed.
I put that first.  No need to hurry a response to anything else here.
-- 
Leif Halvard Silli wrote:

> * Nit: There lacks a comma between 'Ethiopian' and 'Greek'
Yes.
 
"discrete scripts 

Scripts that use spaces or visible word-separating punctuation between words and have discrete, unconnected (in print) units within words. The following scripts are included: Armenian, Bamum?, Braille, Canadian Aboriginal, Cherokee, Coptic, Cyrillic, Deseret, Ethiopic Greek, Hebrew, Kharoshthi, Latin, Lisu, Osmanya, Shavian, Tifinagh, Vai?"=>
"discrete scripts 

Scripts that use spaces or visible word-separating punctuation between words and have discrete, unconnected (in print) units within words. The following scripts are included: Armenian, Bamum?, Braille, Canadian Aboriginal, Cherokee, Coptic, Cyrillic, Deseret, Ethiopic, Greek, Hebrew, Kharoshthi, Latin, Lisu, Osmanya, Shavian, Tifinagh, Vai?"
 

 ---
More proofreading nits
4.3
first bulleted item, sub-item # 4
"4. Any space immediately following another collapsible space —even one outside the boundary of the inline—is removed"
{ Can you anyway fix the spacing on the dashes?  I think elsewhere you have dashes surrounded by spaces, that is with a space on each side }
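
As an aside on the rule quoted in sub-item 4, here is a minimal Python sketch of how I read it: a collapsible space is dropped when the previous collapsible space came earlier, even in a preceding inline. The list-of-runs model and the function name are my own assumptions for illustration; the draft's full algorithm (tabs, line feeds, non-collapsible spaces) is not covered.

```python
def collapse_spaces(runs):
    """Collapse consecutive spaces across a list of inline text runs.

    Each run stands for the text of one inline. A space is removed when the
    character kept immediately before it, in this run or an earlier one,
    was also a space.
    """
    out = []
    prev_was_space = False
    for run in runs:
        kept = []
        for ch in run:
            if ch == " " and prev_was_space:
                continue  # follows another collapsible space: remove it
            kept.append(ch)
            prev_was_space = (ch == " ")
        out.append("".join(kept))
    return out


if __name__ == "__main__":
    # The second run's leading space follows the first run's trailing space.
    print(collapse_spaces(["A ", " B"]))
```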
* * *
4.3 Paragraph beginning "Then," following bullets item # 4
"4. If spaces or tabs at the end of a line are non-collapsible but have ‘text-wrap’ set to ‘normal’ or ‘avoid’ the UA may visually collapse their character advance widths."
{ COMMENT:  possibly you should separate the 'if'-'then' clauses with a comma }
=> ?
"4. If spaces or tabs at the end of a line are non-collapsible but have ‘text-wrap’ set to ‘normal’ or ‘avoid,’ the UA may visually collapse their character advance widths. "
* * *
6.6.1 second bullet
"•Requires language-tagging, which also enables the correct use of local dictionaries (and can also trigger other typographic improvements). "
{ COMMENT:  I don't think you need the first "also"; in fact this sentence lacks the necessary referent for "also". That is, "also" should mean in addition to something else, and nothing preceding it in this bulleted item supplies that something; it just makes the reader look for it. }
=> "•Requires language-tagging, which enables the correct use of local dictionaries (and can also trigger other typographic improvements). "
* * *
8.3  Example XIV  "3.8 Line Adjustment in [JLREQ]"
"It describes rules for cases where the 'text-justify' property is ‘inter-ideograph’. It describes rules for cases where the ‘text-justify’ property is 'inter-ideograph' and the 'text-spacing' property does not specify 'no-justify'."
{ COMMENT:  Do you want to say "It describes rules for cases where the 'text-justify' property is 'inter-ideograph'"?  That is, are you distinguishing cases where it is 'inter-ideograph' from cases where it is 'inter-ideograph' and the 'text-spacing' property does not specify 'no-justify'? }
* * *
9.3  "Non-ideographic letters"
 "Is defined as Ideographic letters"
{ COMMENT:  Verb-subject complement agreement is off here; the verb is singular; the subject complement is plural;
you should either make the subject complement singular or else use the preposition "under" to indicate that "ideographic letters" refers to a category not to the character itself. }
=>
"* Is defined as ideographic letter"
or =>
"* Is defined under Ideographic letters"
* * *
11.1  "Line Decoration . . . "  Par 1, 2nd sentence
"When specified on or propagated to an inline box, such decoration affects all the boxes generated by that element, and is further propagated to any in-flow block-level element that split the inline (see CSS2.1 section 9.2.1.1)"
{ COMMENT:  the sentence needs a full stop at the end. }
=>
"When specified on or propagated to an inline box, such decoration affects all the boxes generated by that element, and is further propagated to any in-flow block-level element that split the inline (see CSS2.1 section 9.2.1.1)."
* * *
11.2.1  Just a question:  What do you mean by "Unicode-wide" at the end?
 
* * *
Answers to your issues/questions -- just bikeshedding I think (hope I understand bikeshedding).

4 last par 
"however, that this will usually result in them being rendered as missing glyphs.) Issue:What's the line-breaking behavior? Effects on joining? Can we just copy the behavior of some zero-width Unicode character? "
{ COMMENT:  To my understanding, after a line break, bidi formatting characters may either continue to affect the text or not; for hard breaks, that is, for paragraph breaks, they are closed first automatically and so do not affect the following text, right?
I can check the bidi list archives on this too; I believe you already have that info, however. }
 
* * *

4.1 whitespace collapsing
{ COMMENT:  In answer to your question, the space value should be the most common way to do this; however, sure, I think you can have a length value as an alternative to this but don't do away with counting spaces here. }
* * *
6.4. Hyphenation Character Limits: the ‘hyphenate-limit-word’ property
Name:  hyphenate-limit-chars  
6.5. Hyphenation Line Limits: the ‘hyphenate-limit-lines’ and ‘hyphenate-limit-last’ properties
 
"If three values are specified, the first value is the required minimum for the total characters in a word, the second value is the minimum for characters before the hyphenation point, and the third value is the minimum for characters after the hyphenation point. If the third value is missing, it is the same as the second. If the second value is missing, then it is ‘auto’. The ‘auto’ value means that the UA chooses a value that adapts to the current layout. 
"▶ Unless the UA is able to calculate a better value, it is suggested that ‘auto’ means 2 for before and after, and 5 for the word total. "

{ COMMENTS: hyphenate-limit-last is o.k. with me.
 
For the others, hyphenate-limit-chars and hyphenate-limit-lines -- here are my notes on English hyphenation:

First, for resources on English hyphenation, see:
www.melbpc.org.au/pcupdate/9100/9112article4.htm (the rule is not to break up words in such a way as to get other words and cause confusion; for example you would not break up "malleable" into "mall" and "eable", but I don't think that's the correct division of the syllables in this case anyway).
Another rule, at least formerly, was:
* do not hyphenate before syllables such as -ed, -ing, -es (plural), even if the previous consonant is doubled to add these.

Thus as you know hyphenation for English at least is largely lexically-determined, and has nothing to do with counting characters that I know of.  So I don't understand the info about the character counting.  Oh well.  I don't expect you to change this, 
although I don't know why you need to limit the total lines hyphenated either.
All else seems fine, at least for English hyphenation.
I sent info. on Arabic previously. See the email under "RE: [css3-text] script categories, 'bicameral', 'discrete', Unicode links and more."  It seems that Arabic text, even the Qur'an, can be hyphenated on occasion, with the fragment following the hyphen heaped,
although what I have seen primarily is heaping and probably the stretching out of the connection to the final "ya'a" or "siin" or "shiin" or "noon" letter. All are connecting characters, for what that's worth, and it may be irrelevant; the info online seems to support stretching out.
I forgot to attach the image of heaped hyphenated text to my previous email; do you need it? }
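
To make the three-value rule quoted above from the draft concrete, here is a minimal Python sketch of how I read it. The function names are hypothetical, the 5/2/2 defaults are the draft's suggested meaning of ‘auto’, and the list of break positions is only a character-count filter: a real UA would still take its candidate points from a hyphenation dictionary, which is the lexical part discussed in my notes.

```python
AUTO = "auto"


def resolve_limit_chars(values, auto_defaults=(5, 2, 2)):
    """Resolve 1 to 3 hyphenate-limit-chars values to (total, before, after).

    Each value is an int or "auto". A missing third value copies the second;
    a missing second value is "auto". "auto" falls back to the suggested
    5 (word total), 2 (before) and 2 (after).
    """
    if not 1 <= len(values) <= 3:
        raise ValueError("expected one to three values")
    total = values[0]
    before = values[1] if len(values) >= 2 else AUTO
    after = values[2] if len(values) == 3 else before
    return tuple(
        default if value == AUTO else value
        for value, default in zip((total, before, after), auto_defaults)
    )


def positions_within_limits(word, values):
    """Indexes i where word[:i] / word[i:] satisfies the character limits."""
    total, before, after = resolve_limit_chars(values)
    if len(word) < total:
        return []  # word too short to be hyphenated at all
    return list(range(before, len(word) - after + 1))


if __name__ == "__main__":
    print(resolve_limit_chars([6, 3]))
    print(positions_within_limits("hyphen", [AUTO]))
```

Read this way, the counts do not decide where a word breaks; they only rule out break points that would leave too few characters on either side of the hyphen.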
 
* * *
6.3
 
{ COMMENT:  in answer to your question, hyphenate-limit-zone is a good name (in just my opinion however & unimportant) }
 
* * *
6.6.1
---
Frank Ellerman wrote
> •Can use RFC4647 language-mapping, which is more intelligent than :lang()'s prefix-matching.
> (Could also argue that :lang() should use RFC4647.)"
{ As Frank Ellerman has noted, the resource is RFC 4647, and I would say this is the best to use;
 I think you want RFC 5646 too, however. }
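
To illustrate the difference Frank points to, here is a minimal Python sketch contrasting prefix matching (RFC 4647 basic filtering, section 3.3.1) with RFC 4647 lookup (section 3.4). The function names and the sample tags are my own; wildcard handling in lookup and validation of tags against RFC 5646 are left out, so treat it as a sketch under those assumptions rather than a full implementation.

```python
def basic_filter_matches(language_range, tag):
    """Prefix matching: the tag equals the range or extends it with '-'."""
    r, t = language_range.lower(), tag.lower()
    return r == "*" or t == r or t.startswith(r + "-")


def lookup(language_range, available, default=None):
    """RFC 4647 lookup: truncate the range from the end until a tag matches.

    Returns the matching tag from `available`, or `default` if none matches.
    """
    tags = {t.lower(): t for t in available}
    subtags = language_range.lower().split("-")
    while subtags:
        candidate = "-".join(subtags)
        if candidate in tags:
            return tags[candidate]
        subtags.pop()
        # A single-letter or single-digit subtag left at the end is removed
        # together with the subtag that followed it.
        while subtags and len(subtags[-1]) == 1:
            subtags.pop()
    return default


if __name__ == "__main__":
    resources = ["de", "zh-Hant"]  # e.g. available hyphenation resources
    # Content tagged zh-Hant-TW: prefix matching against the resource tag
    # fails, but lookup falls back to the closest shorter tag.
    print(basic_filter_matches("zh-Hant-TW", "zh-Hant"))
    print(lookup("zh-Hant-TW", resources))
```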

* * *
7.1
"When ‘text-wrap’ is set to ‘normal’ or ‘avoid’, UAs that allow breaks at punctuation other than spaces should prioritize breakpoints. For example, if breaks after slashes have a lower priority than spaces, the sequence "check /etc" will never break between the ‘/’ and the ‘e’"
{ COMMENT :  and when you have a column width set to one character? It still won't break between the / and the e?  This is a nit however. }

* * *
 
9.3 "fullwidth opening punctuation"
{ COMMENT:  I'm unsure whether the French opening quotation mark  (U+00AB) is full-width and should be listed here. }
 
* * *
9.3 "fullwidth closing punctuation"
{ COMMENT:  I'm unsure whether the French closing quotation mark  (U+00BB) is full-width and should be listed here. }
* * *

11.4  
 
{ COMMENT:  to answer your question, text-outline: normal|heavy|light  is a good idea. }
* * *
11.4 
{ COMMENT, to answer your question, text-shadow is a good property. }
* * *
Another nit:

11. Text Decoration
{ COMMENT:  how are you handling diacritics, which you mentioned in an earlier draft?  As noted on this list previously, Internet Explorer 7 on Windows displays the combination of the tanween-al-fatah diacritic and the aleph at the end of a word slightly differently depending on typing order; we argued previously on the list about whether both typing orders were possible or not. In Internet Explorer 8 a dotted circle is shown to represent the missing seat for one of the two typing orders, so the issue is moot as of version 8. }


Thanks very much, Fantasai, for all your hard work with this.
 
Best,
 
 
--C. E. Whitehead
cewcathar@hotmail.com 







> Date: Wed, 13 Apr 2011 19:36:44 -0700
> From: fantasai.lists@inkedblade.net
> To: www-international@w3.org; public-i18n-core@w3.org; public-i18n-indic@w3.org; public-i18n-cjk@w3.org
> Subject: [css3-text] New Working Draft
> 
> Yesterday the W3C published an updated copy of the CSS Text Level 3
> specification as a Working Draft. This defines many typographic
> features that will soon make their way into a Web browser near you!
> http://www.w3.org/TR/css3-text/

> 
> This module covers, among other things,
> * White space processing
> * Line breaking
> * Justification
> * Text decoration
> 
> The latest revision in particular attempts to classify the scripts
> in Unicode according to their typographic behavior. Unfortunately
> while I am familiar with more scripts than the average person, I am
> unfamiliar with most of them. Any suggestions for corrections and
> additions would be thus much appreciated.
> http://www.w3.org/TR/css3-text/#script-categorization

> 
> The module is derived from some early CSS Internationalization drafts,
> and one of its primary goals is to provide key features needed to
> correctly typeset languages from around the world. We welcome your
> comments and questions on the draft. Input from the Indic, Southeast
> Asian, and Arabic script communities so far has been noticeably missing
> and would be especially appreciated.
> 
> The best way to send feedback is to post to the archived mailing list
> www-style@w3.org: http://lists.w3.org/Archives/Public/www-style/
> with '[css3-text]' and a summary of your comment in the subject line.
> 
> For those of you who want to follow the latest-latest modifications,
> the unofficial editor's draft is here:
> http://dev.w3.org/csswg/css3-text/

> 
> Thanks~
> 
> ~fantasai
> Invited Expert, W3C CSS Working Group
> 
> 
            

Received on Wednesday, 20 April 2011 21:56:08 UTC