Re: [css-text] comments from DPub IG (Fwd) from fantasai on 2013-11-12 (www-style@w3.org from November 2013)

From: fantasai <fantasai.lists@inkedblade.net>
Date: Mon, 11 Nov 2013 23:20:57 -0800
To: www-style@w3.org, public-digipub-ig@w3.org, Brady Duga <duga@google.com>
Message-ID: <5281D6D9.2040706@inkedblade.net>
Markus Gylling wrote:
> Dear Bert,
> please find below comments on css3-text LC from the Digital
> Publishing IG. We will of course be available to answer any
> follow-up questions that the CSS WG might have.

Dear Digital Publishing IG,
Please in the future just send a message to the www-style
mailing list and CC yourselves. This will let us respond
directly and thread a cross-posted discussion with you. :)

> --- begin ---
> 1. It would be great to keep the ‘hanging-punctuation’ property,
> though I understand it is awaiting implementations. What is the
> timeline here? That is, when would an implementation need to
> appear in order to preserve this property?

The timeline would be, however long it takes for most of the rest
of the spec to be implemented. :) If it's one of the last things
holding us up, we'll drop it; otherwise not.

> 2. In section 1.3, after the example:
> "Within this specification, the ambiguous term character is used
> as a friendlier synonym for grapheme cluster. See Characters and
> Properties for how to determine the Unicode properties of a
> character."
> "A letter for the purpose of this specification is a character
> belonging to one of the Letter or Number general categories in
> Unicode. [UAX44]"
> If I replace 'character' in the second paragraph with 'grapheme
> cluster', I am not sure I get a reasonable answer. For instance,
> is U+0067 + U+0308 a letter? I don't think U+0308 is, does that
> disqualify the whole cluster? Or is this a different use of the
> term character? Does Unicode define such clusters as belonging
> to all the groups all the code points belong to?

Ah, that's referring to some text that was recently removed from
Writing Modes. Here's the old version:
   http://www.w3.org/TR/2012/WD-css3-writing-modes-20121115/#character-properties
   #  For the purposes of CSS Writing Modes, the properties of a
   #  grapheme cluster are given by its base character—except [...]

I'll have to restore that text, and will follow up on that with a link...
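
To make the base-character rule concrete, here is a minimal Python
sketch of how the g + U+0308 case would come out. It assumes the
text has already been split into grapheme clusters and that the base
character is simply the first code point; the exceptions elided in
the quote above are not modeled, and cluster_is_letter is a
hypothetical name, not anything from the spec.

```python
import unicodedata

def cluster_is_letter(cluster: str) -> bool:
    """Classify a grapheme cluster by its base (first) code point only."""
    if not cluster:
        return False
    base = cluster[0]  # assumption: the base character comes first
    # "Letter" here means a Letter (L*) or Number (N*) general category.
    return unicodedata.category(base)[0] in ("L", "N")

# U+0067 is the base, so the combining U+0308 does not disqualify it.
print(cluster_is_letter("g\u0308"))
# A lone combining mark has no letter base.
print(cluster_is_letter("\u0308"))
```

Under this reading the cluster takes its properties from U+0067
alone, so the category of U+0308 never enters into the decision.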

> 3. The only place the spec mentions that text-transform should
> affect line breaking is in an informative example (#2), at least
> that I saw. Should this be mentioned in a normative section? Some
> line breaking changes are obvious (for instance, changing the
> width of the glyphs will alter line breaking), but others are
> more obscure (for instance, transformation to full width).

This is implied by the ordering in Appendix A:
   http://dev.w3.org/csswg/css-text/#order

However I can add a note to clarify that point to the text-transform
section.
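
A toy Python sketch of why the ordering matters: if the transform
runs before wrapping, the transformed text is what gets measured.
This counts characters rather than glyph advances and uses the
standard textwrap module as a stand-in for a real line breaker, so
it only illustrates the ordering, not the actual algorithm.
wrap_after_transform is a made-up helper name.

```python
import textwrap
from typing import List

def wrap_after_transform(text: str, width: int, uppercase: bool) -> List[str]:
    """Apply the (optional) uppercase transform first, then wrap."""
    transformed = text.upper() if uppercase else text
    return textwrap.wrap(transformed, width=width)

# Fits on one 10-character line untransformed...
print(wrap_after_transform("die straße", 10, uppercase=False))
# ...but uppercasing turns "ß" into "SS", so the text no longer fits.
print(wrap_after_transform("die straße", 10, uppercase=True))
```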

> 4. From 5.1, last bullet point:
> "For line breaking in/around ruby, [...]
> However, I would expect the correct breaking would be neither of those, but rather:
> だ[1]大分[3]日数[5]が
> I am not certain how I can interpret the spec to generate those line breaks.

Good point. This sentence was written before we had a reasonable
draft of the ruby spec to refer to, so I will replace this with
a pointer to that spec, which has a much more involved discussion
of ruby line-breaking:
   http://www.w3.org/TR/css-ruby-1/#line-breaks

I'll drop a link to the updated text once I get in the edits. :)

> 5. In "5.2. Breaking Rules for Punctuation", in this sentence
> and the one below it that is similar:
> "If the content language is Chinese or Japanese, then additionally
> allow (but otherwise forbid) for ‘normal’ and ‘loose’:"
> It's not clear to me what the 'otherwise' applies to - is it the
> 'normal' and 'loose', so it is forbidden in strict when the language
> is Chinese or Japanese? Or does it apply to the language as well,
> so it is forbidden in strict for Chinese and Japanese, and for any
> value for all other languages? If the latter, then the implication
> is that in eg English, breaks before U+2010 are forbidden. However,
> the later clarifying note seems to indicate that non-CJK text is
> only affected when the language is Chinese or Japanese.

It applies to "is Chinese or Japanese", so it would be
   If the content language is Chinese or Japanese then allow [...]
   If the content language is not Chinese or Japanese then forbid [...]
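
Spelled out as a small Python sketch, for the 'normal' and 'loose'
values only. How 'strict' behaves is not part of the two lines
above, so the sketch refuses it rather than guess. Detecting the
content language from a "zh" or "ja" primary subtag is an assumption
made for the example, and break_allowed is a hypothetical name.

```python
def break_allowed(content_language: str, line_break: str) -> bool:
    """Return True if the break in question is allowed."""
    if line_break not in ("normal", "loose"):
        raise ValueError("this sketch only covers 'normal' and 'loose'")
    # Assumption: the primary language subtag identifies Chinese/Japanese.
    primary = content_language.lower().split("-")[0]
    return primary in ("zh", "ja")

print(break_allowed("ja", "normal"))  # Japanese content: allow
print(break_allowed("en", "loose"))   # any other language: forbid
```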

> 6. In "6.1. Hyphenation Control", the sentence: "The UA is therefore
> only required to automatically hyphenate text for which [...]"
> Is it the case that a UA is ever *required* to automatically hyphenate?

I believe not.

> Perhaps this should be weakened to "Therefore, if no language is
> specified or no hyphenation resource is available to the UA for a
> specified language, the UA may choose to treat 'auto' as 'manual'."
>
> Section 6.1 also states, "Conditional hyphenation characters inside
> a word, if present, take priority over automatic resources when
> determining hyphenation opportunities within the word." Is this a
> strong-enough statement? We've seen many cases where a word will
> hyphenate one character away from a soft hyphen.

Hm, interesting. I'll update the text to say that automatic hyphenation
points within the word are ignored when it contains &shy;
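
Roughly, the intended behavior could be sketched like this in
Python. The automatic hyphenation points are passed in as plain
offsets because the hyphenation resource itself is out of scope;
hyphenation_points and its parameters are names invented for the
example.

```python
from typing import Iterable, List

SOFT_HYPHEN = "\u00AD"  # the character that &shy; refers to

def hyphenation_points(word: str, automatic_points: Iterable[int]) -> List[int]:
    """Return break offsets, counted in the word without its soft hyphens."""
    manual = []
    visible = 0
    for ch in word:
        if ch == SOFT_HYPHEN:
            manual.append(visible)
        else:
            visible += 1
    # If the author marked any opportunity, ignore the automatic ones.
    return manual if manual else sorted(automatic_points)

print(hyphenation_points("hy\u00ADphen", [3]))  # only the soft hyphen counts
print(hyphenation_points("hyphen", [2]))        # falls back to automatic
```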

> 6.1 In example 8, there is an extra nun in نوشتنن, at the end. I
> think it should be نوشتن.

Fixed.

> 7. Not really wrong, but the order of property names in the title
> for 6.2 is the opposite of the order just below, in the definition,
> ‘word-wrap’/‘overflow-wrap’ vs overflow-wrap/word-wrap. Just a
> little weird.

Fixed.

> 8. "6.2. Overflow Wrapping", so sayeth Yoda:
> "[...] and grapheme clusters must together stay as one unit."
> Maybe "stay together" instead?

Fixed.

> 9. In "7.1. Text Alignment", "text-align: start end" sounds a
> lot like "text-align-last: *", giving special treatment to the
> first line instead of the last line, with less control. Perhaps
> there should be a separate property for controlling the first
> line alignment, just like there is for controlling the last line.
> Then text-align could become a shorthand. For example:
>
> text-align: center == text-align-first: center,
> text-align-middle: center, text-align-last: auto
> [...]

I think I will re-raise this to the WG, will reply with an update.

> Sometimes we need to force a line-break inside a paragraph for
> various reasons [novelists-sometimes-string-together-dozens-
> of-words-with-hyphens-leaving-no-natural-break-points].  Having
> text-align-last control this is almost never what we want. In
> the most common case, we want the last line left-aligned and
> all other lines justified, as in most books published in the
> last five hundred years. Separating text-align-middle from
> text-align-last would be very helpful.

For this case what you want is to insert zero-width spaces at the
acceptable breaking points, not a forced break at the one that's
closest to the edge. This is particularly important if it's a
reflowable document, rather than a static-printed one. :)
In general, if you have a forced break in the middle of a block,
you do want it to behave like the last line.

So, I'm going for no change on this point, unless you see a
natural reason / place in the spec for us to note the existence
of zwsp!

> 10. What impact do zero-width letters and zero-width word-separators
> have on the inter-word and distribute text-justify values?

Nothing, see 'letter-spacing'.
   # when space is distributed to an expansion opportunity between
   # two characters, it is applied under the same rules as for
   # ‘letter-spacing’.
...
   # Letter-spacing ignores zero-width characters (such as those
   # from the Unicode Cf category).

> 11. I take exception to example 10 in 7.3.5. Both the greedy
> algorithm and the Knuth/Plass algorithm are O(n). What
> performance metrics are you using to determine the relative
> speed of these algorithms? Additionally, Knuth/Plass is easily
> adapted to other languages, so it applies equally to example 11.
> Perhaps "harder to implement" instead?

Done.

> 12. "8.1. Word Spacing": Can this property be used to make words
> overlap? That is, are values less than -100% allowed? 'letter-spacing'
> says there may be UA limitations for such things.

Yes, good point. I've shifted the sentence about negative values
out of <length> so it applies equally to <percentage>.

> 13. letter-spacing says it doesn't apply at the start/end of a
> line. Should there be similar text in word-spacing?

No, because it is applied to each word-separator character (effectively
making it wider), and so should be consistently applied to all
such characters on the line.

> 14. At the end of word-spacing (just after example 13), the text
> "Word-separator characters include [...]" - is this considered
> an exhaustive list? If so, this should be made clear, otherwise
> some sort of guidelines for deciding what else might be a
> word-separator would be useful.

Added some guidelines. The list is intended to be exhaustive, but
I am not sure that it is (and certainly won't be in some future
edition of Unicode).

> 15. In "8.2. Tracking", just after example 14: "[...] to the
> innermost element element that contains the two characters [...]"
> Just one element?

Yes. Why?

> 16. And just after example 15: "Letter-spacing ignores zero-width
> characters (such as those from the Unicode Cf category)." Does
> this mean characters that are defined to be zero-width, or
> characters whose width might be zero?

Those that are defined to be zero-width. Clarified as:
   # Letter-spacing ignores invisible zero-width formatting characters
   # (such as those from the Unicode Cf category). Spacing must be
   # added as if those characters did not exist in the document.

Let me know if that's sufficiently clear, or if you have some
suggestions for further improvement.
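
In case it helps, here is a minimal Python sketch of "as if those
characters did not exist". It works on code points rather than
grapheme clusters and treats every Cf character as an invisible
zero-width formatting character, both of which are simplifications
made for the example. letter_spacing_gaps is a made-up name.

```python
import unicodedata

def letter_spacing_gaps(text: str) -> int:
    """Count the gaps between characters that would receive spacing."""
    # Drop format characters (general category Cf) before counting.
    visible = [ch for ch in text if unicodedata.category(ch) != "Cf"]
    return max(len(visible) - 1, 0)

print(letter_spacing_gaps("ab"))        # one gap
print(letter_spacing_gaps("a\u200Bb"))  # still one gap: U+200B is Cf
```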

> We are disappointed that maximum and minimum values for word-spacing
> and letter-spacing were removed in this draft. Better control over
> justification is a key requirement for us.

I can't really do anything about that at this point. :/ For future
levels of CSS Text, it would be important to have detailed feedback
on this point, including how such controls would fit in with various
possible and desired justification algorithms and how a given set of
settings prepared by an author would translate across various systems,
since these concerns are what prompted the removal.

> 17. In "9.1. First Line Indentation", it is not clear to me what
> 'each-line' is doing. Does this simply make the indent of lines
> after hard line breaks indent, and they wouldn't otherwise?

Yep.

> If so, perhaps it should say "In addition to the first line of a
> block container, each line after a forced line break is also
> affected. Lines after a soft wrap break are still not affected."
> Or maybe there is something else going on I just don't understand.

Ok, I tried to tweak the wording a bit so it now says
   Indentation affects the first line of each block container
   and each line after a forced line break (but not lines after
   a soft wrap break).

I think what you're missing is the distinction between "first
formatted line" and "first line of a block container".
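
A rough Python sketch of the difference, using "\n" for a forced
break and the standard textwrap module for soft wrapping. It models
a single block container, counts characters instead of measuring
glyphs, and represents the indent as a string of spaces; layout and
its parameters are names made up for the example.

```python
import textwrap
from typing import List

def layout(text: str, width: int, indent: str, each_line: bool) -> List[str]:
    """Wrap text, indenting the first line and, with each_line,
    every line that follows a forced break ("\\n")."""
    lines = []
    for i, segment in enumerate(text.split("\n")):
        # Lines after a soft wrap break never get the indent.
        use_indent = indent if (i == 0 or each_line) else ""
        lines.extend(textwrap.wrap(segment, width=width,
                                   initial_indent=use_indent))
    return lines

print(layout("aa bb cc\ndd ee", width=5, indent="  ", each_line=False))
print(layout("aa bb cc\ndd ee", width=5, indent="  ", each_line=True))
```

Without each_line only "aa" is indented; with it, "dd" (the line
after the forced break) is indented as well, while the soft-wrapped
lines stay flush in both cases.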

> I found this section a bit confusing. Perhaps examples of "hanging"
> and "each-line" would be helpful.

Okay, I will add some.

> 18. In "9.2. Hanging Punctuation", the 'Animatable:' table entry
> has a spurious gt ('>').

Fixed.

> 19. Appendix A, steps 5.iv and 5.v - how do you do letter and
> word spacing without knowing the font in use? For instance, a
> percent value for letter spacing depends on the advance measure
> of the character, which will depend on the current font.

Swapped spacing and glyph selection. Hopefully it still makes
sense for other interactions...

> 20. Appendix B:
> "[...]  is to help UA developers to implement default stylesheet
> [...]" - 'a default stylesheet'? Or maybe 'the default stylesheet'?
> Or even 'default stylesheets'?

Fixed.

~fantasai
Received on Tuesday, 12 November 2013 07:22:15 UTC