Re: CSS 3 Comments from Brady Duga on 2013-11-05 (public-digipub-ig@w3.org from November 2013)

From: Brady Duga <duga@google.com>
Date: Tue, 5 Nov 2013 08:03:39 -0800
To: "Cramer, Dave" <Dave.Cramer@hbgusa.com>
Cc: W3C Digital Publishing IG <public-digipub-ig@w3.org>
Message-ID: <CAH_p_eU_SnrFSEwVQuOf5Vjm5c0Rvvn9XjQvA4ZcDi63Dn73aw@mail.gmail.com>
I assume the long word case doesn't have hyphens? I would expect those to
be break opportunities. Instead of forcing a line break, couldn't you use
&shy; or a zero-width space (&#8203;) instead? Seems like that would prevent silly breaks on wide
devices (say, a 30" monitor) while still allowing breaks with a small
viewport and maintaining the desired text alignment.
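A minimal sketch of that idea in Python, assuming the break points are chosen by hand. It inserts U+00AD SOFT HYPHEN (the character behind &shy;) at the given offsets; the function name, the sample word, and the offsets are hypothetical and only for illustration.

```python
SOFT_HYPHEN = "\u00ad"  # the character that &shy; expands to


def add_break_hints(word: str, offsets: list[int], hint: str = SOFT_HYPHEN) -> str:
    """Insert an invisible break hint before each given character offset."""
    pieces = []
    previous = 0
    for offset in sorted(offsets):
        pieces.append(word[previous:offset])
        previous = offset
    pieces.append(word[previous:])
    return hint.join(pieces)


# Hypothetical sample: hints after "type" and after "set".
hinted = add_break_hints("typesetting", [4, 7])
```

Removing the hints again gives back the original word, so the visible text is unchanged when no break is taken.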


On Mon, Nov 4, 2013 at 6:08 PM, Cramer, Dave <Dave.Cramer@hbgusa.com> wrote:

>   I've added a small number of additions and/or comments below.
>
>  On 10/31/13 4:31 PM, "Brady Duga" <duga@google.com> wrote:
>
>   OK, here are my comments from a review of CSS 3 Text. If we can get any
> others added ASAP, we can try to have these sent out to the CSS WG so they
> can have time to review before TPAC.
>
>  Comments from Brady
>
>  1. It would be great to keep the ‘hanging-punctuation’ property, though
> I understand it is awaiting implementations. What is the timeline here?
> That is, when would an implementation need to appear in order to preserve
> this property?
>
>
>  This is certainly important to us. Antenna House has implemented this,
> and it's on the roadmap for Prince.
>
>
>  2. In section 1.3, after the example:
> "Within this specification, the ambiguous term character is used as a
> friendlier synonym for grapheme cluster. See Characters and Properties for
> how to determine the Unicode properties of a character."
> "A letter for the purpose of this specification is a character belonging
> to one of the Letter or Number general categories in Unicode. [UAX44]"
> If I replace 'character' in the second paragraph with 'grapheme cluster',
> I am not sure I get a reasonable answer. For instance, is U+0067 + U+0308
> a letter? I don't think U+0308 is, does that disqualify the whole cluster?
> Or is this a different use of the term character? Does Unicode define such
> clusters as belonging to all the groups all the code points belong to?
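The ambiguity in comment 2 can be seen with Python's standard unicodedata module. This is only a sketch: the two helper functions are hypothetical readings of "letter" applied to a cluster, and neither is taken from the spec.

```python
import unicodedata

cluster = "\u0067\u0308"  # 'g' followed by COMBINING DIAERESIS

# General category of each code point in the cluster.
categories = [unicodedata.category(ch) for ch in cluster]


def is_letter_by_base(text: str) -> bool:
    """Hypothetical reading 1: only the first code point decides."""
    return bool(text) and unicodedata.category(text[0])[0] in ("L", "N")


def is_letter_by_all(text: str) -> bool:
    """Hypothetical reading 2: every code point must be a Letter or Number."""
    return bool(text) and all(unicodedata.category(ch)[0] in ("L", "N") for ch in text)


base_says = is_letter_by_base(cluster)
all_says = is_letter_by_all(cluster)
```

The two readings disagree for this cluster, because U+0308 is a nonspacing mark rather than a letter.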
>
>  3. The only place the spec mentions that text-transform should affect
> line breaking is in an informative example (#2), at least that I saw.
> Should this be mentioned in a normative section? Some line breaking changes
> are obvious (for instance, changing the width of the glyphs will alter line
> breaking), but others are more obscure (for instance, transformation to
> full width).
>
>  4. From 5.1, last bullet point:
> "For line breaking in/around ruby, the base text is considered part of the
> same inline formatting context as its surrouding content, but the ruby text
> is not: i.e. line breaking opportunities between the ruby element and its
> surrounding content are determined as if the ruby base were inline and the
> ruby text were not there." [Also, note the typo: surrouding]
> The first part of this sounds like breaks are allowed in a single run of
> base text (difficult, I assume), but the second part sounds like breaks are
> only allowed at boundaries of the ruby element. It seems like, in practice,
> breaks are allowed anywhere in a ruby element a break would be allowed if
> such a location is also a base text boundary.
> For example, consider this snippet:
>
> <p>だ<ruby>大分<rt>だいぶ</rt>日数<rt>ひかず</rt></ruby>が</p>
> From "the base text is considered part of the same inline formatting
> context as its surrouding content, but the ruby text is not", I might
> imagine breaks as though the text were written
> だ[1]大[2]分[3]日[4]数[5]が
> But, this: "i.e. line breaking opportunities between the ruby element and
> its surrounding content" seems to imply this only covers line breaks at the
> boundary of the ruby element itself. In which case I would get:
> だ[1]大分日数[5]が
> However, I would expect the correct breaking would be neither of those,
> but rather:
> だ[1]大分[3]日数[5]が
> I am not certain how I can interpret the spec to generate those line
> breaks.
>
>  5. In "5.2. Breaking Rules for Punctuation", in this sentence and the
> one below it that is similar:
> "If the content language is Chinese or Japanese, then additionally allow
> (but otherwise forbid) for ‘normal’ and ‘loose’:"
> It's not clear to me what the 'otherwise' applies to - is it the 'normal'
> and 'loose', so it is forbidden in strict when the language is Chinese or
> Japanese? Or does it apply to the language as well, so it is forbidden in
> strict for Chinese and Japanese, and for any value for all other languages?
> If the latter, then the implication is that in, e.g., English, breaks before
> U+2010 are forbidden. However, the later clarifying note seems to indicate
> that non-CJK text is only affected when the language is Chinese or Japanese.
>
>  6. In "6.1. Hyphenation Control", the sentence: "The UA is therefore
> only required to automatically hyphenate text for which [...]"
> Is it the case that a UA is ever *required* to automatically hyphenate?
> Perhaps this should be weakened to "Therefore, if no language is specified
> or no hyphenation resource is available to the UA for a specified language,
> the UA may choose to treat 'auto' as 'manual'."
>
>
>  Section 6.1 also states, "Conditional hyphenation characters inside a
> word, if present, take priority over automatic resources when determining
> hyphenation opportunities within the word." Is this a strong-enough
> statement? We've seen many cases where a word will hyphenate one character
> away from a soft hyphen.
>
>
>  7. Not really wrong, but the order of property names in the title for
> 6.2 is the opposite of the order just below, in the definition,
> ‘word-wrap’/‘overflow-wrap’ vs overflow-wrap/word-wrap. Just a little weird.
>
>  8. "6.2. Overflow Wrapping", so sayeth Yoda:
> "[...] and grapheme clusters must together stay as one unit." Maybe "stay
> together" instead?
>
>  9. In "7.1. Text Alignment", "text-align: start end" sounds a lot like
> "text-align-last: *", giving special treatment to the first line instead of
> the last line, with less control. Perhaps there should be a separate
> property for controlling the first line alignment, just like there is for
> controlling the last line. Then text-align could become a shorthand. For
> example:
>
>  text-align: center == text-align-first: center, text-align-middle:
> center, text-align-last: auto
> text-align: center right == text-align-first: center, text-align-middle:
> center, text-align-last: right
> text-align: left center right == text-align-first: left,
> text-align-middle: center, text-align-last: right
>
>  This makes the proposed 'text-align: start end' become 'text-align:
> start end end' instead.
> Of course, the downside is this would require two new properties
> ("text-align-first", "text-align-middle"). Not sure if this is worth
> considering at this point, but it seems odd to handle this in different
> ways for different special lines. Perhaps drop 'start end' for now and
> reconsider for level 2?
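The shorthand mapping suggested in comment 9 can be sketched in Python. The longhand names text-align-first and text-align-middle are the hypothetical properties from that comment, not existing CSS, and the function name is likewise made up for illustration.

```python
def expand_text_align(value: str) -> dict[str, str]:
    """Expand a 1-, 2- or 3-keyword text-align value into the proposed longhands."""
    parts = value.split()
    if len(parts) == 1:
        first = middle = parts[0]
        last = "auto"
    elif len(parts) == 2:
        first = middle = parts[0]
        last = parts[1]
    elif len(parts) == 3:
        first, middle, last = parts
    else:
        raise ValueError("expected one to three keywords")
    return {
        "text-align-first": first,    # hypothetical property
        "text-align-middle": middle,  # hypothetical property
        "text-align-last": last,
    }


book_style = expand_text_align("justify left")
```

Under this mapping, the common book layout of justified lines with a left-aligned last line is the two-keyword form shown in the last line of the sketch.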
>
>
>  Sometimes we need to force a line-break inside a paragraph for various
> reasons
> [novelists-sometimes-string-together-dozens-of-words-with-hyphens-leaving-no-natural-break-points].
>  Having text-align-last control this is almost never what we want. In the
> most common case, we want the last line left-aligned and all other lines
> justified, as in most books published in the last five hundred years.
> Separating text-align-middle from text-align-last would be very helpful.
>
>
>  10. What impact do zero-width letters and zero-width word-separators
> have on the inter-word and distribute text-justify values?
>
>  11. I take exception to example 10 in 7.3.5. Both the greedy algorithm
> and the Knuth/Plass algorithm are O(n). What performance metrics are you
> using to determine the relative speed of these algorithms? Additionally,
> Knuth/Plass is easily adapted to other languages, so it applies equally to
> example 11. Perhaps "harder to implement" instead?
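For reference, a greedy (first-fit) line breaker can be sketched in a few lines of Python. This is a simplified illustration, not the algorithm of the spec or of any UA: it assumes width is measured in characters, words are separated by a single space, and an over-long word gets a line of its own.

```python
def greedy_break(words: list[str], width: int) -> list[str]:
    """Fill each line with as many words as fit, then start a new line."""
    lines = []
    line = []  # words on the current line
    used = 0   # characters used on the current line
    for word in words:
        needed = len(word) if not line else used + 1 + len(word)
        if line and needed > width:
            lines.append(" ".join(line))
            line = [word]
            used = len(word)
        else:
            line.append(word)
            used = needed
    if line:
        lines.append(" ".join(line))
    return lines


broken = greedy_break("the quick brown fox jumps".split(), 10)
```

Each word is examined exactly once, which is why the greedy approach runs in a single pass over the text.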
>
>  12. "8.1. Word Spacing": Can this property be used to make words
> overlap? That is, are values less than -100% allowed? 'letter-spacing' says
> there may be UA limitations for such things.
>
>  13. letter-spacing says it doesn't apply at the start/end of a line.
> Should there be similar text in word-spacing?
>
>  14. At the end of word-spacing (just after example 13), the text
> "Word-separator characters include [...]" - is this considered an
> exhaustive list? If so, this should be made clear, otherwise some sort of
> guidelines for deciding what else might be a word-separator would be useful.
>
>  15. In "8.2. Tracking", just after example 14: "[...] to the innermost
> element element that contains the two characters [...]"
> Just one element?
>
>  16. And just after example 15: "Letter-spacing ignores zero-width
> characters (such as those from the Unicode Cf category)." Does this mean
> characters that are defined to be zero-width, or characters whose width
> might be zero? For instance, given:
>
>  span.zero { display: inline-block; width: 0; }
> p {letter-spacing: 1em;}
>
>  <p>a<span class="zero">b</span>c</p>
>
>  Would this be viewed as "a bc" (1em after 'a', zero-width 'b', 1em after
> end of 'b', 'c') or as 'a' with 'b' and 'c' on top of each other 1em later?
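The distinction in comment 16 can be illustrated with Python's unicodedata module. The sketch checks only the Unicode general category; the helper name and the sample set are hypothetical, and how a UA treats a box styled to zero width is exactly the open question above.

```python
import unicodedata


def is_format_character(ch: str) -> bool:
    """True when the code point's general category is Cf (Format)."""
    return unicodedata.category(ch) == "Cf"


samples = {
    "SOFT HYPHEN": "\u00ad",
    "ZERO WIDTH SPACE": "\u200b",
    "ZERO WIDTH JOINER": "\u200d",
    "LATIN SMALL LETTER B": "b",
}
results = {name: is_format_character(ch) for name, ch in samples.items()}
```

The letter 'b' is not a Cf character, so in the span example it is zero-width only because of layout, not because of any Unicode property.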
>
>
>  We are disappointed that maximum and minimum values for word-spacing and
> letter-spacing were removed in this draft. Better control over
> justification is a key requirement for us.
>
>
>
>  17. In "9.1. First Line Indentation", it is not clear to me what
> 'each-line' is doing. Does this simply make lines after hard line breaks
> indent, where they otherwise wouldn't? If so, perhaps it should
> say "In addition to the first line of a block container, each line after a
> forced line break is also affected. Lines after a soft wrap break are
> still not affected." Or maybe there is something else going on I just don't
> understand.
>
>
>  I found this section a bit confusing. Perhaps examples of "hanging" and
> "each-line" would be helpful.
>
>
>  18. In "9.2. Hanging Punctuation", the 'Animatable:' table entry has a
> spurious gt ('>').
>
>  19. Appendix A, steps 5.iv and 5.v - how do you do letter and word
> spacing without knowing the font in use? For instance, a percent value for
> letter spacing depends on the advance measure of the character, which will
> depend on the current font.
>
>  20. Appendix B:
> "[...]  is to help UA developers to implement default stylesheet [...]" -
> 'a default stylesheet'? Or maybe 'the default stylesheet'? Or even 'default
> stylesheets'?
>
>
>  Thanks,
>
>  Dave
>
Received on Tuesday, 5 November 2013 16:04:11 UTC