
[cssom] comments on value serialization rules

From: L. David Baron <dbaron@dbaron.org>
Date: Wed, 17 Feb 2010 13:21:14 -0500
To: www-style@w3.org
Message-ID: <20100217182114.GA2902@pickering.dbaron.org>

Below are various comments on the value serialization rules in:
  http://dev.w3.org/csswg/cssom/#serializing-css-values


For <angle>, <frequency>, <length>, <resolution>, and <time>, Gecko
currently preserves the unit the author specified.  I'm ok with
normalizing to a single unit, although I worry about two things:
 * compatibility issues
 * choice of what the canonical unit is
In particular, I suspect serializing '12pt' to '4.2333mm' might lead
to confusion, though maybe it would be good for authors to learn that
'pt' is a physical unit.
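
To make the '12pt' example concrete, here is a minimal Python sketch
of the conversion, assuming the CSS definitions 1in = 72pt = 25.4mm.
The function name, the choice of 'mm' as the canonical unit, and the
four-decimal formatting are illustrative assumptions, not something
the draft specifies.

```python
MM_PER_INCH = 25.4   # CSS: 1in = 25.4mm
PT_PER_INCH = 72     # CSS: 1in = 72pt


def pt_to_mm(points: float) -> float:
    """Convert a length in CSS points to millimetres."""
    return points * MM_PER_INCH / PT_PER_INCH


# Hypothetical canonical serialization with 'mm' as the chosen unit.
print(f"{pt_to_mm(12):.4f}mm")  # prints "4.2333mm"
```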


The escaping rules presented for <string> and <uri> have bugs in
that they produce values that don't round-trip correctly (e.g.,
values with \n or \r in them, or values with \ in them).  I'd
propose fixing that the way I did in Gecko, and also generally
making the escaping rules for <identifier> more similar to those for
<string> and <uri>, by making them as follows:

  Define "serialize as a codepoint escape" as:  serialize the
  character as: a literal "\" (U+005C), followed by the Unicode code
  point as the smallest possible number of hexadecimal digits in the
  range 0-9 a-f (U+0030 to U+0039 and U+0061 to U+0066) to represent
  the code point in base 16, followed by a space (U+0020).

  Define "serialize as a character escape" as: serialize the
  character as: a literal "\" (U+005C), followed by the character.
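
A minimal Python sketch of the two primitives just defined may help.
The function names are hypothetical, and each function takes a
single-character string.

```python
def codepoint_escape(ch: str) -> str:
    """Backslash, shortest lowercase hex form of the code point, space."""
    return "\\" + format(ord(ch), "x") + " "


def character_escape(ch: str) -> str:
    """Backslash followed by the character itself."""
    return "\\" + ch


print(repr(codepoint_escape("\n")))  # newline (U+000A) becomes \a plus a space
print(repr(character_escape('"')))   # double quote becomes \"
```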

  Define "escape as an identifier" as a string represented by the
  concatenation of, for each character:
   * if the character is in the range U+0000 to U+001F, the
     character serialized as a codepoint escape
   * if the character is the first character and is in the range 0-9
     (U+0030 to U+0039), the character serialized as a codepoint
     escape
   * if the character is the second character and is in the range
     0-9 (U+0030 to U+0039) and the first character is a "-"
     (U+002D), the character serialized as a codepoint escape
   * if the character is the second character and is "-" (U+002D)
     and the first character is "-" too, then the character
     serialized as a character escape
   * if the character is not handled by one of the above rules and
     is greater than or equal to U+0080, is "-" (U+002D) or "_"
     (U+005F), or is in one of the ranges 0-9 (U+0030 to U+0039),
     A-Z (U+0041 to U+005A), or a-z (U+0061 to U+007A), the
     character itself
   * otherwise, the character serialized as a character escape
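
The identifier rules above can be sketched in Python as follows.
This is only an illustration of the proposal, not Gecko's actual
code; the function names are hypothetical, and the rules are tried
in the order listed.

```python
def escape_identifier(ident: str) -> str:
    """Escape `ident` following the proposed identifier rules, in order."""

    def codepoint_escape(ch: str) -> str:
        # Backslash, shortest lowercase hex form, terminating space.
        return "\\" + format(ord(ch), "x") + " "

    def character_escape(ch: str) -> str:
        return "\\" + ch

    out = []
    for i, ch in enumerate(ident):
        cp = ord(ch)
        is_digit = "0" <= ch <= "9"
        if cp <= 0x1F:
            out.append(codepoint_escape(ch))          # control characters
        elif i == 0 and is_digit:
            out.append(codepoint_escape(ch))          # leading digit
        elif i == 1 and is_digit and ident[0] == "-":
            out.append(codepoint_escape(ch))          # digit after leading "-"
        elif i == 1 and ch == "-" and ident[0] == "-":
            out.append(character_escape(ch))          # second "-" of "--"
        elif cp >= 0x80 or ch in "-_" or (ch.isascii() and ch.isalnum()):
            out.append(ch)                            # safe as-is
        else:
            out.append(character_escape(ch))          # other ASCII
    return "".join(out)


for sample in ["1st", "-1a", "--x", "a b", "foo_bar-9"]:
    print(repr(sample), "->", repr(escape_identifier(sample)))
```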

  Define "escape as a string" as a string represented by the
  concatenation of, for each character:
   * if the character is in the range U+0000 to U+001F, the
     character serialized as a codepoint escape
   * if the character is '"' (U+0022), "'" (U+0027), or '\'
     (U+005C), the character serialized as a character escape
   * otherwise, the character itself
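
The string rules above can be sketched in Python in the same way.
The function name is hypothetical, and the sketch returns only the
escaped contents; adding the surrounding quotation marks is left to
the caller.

```python
def escape_string(value: str) -> str:
    """Escape `value` following the proposed string rules."""
    out = []
    for ch in value:
        if ord(ch) <= 0x1F:
            # Codepoint escape: backslash, shortest hex form, space.
            out.append("\\" + format(ord(ch), "x") + " ")
        elif ch in ('"', "'", "\\"):
            # Character escape: backslash followed by the character.
            out.append("\\" + ch)
        else:
            out.append(ch)
    return "".join(out)


# Values with newlines or backslashes are the cases the current rules
# fail to round-trip.
print(escape_string("line\nbreak"))  # prints: line\a break
print(escape_string("back\\slash"))  # prints: back\\slash
```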

These rules also have the advantages that they limit the use of
codepoint escapes (generally quite ugly due to the
space-as-terminator) to ASCII nonprintables and digits that must be
escaped, and that they escape all ASCII nonprintables (which could
otherwise be quite confusing to read).

Then <uri> and <string> can refer to "escape as a string" and
<identifier> can refer to "escape as an identifier".


Also, <uri> should end with '")' rather than ')"'.
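
As an illustration of the corrected <uri> ending, here is a small
Python sketch. It assumes the serialization is the literal 'url("',
then the value escaped as a string, then '")'. Both function names
are hypothetical, and the helper is a compact form of the string
rules proposed earlier in this message.

```python
def escape_string(value: str) -> str:
    """Compact form of the proposed "escape as a string" rules."""
    return "".join(
        "\\" + format(ord(ch), "x") + " " if ord(ch) <= 0x1F
        else "\\" + ch if ch in "\"'\\"
        else ch
        for ch in value
    )


def serialize_uri(value: str) -> str:
    """Wrap the escaped value so the result ends with '")', not ')"'."""
    return 'url("' + escape_string(value) + '")'


print(serialize_uri("images/bg.png"))  # prints: url("images/bg.png")
```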


For <color>, what Gecko does, I believe, is that it handles system
colors, color keywords, and 'currentColor' like <keyword>, but
handles 'transparent', 'rgb()', 'rgba()', 'hsl()', 'hsla()', '#rgb',
and '#rrggbb' as color values which are serialized as follows:
 * as 'transparent' when the R, G, B, and A components are all 0
 * as 'rgb()' when the A component is 255
 * as 'rgba()' otherwise
I'm not particularly committed to keeping it that way, though.
However, I'm somewhat concerned about the idea of switching to
#rrggbb notation since rgb() can handle out-of-sRGB values whereas
#rrggbb cannot.  (Gecko doesn't currently implement that, but I'd
like to.)
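
The three color cases listed above can be sketched in Python like
this. It is not Gecko's code: the function name is hypothetical, the
components are assumed to be 8-bit integers (0-255), and the ", "
separator and two-decimal alpha formatting are assumptions made only
for illustration.

```python
def serialize_color(r: int, g: int, b: int, a: int) -> str:
    """Serialize 8-bit RGBA components following the three cases."""
    if r == g == b == a == 0:
        return "transparent"
    if a == 255:
        return f"rgb({r}, {g}, {b})"
    # Assumption: alpha is written as a 0-1 fraction with two decimals.
    return f"rgba({r}, {g}, {b}, {a / 255:.2f})"


print(serialize_color(0, 0, 0, 0))      # prints: transparent
print(serialize_color(255, 0, 0, 255))  # prints: rgb(255, 0, 0)
print(serialize_color(0, 0, 0, 128))    # prints: rgba(0, 0, 0, 0.50)
```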


Also, the title of the following section is misspelled
("Declaraton" needs an i).

-David

-- 
L. David Baron                                 http://dbaron.org/
Mozilla Corporation                       http://www.mozilla.com/
Received on Wednesday, 17 February 2010 18:21:45 GMT
