Re: [css2.1] [css3-fonts] <family-name> ambiguous - partially quoted names allowed?

A recapitulation of the thread from last year:

Zack asked what tokens exactly were allowed in 'font-family'. I replied
that I thought the intent was to allow either a string or a sequence of
identifiers. Fantasai wondered if that wasn't too limited, seeing that
the existing wording in the spec seemed to assume more.

The thread was interrupted there, but I think she was right. CSS1 didn't
restrict the tokens at all, except for commas and white space. And the
1998 version of CSS2 seems to only contain text that attempts to explain
the syntax, rather than restrict it.

Originally, the design goal was to avoid quoting as much as possible, so
that a font name inside a STYLE attribute would not need a second,
nested pair of quotes. I.e.,

    <P STYLE="font-family: Univers 55">

rather than

    <P STYLE="font-family: 'Univers 55'">

But that was never spelled out.

Names consisting of either a single string or a sequence of identifiers
work in the sample of browsers and other programs I tested. But there
doesn't seem to be much consistency in the handling of other tokens.

Nevertheless, given that there are a number of fonts with names that
include a number (Univers 55, 1942 Report, etc.), I'd like to allow at
least those. (We unfortunately can't allow something like "Font No. 1"
without quoting, because the dot would get detached from the "No.")
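
To illustrate how the dot gets detached, here is a minimal sketch of the
tokenization. It assumes a much simplified version of the CSS 2.1 token
rules (ASCII only, no escapes, no strings, no DIMENSION or other compound
tokens); the name tokenize() is made up for this example.

```python
import re

# Simplified token patterns, tried in this order at each position.
TOKEN = re.compile(
    r"(?P<S>\s+)"
    r"|(?P<IDENT>-?[A-Za-z_][A-Za-z0-9_-]*)"
    r"|(?P<NUMBER>[0-9]*\.[0-9]+|[0-9]+)"
    r"|(?P<DELIM>.)"
)

def tokenize(text):
    """Return (type, text) pairs for each token, skipping white space."""
    return [(m.lastgroup, m.group())
            for m in TOKEN.finditer(text)
            if m.lastgroup != "S"]

print(tokenize("Univers 55"))
# [('IDENT', 'Univers'), ('NUMBER', '55')]
print(tokenize("Font No. 1"))
# [('IDENT', 'Font'), ('IDENT', 'No'), ('DELIM', '.'), ('NUMBER', '1')]
```

In the second case the "." ends up as a separate delimiter token, so the
name is no longer just identifiers and numbers separated by white space.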

Zack's original example of 'font-family: "lucida" grande' is better
excluded to avoid mistakes. Opera accepts it and interprets it (I
believe) as "\"lucida\" grande", which makes some sense but is also
confusing. I don't care much about other tokens (at-keywords,
parentheses...), as I haven't seen fonts with names like that.

So:

On 24/3/10 4:12, fantasai wrote:
> On 04/22/2009 06:43 AM, Bert Bos wrote:
>> On Tuesday 21 April 2009, Zack Weinberg wrote:
>>> Mozilla has a bug report[1] requesting that this:
>>>
>>>    font-family: "lucida" grande;
>>>
>>> be treated as equivalent to this:
>>>
>>>    font-family: lucida grande;
>>>
>>> There is no formal grammar for <family-name> and the prose does not
>>> say whether some-but-not-all tokens of a family-name can be quoted.
>>> Matter of fact, it doesn't really explain what the grammar is at all
>>> - it just says that certain punctuation characters must be \-escaped
>>> if they appear unquoted in a family-name.
>>>
>>> My preferred reading of the spec would disallow partial quotation,
>>> but what I really care about as an implementor is that there be an
>>> unambiguous, ideally formal, grammar for every nonterminal.
>>
>> Partially quoted names were not among the original use cases. Maybe it's
>> possible to read the spec as syntactically allowing them, but it
>> certainly doesn't define what they mean.
>>
>> The intent can, I think, be captured by this annotated grammar:
>>
>>    Value: [ [ <family-name> | <generic-family> ] [ , <family-name> |
>>      <generic-family> ]* ] | inherit;
>>    <generic-family>: serif | sans-serif | cursive | fantasy | monospace;
>>    <family-name>: STRING | IDENT+;  /* see restriction below */
>>
>> where the restriction is that <family-name> cannot be one of the single
>> IDENTs serif, sans-serif, cursive, fantasy, monospace, inherit, default
>> or initial.
> 
> Proposed changes:
> 
> Replace the paragraphs
>   # If an unquoted font family name ... converted to a single space.
> with the following text:
> 
>   | Font family names must either be given quoted as strings_, or unquoted as
>   | a sequence of one or more identifiers_. This means most punctuation
>   | characters and digits at the start of each token must be escaped in
>   | unquoted font family names.
>   |
>   | For example, the following declarations are invalid:
>   |
>   |   font-family: Red/Black, sans-serif;
>   |   font-family: "Lucida" Grande, sans-serif;
>   |   font-family: Ahem!, sans-serif;
>   |   font-family: test@foo, sans-serif;
>   |   font-family: #POUND, sans-serif;
>   |   font-family: Hawaii 5-0, sans-serif;
>   |
>   | If a sequence of identifiers is given as a font family name, the
>   | computed value is the name converted to a string by joining all the
>   | identifiers in the sequence by single spaces.
>   |
>   | To avoid mistakes in escaping, it is recommended to quote font family
>   | names that contain white space, digits, or punctuation characters
>   | other than hyphens:
>   |
>   | body { font-family: "New Century Schoolbook", serif }
>   | <BODY STYLE="font-family: '21st Century', fantasy">
>   |
>   | .. `strings`: http://www.w3.org/TR/CSS21/syndata.html#strings
>   | .. `identifier characters`: http://www.w3.org/TR/CSS21/syndata.html#characters

I'd like to replace "identifiers" by "identifiers and/or numbers" (three
times) and remove "digits" (once), if possible. Otherwise this new text
seems correct.
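
As a rough sketch of the difference between the two rules, assuming a
simplified tokenizer (ASCII only, no escapes, no DIMENSION or HASH
tokens) and ignoring the restriction on reserved keywords; the names
family_name() and allow_numbers are made up for this example:

```python
import re

# Simplified token patterns, tried in this order at each position.
TOKEN = re.compile(
    r"(?P<S>\s+)"
    r"""|(?P<STRING>"[^"\n]*"|'[^'\n]*')"""
    r"|(?P<IDENT>-?[A-Za-z_][A-Za-z0-9_-]*)"
    r"|(?P<NUMBER>[0-9]*\.[0-9]+|[0-9]+)"
    r"|(?P<DELIM>.)"
)

def family_name(text, allow_numbers=False):
    """Return the family name as a string, or None if the value is invalid.

    A valid name is either a single STRING, or a sequence of one or more
    IDENT tokens (and NUMBER tokens too, if allow_numbers is true).
    """
    tokens = [(m.lastgroup, m.group())
              for m in TOKEN.finditer(text)
              if m.lastgroup != "S"]
    kinds = [kind for kind, _ in tokens]
    if kinds == ["STRING"]:
        return tokens[0][1][1:-1]  # strip the surrounding quotes
    allowed = {"IDENT", "NUMBER"} if allow_numbers else {"IDENT"}
    if tokens and all(kind in allowed for kind in kinds):
        # Join the tokens by single spaces, as in the proposed text.
        return " ".join(tok for _, tok in tokens)
    return None

print(family_name("Lucida   Grande"))                 # Lucida Grande
print(family_name('"Lucida" Grande'))                 # None
print(family_name("Univers 55"))                      # None
print(family_name("Univers 55", allow_numbers=True))  # Univers 55
```

With allow_numbers set, "Univers 55" and "1942 Report" are accepted
unquoted, while partially quoted names and names with other punctuation
are still rejected.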

> 
> (I would also suggest tucking the whole thing under the <family-name>
> type definition.)
> 
> Note that this changes the prose to be more in line with the Appendix G
> grammar, and thus introduces additional constraints that were not in the
> prose before. Most of them are already honored by a majority of the UAs,
> though. (Arron and I checked.)
> 
>> (The spec is not worded very well with respect to 'default'
>> and 'initial', implying that it is somehow obvious that they are
>> reserved, although it is only obvious if you've read CSS3 Values And
>> Units...)
> 
> (This should have been addressed in the last publication.)
> 
> ~fantasai



Bert
-- 
  Bert Bos                                ( W 3 C ) http://www.w3.org/
  http://www.w3.org/people/bos                               W3C/ERCIM
  bert@w3.org                             2004 Rt des Lucioles / BP 93
  +33 (0)4 92 38 76 92            06902 Sophia Antipolis Cedex, France

Received on Wednesday, 24 March 2010 18:08:31 UTC