Re: [css-syntax] Ready for wide review, FPWD request coming soon

On Fri, May 17, 2013 at 11:12 AM, Zack Weinberg <zackw@panix.com> wrote:
> On 2013-05-09 6:37 PM, Tab Atkins Jr. wrote:
>>
>> Hey all!  I believe that Syntax is now at a state where it is both
>> stable and correct, and ready to review.
>
>
> I plan to go through this draft line-by-line on Sunday, when I have a long
> plane flight.  Here are some high-level points for now.

Excellent, thanks!

> * Regarding recursive-descent-style tokenization and removal of
>   pushback, you were skeptical that this would be easier to read.
>   Would you be interested in me attempting to rewrite section 4 with
>   those changes, to see how it goes?  It would be pretty major so I
>   don't want to do it if you're not at least curious whether it would
>   be better.

A small section would suffice.  Could you just try rewriting the
number/percentage/dimension parsing?  That's probably the most complex
set of interlocking states.

> * 3.2.1 "Preprocessing the input stream": I still think that *either*
>   FORM FEED should be converted to LINE FEED here, *or* all four
>   possible newline sequences (LF, CR, CR LF, and FF) should be listed
>   as such in 4.3 (tokenizer definitions) and the only preprocessing
>   step should be to convert U+0000 to U+FFFD.
>
>   Last time around I was advocating for the latter option, but I've
>   thought about it some more and I think it will be clearer to do the
>   former.

I don't have a problem with the former.  Done.
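
For illustration, here is a minimal sketch of preprocessing under the
former option: CR LF, CR, and FF all become LF, and U+0000 becomes
U+FFFD.  The function name `preprocess` is hypothetical, not taken
from the draft, and the sketch assumes the input has already been
decoded to a string.

```python
def preprocess(text: str) -> str:
    """Normalize newlines and NUL before tokenizing (illustrative only)."""
    # Handle CR LF first so the pair collapses to one LF, not two.
    text = text.replace("\r\n", "\n")
    # Any remaining lone CR, and FORM FEED, also become LINE FEED.
    text = text.replace("\r", "\n")
    text = text.replace("\f", "\n")
    # U+0000 NULL becomes U+FFFD REPLACEMENT CHARACTER.
    return text.replace("\x00", "\ufffd")
```

With this in place the tokenizer only ever has to recognize LF as a
newline.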

> * 4.2 "transform function whitespace flag": I reiterate that this wart
>   does not belong in this spec *at all*; it belongs in the value grammar
>   for the SVG transform attribute.  You said back in February that
>   this would be "annoying to express at that end", but this I
>   disbelieve.  It should be as simple as
>
>     transform-atom: transform-function transform-args ')'  ;
>     transform-function: FUNCTION | IDENT '(' ;
>
>   in yacc-ese.  (Which spec exactly would have to change if this wart
>   were removed?  It would help me understand your position if I had
>   read that.)

Yeah, I guess you're right.  Removed.

> Other, mostly minor issues:
>
> * throughout: I support decapitalizing all the token names, but I
>   think we need *some* sort of typographical convention to distinguish
>   them from running text, particularly as some of the names are
>   punctuation and others aren't.  I would suggest sans-serif, but the
>   bulk of the document is already sans-serif so that won't work.
>   Italics and boldface are already in use for other purposes.  Maybe
>   quoting with [LEFT, RIGHT]-POINTING ANGLE BRACKET?  (e.g. 〈ident〉)

Hm, interesting.  I support using some sort of delimiter, and those
angle brackets look pretty good.  This'll be a good bit of work to do.
 ^_^

> * throughout: Unlike other Unicode character names, U+FFFD REPLACEMENT
>   CHARACTER should *not* be followed by the literal character in
>   parens (�).

Why?

> * 3. "Conformance checkers are not required to recover from parse
>   errors": ... but if they do, they should be required to obey the
>   same recovery rules as user agents.

Probably implicit, but I've gone ahead and specified it.

> * 4. Didn't we resolve to remove the "id"/"unrestricted" distinction
>   for HASH tokens from Selectors, and thus also from here?  (I could
>   be misremembering.)

Not yet!  I'm supposed to be doing a google-data search to find hashes
in selectors that aren't valid idents.  As long as I don't find
anything damning, though, we are dropping the distinction.

> * 4. Unicode-range tokens may need a "valid" flag.  I need to
>   cross-check the code in Gecko against the algorithm in this spec
>   carefully, but the definition of UNICODE-RANGE in CSS2.1 included
>   several forms that were semantically invalid.

The parser in Syntax ended up only accepting valid unicode ranges
(except that it does, technically, allow for ranges where the min is
higher than the max).  This is more restrictive than CSS 2.1, but it
only fails to cover things that were invalid in the first place.
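
To make the min/max caveat concrete, here is a rough sketch of a
downstream check.  It is not the algorithm in the draft: the names
`parse_range` and `is_reversed` are made up, and it assumes only the
plain "U+hex" and "U+hex-hex" forms with up to six hex digits
(wildcard forms are left out).

```python
import re

# Assumed simplified shape: U+ followed by 1-6 hex digits, optionally
# followed by a hyphen and another 1-6 hex digits.
_RANGE = re.compile(r"^[Uu]\+([0-9A-Fa-f]{1,6})(?:-([0-9A-Fa-f]{1,6}))?$")

def parse_range(text):
    """Return (start, end) as integers, or None if the text doesn't match."""
    m = _RANGE.match(text)
    if m is None:
        return None
    start = int(m.group(1), 16)
    # A single code point is a range whose end equals its start.
    end = int(m.group(2), 16) if m.group(2) else start
    return start, end

def is_reversed(rng):
    """True when the range is syntactically fine but min is above max."""
    start, end = rng
    return start > end
```

So "U+7F-0" parses, and it is up to the consumer to notice that the
range is reversed.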

> * 4.6 "changes from CSS 2.1":  Should mention the column token.  I
>   still think we need to hear from dbaron about the change to bad-uri.

Added, thanks.

> * 5 "declaration": Is now maybe the right time to generalize
>   !important?  (presumably to ! <list of component values>)

I'm comfortable with either doing it now, or waiting until we actually
add a second thing.  I'll log an issue for now.  (Your suggestion for
syntax works better than what I was thinking of, by the way.)

> * 5 "recognized at-rule names": Didn't we agree to get rid of this?

Oh man, this is totally unnecessary now.  Killed with prejudice.

> * 6 "an+b microsyntax": Yay for specifying this in terms of normal
>   tokens, but I'm going to have to try to implement it in Gecko before
>   I can tell you if it's workable.

Cool.  Note the one minor change from previous specifications: "+ n"
(and the variant with a B specified) are now allowed, with the space
between the + and the n, just because they're separate tokens and
whitespace is allowed between tokens.  This should affect
approximately no one.

~TJ

Received on Friday, 17 May 2013 20:17:08 UTC