Re: grammars and parsing for regular and presentation attributes

On Jan 18, 2013, at 5:08 PM, Cameron McCormack <cam@mcc.id.au> wrote:

> We have already resolved (and it's also in our list of requirements for 
> SVG 2) to make the spec support presentation attribute parsing case 
> insensitively, e.g. fill="Red" should work just like style="fill:Red" would.
> 
> I couldn't see an issue in the tracker about allowing all CSS syntax in 
> presentation attributes, e.g. fill="/**/red", although I'm sure we've 
> brought it up before.  IE and Chrome both support this, while Opera and 
> Firefox do not.  (Chrome also supports fill="re\64", while IE does not.) 
>  I think it's a natural progression to parse these attributes entirely 
> with the CSS parser.  What are people's current thoughts on this?
> 
> I am also wondering what to do about the parsing of non-presentation 
> attributes.  I find it strange that for example the definition of both a 
> <rect> element's x="" attribute and say the stroke-width property refer 
> to the <length> type, given that we currently have non-presentation 
> attribute parsing defined using EBNF grammars while properties use the 
> CSS grammar syntax.  I want to remove the definitions of the data types 
> like <length> in types.html and for properties at least, reference 
> css3-values, but the question of what non-presentation attributes should 
> refer to remains.
> 
> I don't think we want to duplicate the CSS <length> syntax (which 
> includes calc() expressions now) in EBNF.  Maybe we can still utilise 
> the CSS parser and invoke it with some flags that disable escapes and 
> comments?
> 
> Should <rect x=" 10"> be valid by the way?

First, WebKit and, IIRC, Firefox as well already use their CSS parser for presentation attributes. That is why you see more restrictions on the "x" attribute than on "font-size": the first is a regular SVG attribute, the second is a presentation attribute. However, CSS syntax can also be a lot more restrictive than SVG syntax, as seen in the grammars for polygons, paths and transforms. For paths and polygons we have not even clarified whether we want to support unit types. I would be really careful about generalizing across all attributes.

> 
> And have we decided to allow calc() expressions in lengths in 
> non-presentation attributes, like <rect x="calc(10px + 20%)"> (even not 
> considering the general plan to convert attributes like this to properties)?

I am not sure whether we have decided to support calc() as a function yet. We definitely don't support "!important", for example. Can we concentrate on moving attributes to presentation attributes first?

> 
> 
> I was wondering if we could eliminate the EBNF in the spec entirely and 
> rely only on CSS grammar syntax, but that's probably not feasible, at 
> least for complicated attributes like <path d="">.  I think what we need 
> is a defined way for EBNF grammars to refer to CSS grammar 
> non-terminals.  (We could visually differentiate EBNF and CSS grammar 
> non-terminals to try to avoid confusion in how they're parsed.)

Wouldn't that end up as exactly the same separation between CSS and SVG syntax? Note that I already had to do this for the "transform" attribute and co. in CSS Transforms.

> 
> Let's take <text x=""> as an example.  That's a 
> white-space-or-comma-separated list of <length> values.  If we use EBNF 
> for the attribute as a whole, we could write:
> 
>   list-of-lengths ::= <length> | <length> comma-wsp list-of-lengths
>   comma-wsp ::= (wsp+ ","? wsp*) | ("," wsp*)
>   wsp ::= (#x20 | #x9 | #xD | #xA)
> 
> So angle-bracketed non-terminals reference CSS grammar symbols, and bare 
> identifiers refer to EBNF non-terminals rather than being literals as in 
> CSS grammar syntax.

Using angle-bracketed value types seems like a reasonable way to keep the syntax compact.
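
To make the quoted grammar concrete, here is a rough Python sketch of how list-of-lengths and comma-wsp could be matched. The LENGTH pattern is only my stand-in for the <length> non-terminal (a number plus an optional SVG 1.1 unit); it is an assumption for illustration and covers none of calc(), escapes or comments. The function name is hypothetical as well.

```python
import re

# ASSUMPTION: simplified stand-in for the CSS <length> non-terminal.
# A number with an optional SVG 1.1 unit; no calc(), escapes or comments.
LENGTH = re.compile(r"[+-]?(?:\d+\.?\d*|\.\d+)(?:em|ex|px|in|cm|mm|pt|pc|%)?")

# comma-wsp ::= (wsp+ ","? wsp*) | ("," wsp*)
# wsp       ::= (#x20 | #x9 | #xD | #xA)
COMMA_WSP = re.compile(r"(?:[ \t\r\n]+,?[ \t\r\n]*|,[ \t\r\n]*)")


def parse_list_of_lengths(value):
    """Return the length tokens in value, or None if it does not match
    list-of-lengths ::= <length> | <length> comma-wsp list-of-lengths."""
    lengths = []
    pos = 0
    while True:
        token = LENGTH.match(value, pos)
        if token is None:
            return None          # a <length> is required at this position
        lengths.append(token.group())
        pos = token.end()
        if pos == len(value):
            return lengths       # input consumed right after a <length>
        sep = COMMA_WSP.match(value, pos)
        if sep is None:
            return None          # anything else must be a separator
        pos = sep.end()
```

Note that with the grammar exactly as written, neither leading nor trailing white space is accepted, so x=" 10" would be rejected by this sketch. Whether that is what we want is the open question from the original mail.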

> 
> (We should also consider aligning our set of white space characters with 
> those in CSS, where CR isn't supported, remembering that XML will 
> normalize CRs to LFs when parsing.)

What kind of document will a standalone SVG document be? We talk about removing XML bindings, but don't clarify this part. If it is not XML anymore, what will it be? What kind of parser is supposed to parse an SVG document?

> 
> There might be some trickiness with white space parsing within the 
> <length>, as I think CSS parsing will normally consume any trailing 
> white space, which we might not want to do here.
> 
> What do people think (and Tab in particular, since he's the css3-syntax 
> editor)?
> 

Note that I am open to discussing the use of CSS syntax for presentation attributes, but in a less restrictive mode. Maybe the "quirks mode" started by the WHATWG?

Greetings,
Dirk

Received on Saturday, 19 January 2013 02:12:35 UTC