Re: SVGMobile12/types.html#DataTypeListOfString issue

Hi Peter.

Thanks for the bug report.

Peter Moulder:
> My reading of the textual description is that strings must be
> separated by one or more whitespace characters, whereas the EBNF
> appears to allow zero whitespace characters between strings.
> 
> 
> I must admit that technically, the EBNF is already correct, insofar
> as the following possible definitions of list-of-string all match the
> same inputs:
> 
>    string | string wspchar* list-of-string
>    string | string wspchar+ list-of-string
>    string
> 
> This is because string is defined simply as a sequence of zero or
> more char (without quotation), where char can be almost any character
> including all wspchar characters.
> 
> 
> However, if the EBNF is to be used not just for validation (specifying
> the set of legal inputs) but for separating into component strings,
> then it is necessary to change the definition to one of the following
> (according to whether empty strings are allowed):
> 
>   spaceless-string | spaceless-string wspchar list-of-string
>   nonempty-spaceless-string | nonempty-spaceless-string wspchar+ list-of-string
> 
> (The first allows empty strings but requires exactly one separating
> whitespace character, while the second allows multiple whitespace
> characters between strings but requires non-empty strings. Neither of
> the two allows any strings that contain wspchar.)

Indeed, we would like to use the grammar for separating the list into
its constituent items.
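
To make the difference between the two proposed definitions concrete,
here is a rough Python sketch of how each would split its input.  The
function names are hypothetical, and it assumes that wspchar is one of
space, tab, CR or LF and that any other character may appear in a
string; it is an illustration only, not spec text.

```python
import re

# Assumption: wspchar is one of space, tab, CR, LF.
WSPCHAR = r"[ \t\r\n]"
NON_WSPCHAR = r"[^ \t\r\n]"


def split_single_separator(s):
    """spaceless-string | spaceless-string wspchar list-of-string

    Exactly one wspchar separates items, so empty items are possible.
    Every input matches this grammar.
    """
    return re.split(WSPCHAR, s)


def split_nonempty(s):
    """nonempty-spaceless-string
       | nonempty-spaceless-string wspchar+ list-of-string

    One or more wspchar separate items and every item is non-empty.
    Returns None when the input does not match the grammar, e.g. for
    the empty string or for leading/trailing whitespace.
    """
    pattern = NON_WSPCHAR + "+(?:" + WSPCHAR + "+" + NON_WSPCHAR + "+)*"
    if re.fullmatch(pattern, s) is None:
        return None
    return re.split(WSPCHAR + "+", s)
```

With two spaces between "a" and "b", the first function yields three
items (the middle one empty) and the second yields two, which is the
distinction the parenthetical note above describes.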

> I've used the name ‘wspchar’ in this message for clarity because
> the existing definition of wsp in the definition of <list-of-strings>
> refers to the sequence of whitespace characters rather than to a
> single whitespace character. It would be nice if wsp were defined
> the same throughout the SVGMobile12 spec. In the definition of
> <list-of-Ts> and in paths.html, wsp is defined as a single whitespace
> character; in shapes.html it is defined as wspchar+, though the rest
> of that page uses it as if it were a single character, e.g. writing
> ‘wsp+’. I suggest that wsp be defined as a single character in
> each case.

Yes, I think it would be best if the use of grammar symbols were
consistent across the specification.

> The definition of <list-of-content-types> has much the same issues
> as <list-of-strings>: The existing definition allows zero whitespace
> characters between list elements, whereas in general [and typically]
> at least one is necessary to know where one begins and another ends.
> (Content-type is already required to be non-empty (it must contain
> "/"), so it's fine to allow more than one separator character between
> list elements.) <content-type> parameter values can be quoted strings
> that contain spaces. The fact that they're quoted means that the
> parsing is unambiguous, though it's worthwhile pointing out this
> possibility so that programmers know that it's not in general safe to
> blindly skip to the next wsp character. The spec is ambiguous as to
> whether or not the tokens making up a <content-type> can themselves
> be separated by whitespace (or even RFC822-style comments); I believe
> parsing would still be unambiguous/deterministic even if whitespace is
> used between tokens, but it may be valuable to say something about the
> issue in the description of <list-of-content-types>.

I’ll make sure to mention this (perhaps by including the grammar from
the relevant RFC that defines the syntax for Internet media types) in
the definition of <list-of-content-types>.
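
As an illustration of why blindly skipping to the next wsp character is
unsafe here, a minimal Python sketch of a splitter that respects quoted
parameter values follows.  The function name is hypothetical.  It
assumes that the tokens of a single <content-type> are not themselves
separated by whitespace (the point the spec leaves open), that
backslash escapes a character inside a quoted string, and that
RFC822-style comments do not occur.

```python
def split_content_types(s):
    """Split a whitespace-separated list of content types.

    Whitespace inside a double-quoted parameter value does not end the
    current item.  Assumes no whitespace between the tokens of one
    content type and no RFC822-style comments.
    """
    items = []
    current = []
    in_quotes = False
    escaped = False
    for ch in s:
        if in_quotes:
            current.append(ch)
            if escaped:
                escaped = False       # this character was escaped
            elif ch == "\\":
                escaped = True        # next character is taken literally
            elif ch == '"':
                in_quotes = False     # closing quote
        elif ch in " \t\r\n":
            if current:               # one or more separators end an item
                items.append("".join(current))
                current = []
        else:
            if ch == '"':
                in_quotes = True      # opening quote
            current.append(ch)
    if current:
        items.append("".join(current))
    return items
```

Splitting on every whitespace character instead would cut a value such
as charset="a b" in two, which is the hazard described above.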


I already have an action,
http://www.w3.org/Graphics/SVG/WG/track/actions/2391, to ensure that
the list syntaxes of SVG 1.1 and SVG Tiny 1.2 are compatible (at the
moment they appear not to be) and to verify them against actual
implementations.  As part of that I’ll address your comments above.

Thanks,

Cameron

-- 
Cameron McCormack ≝ http://mcc.id.au/

Received on Monday, 20 April 2009 09:57:17 UTC