- From: Peter Moulder <Peter.Moulder@infotech.monash.edu.au>
- Date: Wed, 08 Apr 2009 15:42:39 +1000
- To: www-svg@w3.org
My reading of the textual description is that strings must be separated by one or more whitespace characters, whereas the EBNF appears to allow zero whitespace characters between strings.

I must admit that technically, the EBNF is already correct, insofar as the following possible definitions of list-of-string all match the same inputs:

    string | string wspchar* list-of-string

    string | string wspchar+ list-of-string

    string

This is because string is defined simply as a sequence of zero or more char (without quotation), where char can be almost any character including all wspchar characters.

However, if the EBNF is to be used not just for validation (specifying the set of legal inputs) but for separating into component strings, then it is necessary to change the definition to one of the following (according to whether empty strings are allowed):

    spaceless-string | spaceless-string wspchar list-of-string

    nonempty-spaceless-string | nonempty-spaceless-string wspchar+ list-of-string

(The first allows empty strings but requires exactly one separating whitespace character, while the second allows multiple whitespace characters between strings but requires non-empty strings. Neither of the two allows any strings that contain wspchar.)

I've used the name ‘wspchar’ in this message for clarity because the existing definition of wsp in the definition of <list-of-strings> refers to the sequence of whitespace characters rather than to a single whitespace character.

It would be nice if wsp were defined the same throughout the SVGMobile12 spec. In the definition of <list-of-Ts> and in paths.html, wsp is defined as a single whitespace character; in shapes.html it is defined as wspchar+, though the rest of that page uses it as if it were a single character, e.g. writing ‘wsp+’. I suggest that wsp be defined as a single character in each case.
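To make the difference between the two proposed definitions concrete, here is a rough Python sketch of how each one would split an attribute value. The function names are invented for this message, and the set of wspchar characters (space, tab, LF, CR) is an assumption taken from the single-character wsp used in the path grammar; nothing here is from the spec itself.

```python
import re

# Assumption: wspchar is one of space, tab, LF or CR.
WSPCHAR = " \t\n\r"


def split_allow_empty(value: str) -> list[str]:
    """spaceless-string | spaceless-string wspchar list-of-string

    Exactly one wspchar separates items, so two adjacent
    whitespace characters delimit an empty string.
    """
    return re.split("[" + WSPCHAR + "]", value)


def split_nonempty(value: str) -> list[str]:
    """nonempty-spaceless-string | nonempty-spaceless-string wspchar+ list-of-string

    Runs of whitespace act as a single separator; empty items
    (including those caused by leading or trailing whitespace,
    or by an empty value) are rejected.
    """
    parts = re.split("[" + WSPCHAR + "]+", value)
    if any(part == "" for part in parts):
        raise ValueError("empty string not allowed by this grammar")
    return parts


print(split_allow_empty("a  b"))   # the double space yields an empty item
print(split_nonempty("a \t b"))    # the whitespace run is one separator
```

Under the first definition "a  b" (two spaces) is three strings, the middle one empty; under the second it is two strings. That difference is exactly what the current EBNF leaves undetermined.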
The definition of <list-of-content-types> has much the same issues as <list-of-strings>:

The existing definition allows zero whitespace characters between list elements, whereas in general [and typically] at least one is necessary to know where one begins and another ends. (Content-type is already required to be non-empty (it must contain "/"), so it's fine to allow more than one separator character between list elements.)

<content-type> parameter values can be quoted strings that contain spaces. The fact that they're quoted means that the parsing is unambiguous, though it's worthwhile pointing out this possibility so that programmers know that it's not in general safe to blindly skip to the next wsp character.

The spec is ambiguous as to whether or not the tokens making up a <content-type> can themselves be separated by whitespace (or even RFC822-style comments); I believe parsing would still be unambiguous/deterministic even if whitespace is used between tokens, but it may be valuable to say something about the issue in the description of <list-of-content-types>.

pjrm.
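As an illustration of the quoted-string point above, a minimal Python sketch of splitting a <list-of-content-types> value follows. It assumes that list elements are separated by one or more whitespace characters, that whitespace inside a double-quoted parameter value does not separate elements, and that a backslash inside quotes escapes the next character. It also assumes no whitespace between the tokens of a single content type, which is precisely the case the spec leaves open. The function name is hypothetical.

```python
def split_content_types(value: str) -> list[str]:
    """Split on whitespace that lies outside double-quoted strings."""
    items: list[str] = []
    current: list[str] = []
    in_quotes = False
    escaped = False
    for ch in value:
        if escaped:
            # Character following a backslash inside quotes: keep verbatim.
            current.append(ch)
            escaped = False
        elif in_quotes and ch == "\\":
            current.append(ch)
            escaped = True
        elif ch == '"':
            current.append(ch)
            in_quotes = not in_quotes
        elif ch in " \t\n\r" and not in_quotes:
            # Unquoted whitespace ends the current element, if any.
            if current:
                items.append("".join(current))
                current = []
        else:
            current.append(ch)
    if in_quotes:
        raise ValueError("unterminated quoted string")
    if current:
        items.append("".join(current))
    return items


print(split_content_types('text/plain image/svg+xml;title="a b"'))
```

A naive split on whitespace would cut the second element in two at the space inside "a b", which is why blindly skipping to the next wsp character is unsafe.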
Received on Wednesday, 8 April 2009 05:43:19 UTC