
Re: grammar fix for path

From: Graham Klyne <GK@ninebynine.org>
Date: Tue, 06 Apr 2004 18:53:11 +0100
To: "Roy T. Fielding" <fielding@gbiv.com>
Cc: uri@w3.org

At 18:51 25/03/04 -0800, Roy T. Fielding wrote:

>On Tuesday, February 17, 2004, at 04:01  AM, Ray Merkert wrote:
>>I was just in the middle of looking at IRIs, when I noticed something 
>>strange. It seems the
>>URI 'http://w3c.org:80path1/path2' has become a valid URI, at least 
>>according to the
>>collected BNF grammar in draft-fielding-uri-rfc2396bis-04.txt.
>I have tried various ways of explaining it in the text and finally
>went back to multiple definitions of path, though I hope I've done
>a better job of disambiguating the different cases than I did for
>2396.  I would appreciate it if the grammar-driven parsing experts
>could have a look at
>    http://gbiv.com/protocols/uri/rev-2002/rfc2396bis.html (or .xml)
>and see if the new ABNF rules work (I've already tested them with
>the abnf.c tool).

While coding, I notice:

    path          = path-abempty    ; begins with "/" or is empty
                  / path-abs        ; begins with "/" but not "//"
                  / path-noscheme   ; begins with a non-colon segment
                  / path-rootless   ; begins with a segment
                  / path-empty      ; zero characters

is ambiguous, in that there are two productions yielding an empty path:
path-abempty and path-empty.

Suggest: drop path-empty.

Similarly, I think that path-abs is redundant in this production, since 
anything that matches path-abs will also match path-abempty.

And similarly, path-noscheme is subsumed by path-rootless.

So the resulting production would be:

    path          = path-abempty    ; begins with "/" or is empty
                  / path-rootless   ; begins with a segment
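
The redundancy claims above can be checked mechanically. What follows is a minimal Python sketch, not the parser used for the test run reported in this message. It assumes the segment and character-class rules as they were later published in RFC 3986, which may differ in detail from the draft under discussion. The names RULES, matching_rules and accepts are made up for illustration.

```python
import re

# Assumed character classes (RFC 3986 definitions; the draft may differ).
UNRESERVED = r"A-Za-z0-9\-._~"
SUB_DELIMS = r"!$&'()*+,;="
PCT = r"%[0-9A-Fa-f]{2}"
PCHAR = rf"(?:[{UNRESERVED}{SUB_DELIMS}:@]|{PCT})"
PCHAR_NC = rf"(?:[{UNRESERVED}{SUB_DELIMS}@]|{PCT})"  # pchar without ":"

SEGMENT = rf"{PCHAR}*"
SEGMENT_NZ = rf"{PCHAR}+"
SEGMENT_NZ_NC = rf"{PCHAR_NC}+"

# One regular expression per alternative of the five-way 'path' production.
RULES = {
    "path-abempty": rf"(?:/{SEGMENT})*",
    "path-abs": rf"/(?:{SEGMENT_NZ}(?:/{SEGMENT})*)?",
    "path-noscheme": rf"{SEGMENT_NZ_NC}(?:/{SEGMENT})*",
    "path-rootless": rf"{SEGMENT_NZ}(?:/{SEGMENT})*",
    "path-empty": r"",
}

FULL = tuple(RULES)                           # all five alternatives
REDUCED = ("path-abempty", "path-rootless")   # the two suggested above


def matching_rules(text):
    """Return the names of every alternative that matches the whole text."""
    return [name for name, rx in RULES.items() if re.fullmatch(rx, text)]


def accepts(alternatives, text):
    """True if at least one of the named alternatives matches the whole text."""
    return any(re.fullmatch(RULES[name], text) for name in alternatives)


if __name__ == "__main__":
    for sample in ["", "/", "/a/b", "//a", "a/b", "a:b/c", "a b"]:
        print(repr(sample), matching_rules(sample),
              accepts(FULL, sample) == accepts(REDUCED, sample))
```

Under these assumptions the empty string is matched by both path-abempty and path-empty, every path-abs match is also a path-abempty match, and every path-noscheme match is also a path-rootless match, so the two-alternative production accepts the same strings as the five-alternative one for the samples tried.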


In appendix A, collected syntax, the production for absolute-URI differs 
from that given in the body text.

Assuming that the body text production (using hier-part) is correct (which 
I think it is) then the 'path' production is itself not used anywhere, 
hence redundant.


With the new syntax, modified as above, I get the following 5 test cases 
differing from my previous version:

URITest> main
### Failure in: 0:Test URIrefs:75:0
expected: True
but got: False
### Failure in: 0:Test URIrefs:75:2
expected: True
but got: False
### Failure in: 0:Test URIrefs:76:0
expected: True
but got: False
### Failure in: 0:Test URIrefs:76:1
expected: True
but got: False
### Failure in: 1:Test URIrefs:1
expected: Just http://user:********@example.org:99aaa/bbb
but got: Nothing
Cases: 529  Tried: 529  Errors: 0  Failures: 5


I think all of these are the intended effect of these changes.
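
To illustrate why a URI such as the one in the last failure is now rejected, here is a rough Python sketch. It is an approximation built on stated assumptions, not the code that produced the output above: the hier-part alternatives and character classes are taken from RFC 3986, the host is simplified to a registered name (no IP literals), query and fragment are left out, and is_absolute_uri is a made-up name.

```python
import re

# Assumed character classes (RFC 3986 definitions; the draft may differ).
UNRESERVED = r"A-Za-z0-9\-._~"
SUB_DELIMS = r"!$&'()*+,;="
PCT = r"%[0-9A-Fa-f]{2}"
PCHAR = rf"(?:[{UNRESERVED}{SUB_DELIMS}:@]|{PCT})"

# Simplified authority: optional userinfo, reg-name host only, optional port.
USERINFO = rf"(?:[{UNRESERVED}{SUB_DELIMS}:]|{PCT})*"
REG_NAME = rf"(?:[{UNRESERVED}{SUB_DELIMS}]|{PCT})*"
PORT = r"[0-9]*"
AUTHORITY = rf"(?:{USERINFO}@)?{REG_NAME}(?::{PORT})?"

PATH_ABEMPTY = rf"(?:/{PCHAR}*)*"
PATH_ABS = rf"/(?:{PCHAR}+(?:/{PCHAR}*)*)?"
PATH_ROOTLESS = rf"{PCHAR}+(?:/{PCHAR}*)*"

# After an authority only path-abempty is allowed, so whatever follows the
# port must start with "/" or be empty. The last alternative is the empty path.
HIER_PART = rf"(?://{AUTHORITY}{PATH_ABEMPTY}|{PATH_ABS}|{PATH_ROOTLESS}|)"
SCHEME = r"[A-Za-z][A-Za-z0-9+\-.]*"
ABSOLUTE_URI = re.compile(rf"{SCHEME}:{HIER_PART}")


def is_absolute_uri(text):
    """True if the whole text matches the simplified absolute-URI pattern."""
    return ABSOLUTE_URI.fullmatch(text) is not None


if __name__ == "__main__":
    for uri in [
        "http://w3c.org:80path1/path2",
        "http://w3c.org:80/path1/path2",
        "http://user:********@example.org:99aaa/bbb",
    ]:
        print(uri, is_absolute_uri(uri))
```

With these assumptions, text that directly follows the port digits without a leading "/" has no alternative left to match, which is consistent with the failures listed above being intended.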


Graham Klyne
For email:
Received on Tuesday, 6 April 2004 15:58:47 UTC
