
Re: grammar fix for path

From: Graham Klyne <GK@ninebynine.org>
Date: Tue, 06 Apr 2004 18:53:11 +0100
Message-Id: <5.1.0.14.2.20040406185131.02e82d98@127.0.0.1>
To: "Roy T. Fielding" <fielding@gbiv.com>
Cc: uri@w3.org

At 18:51 25/03/04 -0800, Roy T. Fielding wrote:

>On Tuesday, February 17, 2004, at 04:01  AM, Ray Merkert wrote:
>>I was just in the middle of looking at IRIs, when I noticed something
>>strange. It seems the URI 'http://w3c.org:80path1/path2' has become a
>>valid URI, at least according to the collected BNF grammar in
>>draft-fielding-uri-rfc2396bis-04.txt.
>
>I have tried various ways of explaining it in the text and finally
>went back to multiple definitions of path, though I hope I've done
>a better job of disambiguating the different cases than I did for
>2396.  I would appreciate it if the grammar-driven parsing experts
>could have a look at
>
>    http://gbiv.com/protocols/uri/rev-2002/rfc2396bis.html (or .xml)
>
>and see if the new ABNF rules work (I've already tested them with
>the abnf.c tool).

While coding, I notice:

    path          = path-abempty    ; begins with "/" or is empty
                  / path-abs        ; begins with "/" but not "//"
                  / path-noscheme   ; begins with a non-colon segment
                  / path-rootless   ; begins with a segment
                  / path-empty      ; zero characters

is ambiguous, in that there are two productions yielding an empty path:
    path-abempty
and
    path-empty

Suggest:  drop path-empty.

Similarly, I think that path-abs is redundant in this production, since
anything that matches path-abs will also match path-abempty.

And similarly, path-noscheme is subsumed by path-rootless.

So the resulting production would be:

    path          = path-abempty    ; begins with "/" or is empty
                  / path-rootless   ; begins with a segment
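
The overlap claims above can be spot-checked mechanically. What follows is a minimal sketch, not part of the draft: it assumes the segment-level rules as they were later published in RFC 3986 (segment, segment-nz, segment-nz-nc, pchar), which may differ in detail from the draft under discussion, and the helper names (matches, rules_matching, subsumed) are made up for illustration. Testing a handful of sample strings is evidence for the subsumption, not a proof of it.

```python
import re

# Character classes, ASSUMED to follow the definitions later published in RFC 3986.
UNRESERVED = r"A-Za-z0-9\-._~"
SUB_DELIMS = r"!$&'()*+,;="
PCT = r"%[0-9A-Fa-f]{2}"
PCHAR = rf"(?:[{UNRESERVED}{SUB_DELIMS}:@]|{PCT})"      # pchar
PCHAR_NC = rf"(?:[{UNRESERVED}{SUB_DELIMS}@]|{PCT})"    # pchar without ":"
REST = rf"(?:/{PCHAR}*)*"                               # *( "/" segment )

# One regular expression per alternative of the 'path' production.
RULES = {
    "path-abempty": REST,
    "path-abs": rf"/(?:{PCHAR}+{REST})?",
    "path-noscheme": rf"{PCHAR_NC}+{REST}",
    "path-rootless": rf"{PCHAR}+{REST}",
    "path-empty": r"",
}

def matches(rule, text):
    """True if the whole of 'text' is derivable from the named rule."""
    return re.fullmatch(RULES[rule], text) is not None

def rules_matching(text):
    """Names of all alternatives that accept 'text', in production order."""
    return [rule for rule in RULES if matches(rule, text)]

# Hypothetical sample paths chosen to exercise each alternative.
SAMPLES = ["", "/", "/a", "/a/b", "//a", "a", "a/b", "a:b", "a:b/c", "a/b:c", "%41/b"]

def subsumed(narrow, wide, samples=SAMPLES):
    """True if every sample accepted by 'narrow' is also accepted by 'wide'."""
    return all(matches(wide, s) for s in samples if matches(narrow, s))

if __name__ == "__main__":
    for sample in SAMPLES:
        print(repr(sample), rules_matching(sample))
    print("path-abs within path-abempty:", subsumed("path-abs", "path-abempty"))
    print("path-noscheme within path-rootless:", subsumed("path-noscheme", "path-rootless"))
```

Any sample listed under more than one rule name is a string for which the five-alternative production is ambiguous; the empty string is the case noted above.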

...

In appendix A, collected syntax, the production for absolute-URI differs 
from that given in the body text.

Assuming that the body text production (using hier-part) is correct (which
I think it is), the 'path' production is itself not used anywhere, and is
hence redundant.

...

With the new syntax, modified as above, I get the following 5 test cases 
differing from my previous version:

[[
URITest> main
### Failure in: 0:Test URIrefs:75:0
test_isURIReference:http://foo.org:80Path/More
expected: True
but got: False
### Failure in: 0:Test URIrefs:75:2
test_isAbsoluteURI:http://foo.org:80Path/More
expected: True
but got: False
### Failure in: 0:Test URIrefs:76:0
test_isURIReference:::
expected: True
but got: False
### Failure in: 0:Test URIrefs:76:1
test_isRelativeURI:::
expected: True
but got: False
### Failure in: 1:Test URIrefs:1
testURIRefComponents:http://user:pass@example.org:99aaa/bbb
expected: Just http://user:********@example.org:99aaa/bbb
but got: Nothing
Cases: 529  Tried: 529  Errors: 0  Failures: 5

URITest>
]]

I think all of these are the intended effect of these changes.
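
For readers who want to reproduce the three rejected inputs without my test harness, here is a rough sketch. It is not the code that produced the output above. It assumes the hier-part and relative-part productions as later published in RFC 3986, simplifies host to reg-name only (no IP literals), and ignores query and fragment; the function names are invented for illustration.

```python
import re

# ASSUMED character classes, after the definitions later published in RFC 3986.
UNRESERVED = r"A-Za-z0-9\-._~"
SUB_DELIMS = r"!$&'()*+,;="
PCT = r"%[0-9A-Fa-f]{2}"
PCHAR = rf"(?:[{UNRESERVED}{SUB_DELIMS}:@]|{PCT})"
PCHAR_NC = rf"(?:[{UNRESERVED}{SUB_DELIMS}@]|{PCT})"

PATH_ABEMPTY = rf"(?:/{PCHAR}*)*"
PATH_ABS = rf"/(?:{PCHAR}+(?:/{PCHAR}*)*)?"
PATH_ROOTLESS = rf"{PCHAR}+(?:/{PCHAR}*)*"
PATH_NOSCHEME = rf"{PCHAR_NC}+(?:/{PCHAR}*)*"

# Simplified authority: [ userinfo "@" ] reg-name [ ":" *DIGIT ]
USERINFO = rf"(?:[{UNRESERVED}{SUB_DELIMS}:]|{PCT})*"
REG_NAME = rf"(?:[{UNRESERVED}{SUB_DELIMS}]|{PCT})*"
AUTHORITY = rf"(?:{USERINFO}@)?{REG_NAME}(?::[0-9]*)?"

SCHEME = r"[A-Za-z][A-Za-z0-9+.\-]*"
# The trailing empty alternative in each group stands for path-empty.
HIER_PART = rf"(?://{AUTHORITY}{PATH_ABEMPTY}|{PATH_ABS}|{PATH_ROOTLESS}|)"
RELATIVE_PART = rf"(?://{AUTHORITY}{PATH_ABEMPTY}|{PATH_ABS}|{PATH_NOSCHEME}|)"

def is_absolute_uri(text):
    """scheme ":" hier-part, with query and fragment left out of this sketch."""
    return re.fullmatch(rf"{SCHEME}:{HIER_PART}", text) is not None

def is_relative_ref(text):
    """relative-part only, with query and fragment left out of this sketch."""
    return re.fullmatch(RELATIVE_PART, text) is not None

def is_uri_reference(text):
    return is_absolute_uri(text) or is_relative_ref(text)

if __name__ == "__main__":
    for uri in ["http://foo.org:80Path/More",
                "http://foo.org:80/Path/More",
                "::",
                "http://user:pass@example.org:99aaa/bbb"]:
        print(uri, is_uri_reference(uri))
```

The point of the sketch is that once an authority is present, the remainder has to match path-abempty, so anything following the port digits must begin with "/".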

#g


------------
Graham Klyne
For email:
http://www.ninebynine.org/#Contact
Received on Tuesday, 6 April 2004 15:58:47 UTC
