
following syntax of old schemes

From: Jeremy Carroll <jjc@hpl.hp.com>
Date: Wed, 25 Jan 2006 13:10:10 +0000
Message-ID: <43D778B2.5090603@hpl.hp.com>
To: uri@w3.org


A (last?) comment on
http://tools.ietf.org/html?draft=draft-hansen-2717bis-2718bis-uri-guidelines-07.txt


I wonder whether this paragraph needs strengthening:
[[
All URI scheme
    specifications MUST define their own syntax such that all strings
    matching their scheme-specific syntax will also match the <absolute-
    URI> grammar described in Section 4.3 of RFC 3986.
]]

For example, to this:
[[
All URI scheme
    specifications MUST define their own syntax such that all strings
    matching their scheme-specific syntax will also match the <absolute-
    URI> grammar described in Section 4.3 of RFC 3986;
*** NEW TEXT below
and such that for all such strings, every subcomponent defined by the
scheme lies wholly within only one of the authority, path, query
or fragment components defined in RFC 3986.
]]

If so, there may be a desire to have consequential changes to
[[
New URI schemes SHOULD reuse the common URI components of RFC 3986
    for the definition of hierarchical naming schemes.  However, if there
    is a strong reason for a URI scheme to not use the hierarchical
    syntax, then the new scheme definition SHOULD follow the syntax of
    previously registered schemes.
]]

Two previously registered schemes that break my suggested text are file 
and ftp. The definitions of both are inconsistent with RFC 3986 in 
their use of "?" and ";". Consider:

ftp://example.org/foo?bar/ba?z;type=d

Parsing according to the definition of ftp in RFC 1738:

ftpurl = "ftp://" login [ "/" fpath [ ";type=" ftptype ]]
fpath = fsegment *[ "/" fsegment ]
fsegment = *[ uchar | "?" | ":" | "@" | "&" | "=" ]
ftptype = "A" | "I" | "D" | "a" | "i" | "d"


we get:

ftp://
   example.org
   / foo?bar / ba?z
   type=d


whereas parsing with RFC 3986 we get

ftp:
    // example.org
    / foo
    ? bar/ba?z;type=d

The different treatment of "?" can presumably result in interoperability 
failures (not that I have tried to find one).
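
The two parses above can be reproduced with a short sketch. The generic
split uses the regular expression given in Appendix B of RFC 3986. The
ftp pattern is a hand transcription of the RFC 1738 rules quoted above,
with the login part simplified to "anything up to the first slash"; that
simplification and the function names are assumptions made for
illustration, not part of either RFC.

```python
import re

# RFC 3986, Appendix B: regex for splitting a URI reference into components.
RFC3986 = re.compile(
    r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?"
)

# RFC 1738: uchar = unreserved | escape; fsegment additionally allows ? : @ & =
UCHAR = r"(?:[A-Za-z0-9$\-_.+!*'(),]|%[0-9A-Fa-f]{2})"
FSEGMENT = rf"(?:{UCHAR}|[?:@&=])*"

# Assumption: login is simplified to "one or more non-slash characters".
RFC1738_FTP = re.compile(
    rf"^ftp://(?P<login>[^/]+)"
    rf"(?:/(?P<fpath>{FSEGMENT}(?:/{FSEGMENT})*)"
    rf"(?:;type=(?P<ftptype>[AIDaid]))?)?$"
)


def split_rfc3986(uri):
    """Split a URI into the five generic components of RFC 3986."""
    m = RFC3986.match(uri)
    return {
        "scheme": m.group(2),
        "authority": m.group(4),
        "path": m.group(5),
        "query": m.group(7),
        "fragment": m.group(9),
    }


def split_rfc1738_ftp(uri):
    """Split an ftp URL per the RFC 1738 grammar; None if it does not match."""
    m = RFC1738_FTP.match(uri)
    if m is None:
        return None
    fpath = m.group("fpath")
    return {
        "login": m.group("login"),
        "fsegments": fpath.split("/") if fpath is not None else [],
        "ftptype": m.group("ftptype"),
    }


if __name__ == "__main__":
    uri = "ftp://example.org/foo?bar/ba?z;type=d"
    print(split_rfc1738_ftp(uri))
    print(split_rfc3986(uri))
```

Under the RFC 1738 reading the "?" characters stay inside the two path
segments, while under the RFC 3986 reading everything after the first
"?" becomes the query.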

It seems to me that one of the things RFC 3986 adds over RFC 2396 is 
that 'opaque' URIs are fitted into the same syntax, and that all URIs 
MUST follow the generic syntax (noting the path-rootless option for, 
say, mailto URIs and URNs). Hence my suggested text.
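
One way to make the suggested requirement checkable is sketched below,
assuming the scheme-specific parser reports each subcomponent as a
character range in the URI. The helper name containing_component and
the offset-based interface are hypothetical; only the regular
expression comes from RFC 3986 (Appendix B).

```python
import re

# RFC 3986, Appendix B: regex for splitting a URI reference into components.
RFC3986 = re.compile(
    r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?"
)

# Regex group numbers of the four components named in the suggested text.
GROUPS = {"authority": 4, "path": 5, "query": 7, "fragment": 9}


def containing_component(uri, start, end):
    """Name the RFC 3986 component that wholly contains uri[start:end].

    Returns None when the range straddles a component boundary, which is
    the situation the suggested text would forbid.
    """
    m = RFC3986.match(uri)
    for name, group in GROUPS.items():
        s, e = m.span(group)  # (-1, -1) when the component is absent
        if s != -1 and s <= start and end <= e:
            return name
    return None


if __name__ == "__main__":
    uri = "ftp://example.org/foo?bar/ba?z;type=d"
    for sub in ("example.org", "foo?bar"):
        start = uri.index(sub)
        print(sub, containing_component(uri, start, start + len(sub)))
```

For the ftp example, the RFC 1738 segment "foo?bar" straddles the
RFC 3986 path and query, so the check returns None for it.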

Jeremy
Received on Wednesday, 25 January 2006 13:23:44 UTC
