[Bug 11379] New: 2.6.1 - definition of hierarchical URL inconsistent with rfc 3986

http://www.w3.org/Bugs/Public/show_bug.cgi?id=11379

           Summary: 2.6.1 - definition of hierarchical URL inconsistent
                    with rfc 3986
           Product: HTML WG
           Version: unspecified
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: normal
          Priority: P2
         Component: HTML5 spec (editor: Ian Hickson)
        AssignedTo: ian@hixie.ch
        ReportedBy: glenn@skynav.com
         QAContact: public-html-bugzilla@w3.org
                CC: mike@w3.org, public-html-wg-issue-tracking@w3.org,
                    public-html@w3.org


Section 2.6.1 defines a hierarchical URL thus:

"An absolute URL is a hierarchical URL if, when resolved and then parsed, there
is a character immediately after the <scheme> component and it is a U+002F
SOLIDUS character (/)."

However, RFC3986 Section 3 defines all URIs as containing a hierarchical part
as follows:

URI         = scheme ":" hier-part [ "?" query ] [ "#" fragment ]

and, further, does not require the hierarchical part to start with "/". In
particular, it defines hier-part as:

hier-part   = "//" authority path-abempty
                  / path-absolute
                  / path-rootless
                  / path-empty

Expanding these components into their definitions gives:

hier-part
          = "//" authority
          | "//" authority 1*( "/" segment )
          | "/" [ segment-nz *( "/" segment ) ]
          | segment-nz *( "/" segment )
          | 0<pchar>

Note that the last two alternatives do not start with "/", yet are still
considered a "hierarchical" part by RFC3986. For example, the following URIs
match this syntax, with hier-part mapping to path-rootless:

about:blank
file:foo/bar
urn:example.net:foo:bar
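
To make the mismatch concrete, here is a rough Python sketch that compares the
two notions for a few URIs. The function names are hypothetical, and the HTML5
check assumes that "immediately after the <scheme> component" means the first
character following "scheme:". The RFC3986 side only classifies which hier-part
alternative applies; it does not validate the individual characters.

```python
import re

# Scheme per RFC 3986 Section 3.1: ALPHA *( ALPHA / DIGIT / "+" / "-" / "." )
_SCHEME = re.compile(r"^([A-Za-z][A-Za-z0-9+.-]*):")


def split_scheme(uri):
    """Return (scheme, remainder after the ":") or raise if no scheme is found."""
    m = _SCHEME.match(uri)
    if m is None:
        raise ValueError("not an absolute URI: %r" % uri)
    return m.group(1), uri[m.end():]


def html5_is_hierarchical(uri):
    """Assumed reading of 2.6.1: the first character after "scheme:" is "/"."""
    _, rest = split_scheme(uri)
    return rest.startswith("/")


def rfc3986_hier_part_kind(uri):
    """Name the hier-part alternative that applies (no character validation)."""
    _, rest = split_scheme(uri)
    # Drop the optional "?" query and "#" fragment to isolate hier-part.
    hier = re.split(r"[?#]", rest, maxsplit=1)[0]
    if hier.startswith("//"):
        return "authority"
    if hier.startswith("/"):
        return "path-absolute"
    if hier:
        return "path-rootless"
    return "path-empty"


if __name__ == "__main__":
    for uri in ("http://example.com/a", "about:blank", "file:foo/bar",
                "urn:example.net:foo:bar"):
        print(uri, html5_is_hierarchical(uri), rfc3986_hier_part_kind(uri))
```

Under these assumptions, the three URIs listed above all have a path-rootless
hier-part in RFC3986 terms, yet none of them is a "hierarchical URL" by the
2.6.1 definition.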

To avoid confusion, it may be desirable for HTML5 to use a term other than
"hierarchical URL" here. Alternatively, a note could be added that
distinguishes the defined usage from the like-named (but different) construct
in RFC3986.

I would also note that, in terms of the definitions found in 2.6.1, all
"authority-based URLs" are also "hierarchical URLs". I can't tell whether this
is intentional. If it is, then perhaps a note indicating this would be useful.

Regards,
Glenn

Received on Monday, 22 November 2010 18:40:59 UTC