- From: Sam Ruby <rubys@intertwingly.net>
- Date: Mon, 09 Aug 2004 08:54:33 -0400
- To: uri@w3.org
- CC: Atom WG <atom-syntax@imc.org>
Paul Hoffman / IMC wrote:
>
> Greetings again. In the discussion of PaceCanonicalIds, some questions
> were brought up about what draft-fielding-uri-rfc2396bis really says
> about canonicalization. Section 6 of that draft says a few different
> things. At the URI BOF at the IETF meeting last week, I volunteered the
> Atompub WG to be reviewers for that document. :-)
>
> So, all you canonicalization folks: please review the document,
> particularly section 6, and send comments to uri@w3.org (archived at
> <http://lists.w3.org/Archives/Public/uri/>). Just like on this list, if
> you see something you consider wrong, suggest new text. Your comments
> will be considered for the soon-to-happen IETF last call on the document.
Excerpts from section 3, "Syntax Components":
   foo://example.com:8042/over/there?name=ferret#nose
   \_/   \______________/\_________/ \_________/ \__/
    |           |            |            |        |
 scheme     authority       path        query   fragment
authority = [ userinfo "@" ] host [ ":" port ]
userinfo = *( unreserved / pct-encoded / sub-delims / ":" )
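As an illustration of the authority grammar above, here is a minimal sketch that splits an authority string into its three parts. The function name split_authority is hypothetical, and the sketch assumes a well-formed authority; it does no validation of the characters in each part. Optional parts that are absent come back as None, so that an empty userinfo can be told apart from a missing one.

```python
def split_authority(authority):
    """Split an authority into (userinfo, host, port).

    Hypothetical helper: assumes the input is well formed and
    returns None for an optional part that is absent.
    """
    userinfo = None
    rest = authority
    if "@" in rest:
        # Neither userinfo nor host may contain a literal "@",
        # so the first "@" is the delimiter.
        userinfo, _, rest = rest.partition("@")
    if rest.startswith("["):
        # IP-literal host such as "[::1]": colons inside the
        # brackets belong to the host, not to the port.
        end = rest.find("]")
        host, tail = rest[:end + 1], rest[end + 1:]
        port = tail[1:] if tail.startswith(":") else None
    else:
        host, sep, tail = rest.partition(":")
        port = tail if sep else None
    return userinfo, host, port


print(split_authority("example.com:8042"))
print(split_authority(":@example.com"))
```

Note that ":@example.com" yields a userinfo of ":" rather than None: the userinfo is present, it just names no user and no password.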
Excerpt from section 6.3 "Canonical Form":
# Always provide the URI scheme in lowercase characters.
# Always provide the host, if any, in lowercase characters.
# Only perform percent-encoding where it is essential.
# Always use uppercase A-through-F characters when percent-encoding.
# Prevent dot-segments appearing in non-relative URI paths.
# For schemes that define a default authority, use an empty authority
if the default is desired.
# For schemes that define an empty path to be equivalent to a path of
"/", use "/".
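The mechanical rules in this list can be sketched roughly as follows. This is only an approximation under stated assumptions, not the draft's algorithm: the names canonicalize, remove_dot_segments and EMPTY_PATH_IS_SLASH are hypothetical, the set of schemes that treat an empty path as "/" is my assumption, and two rules are left out entirely (removing non-essential percent-encoding and substituting an empty authority for a default one). The split uses the regular expression from Appendix B of RFC 2396, which keeps an absent component (None) distinct from an empty one ("").

```python
import re

# Component-splitting regular expression from RFC 2396, Appendix B.
URI_RE = re.compile(r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?")

# Assumption: schemes that define an empty path as equivalent to "/".
EMPTY_PATH_IS_SLASH = {"http", "https"}


def upper_pct(text):
    """Uppercase the hex digits of every percent-encoded octet."""
    return re.sub(r"%[0-9A-Fa-f]{2}", lambda m: m.group(0).upper(), text)


def remove_dot_segments(path):
    """Remove "." and ".." segments from an absolute path."""
    segments = path.split("/")
    out = []
    for i, seg in enumerate(segments):
        last = i == len(segments) - 1
        if seg == ".":
            if last:
                out.append("")      # keep the trailing slash
        elif seg == "..":
            if len(out) > 1:
                out.pop()           # never pop the leading empty segment
            if last:
                out.append("")
        else:
            out.append(seg)
    return "/".join(out)


def canonicalize(uri):
    m = URI_RE.match(uri)
    scheme, authority, path = m.group(2), m.group(4), m.group(5)
    query, fragment = m.group(7), m.group(9)
    if scheme is not None:
        scheme = scheme.lower()
    if authority is not None:
        # Lowercase the host only; userinfo is left case-sensitive.
        userinfo, at, hostport = authority.rpartition("@")
        authority = upper_pct(userinfo + at + hostport.lower())
    path = upper_pct(path)
    if path.startswith("/"):
        path = remove_dot_segments(path)
    if authority is not None and path == "" and scheme in EMPTY_PATH_IS_SLASH:
        path = "/"
    out = ""
    if scheme is not None:
        out += scheme + ":"
    if authority is not None:
        out += "//" + authority
    out += path
    if query is not None:
        out += "?" + upper_pct(query)
    if fragment is not None:
        out += "#" + upper_pct(fragment)
    return out


print(canonicalize("HTTP://Example.COM/a/./b/../c%7euser"))
```

A URI with an empty query or empty fragment passes through this sketch unchanged, because none of the listed rules says anything about those components.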
These rules completely cover scheme and path, and partially cover
authority. Here are some URIs for which I can't determine whether they
are in canonical form based solely on the rules listed in rfc2396-bis:
http://:@example.com/
http://example.com:80/
http://example.com/gateway.cgi?
http://www.w3.org/2000/01/rdf-schema#
My initial inclination would be to declare all of these as
non-canonical, but the last example is common enough in practice
that it should probably be an exception.
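To make the four cases concrete, here is a small sketch that names the feature in each URI that the listed rules leave undecided. The function name undecided_features and the DEFAULT_PORTS table are hypothetical; the draft's rules do not list default ports, so port 80 for http is an assumption supplied here. The port check is deliberately naive and does not handle IP-literal hosts.

```python
import re

# Component-splitting regular expression from RFC 2396, Appendix B.
URI_RE = re.compile(r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?")

# Assumption: default ports known to this sketch.
DEFAULT_PORTS = {"http": "80"}


def undecided_features(uri):
    """List the features of a URI that the canonical-form rules do not settle."""
    m = URI_RE.match(uri)
    scheme, authority = m.group(2), m.group(4)
    query, fragment = m.group(7), m.group(9)
    found = []
    if authority is not None:
        userinfo, at, hostport = authority.rpartition("@")
        if at and userinfo.strip(":") == "":
            found.append("userinfo with no user or password")
        host, colon, port = hostport.rpartition(":")
        if colon and port == DEFAULT_PORTS.get((scheme or "").lower()):
            found.append("explicit default port")
    if query == "":
        found.append("empty query")        # "?" present, nothing after it
    if fragment == "":
        found.append("empty fragment")     # "#" present, nothing after it
    return found


for uri in ("http://:@example.com/",
            "http://example.com:80/",
            "http://example.com/gateway.cgi?",
            "http://www.w3.org/2000/01/rdf-schema#"):
    print(uri, undecided_features(uri))
```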
- Sam Ruby
Received on Monday, 9 August 2004 12:54:33 UTC