Re: Guidelines on usage of // in new URI schemes

This was discussed on apps-discuss and the URI list a while back, so I have bcc'd those lists, but I want to focus the discussion on the list, so please only reply there.

In order to handle IDNs appropriately, I would like to make the rule that any scheme that allows non-ASCII or pct-encoded values in the "host" field in the generic syntax MUST allow or mandate that IRI -> URI processing follow IDNA rules. That is, no matter what the scheme, if you have

scheme://     as an IRI, and want to translate it to a URI, you MUST use IDNA to turn it into


no matter what the scheme. This is what you have to do for almost all URI schemes now anyway in order to function properly.
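
As a rough illustration of the rule above, here is a minimal Python sketch of scheme-independent host conversion. The scheme name "x-user", the host "bücher.example", and the helper iri_host_to_uri are all hypothetical stand-ins, not taken from any proposal. It also assumes Python's built-in "idna" codec (which implements IDNA 2003) is an acceptable approximation, and it leaves every component other than the host untouched.

```python
from urllib.parse import urlsplit, urlunsplit


def iri_host_to_uri(iri: str) -> str:
    """Convert only the host of scheme://host IRIs to its IDNA (ASCII) form.

    Hypothetical helper for illustration. Path, query and fragment are
    passed through unchanged, so this is not a full IRI -> URI mapping.
    """
    parts = urlsplit(iri)
    host = parts.hostname or ""
    # The stdlib "idna" codec applies IDNA 2003 ToASCII to each label.
    ascii_host = host.encode("idna").decode("ascii")

    # Rebuild the authority, keeping any userinfo and port that were present.
    netloc = ascii_host
    if parts.username is not None:
        userinfo = parts.username
        if parts.password is not None:
            userinfo += ":" + parts.password
        netloc = userinfo + "@" + netloc
    if parts.port is not None:
        netloc += ":" + str(parts.port)

    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))


# The scheme plays no role in how the host is converted.
print(iri_host_to_uri("x-user://alice@bücher.example"))
print(iri_host_to_uri("http://bücher.example/path"))
```

The point of the sketch is that the conversion never looks at the scheme: any IRI with an authority gets the same host treatment.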

This would change the guidelines on use of "//" for new schemes, but are there any URI schemes in use for which this would actually be a problem in practice?


From: [] On Behalf Of Timur Shemsedinov
Sent: Thursday, August 20, 2009 7:44 AM
To: Eran Hammer-Lahav
Cc: URI;
Subject: [Moderator Action] Re: Guidelines on usage of // in new URI schemes


See RFC 2718 - Guidelines for new URL Schemes

2.1.2 Improper use of "//" following "<scheme>:"

Contrary to some examples set in past years, the use of double
slashes as the first component of the <scheme-specific-part> of a URL
is not simply an artistic indicator that what follows is a URL:
Double slashes are used ONLY when the syntax of the URL's <scheme-
specific-part> contains a hierarchical structure as described in RFC
2396. In URLs from such schemes, the use of double slashes indicates
that what follows is the top hierarchical element for a naming
authority. (See section 3 of RFC 2396 for more details.) URL
schemes which do not contain a conformant hierarchical structure in
their <scheme-specific-part> should not use double slashes following
the "<scheme>:" string.

On Thu, Aug 20, 2009 at 8:48 AM, Eran Hammer-Lahav <<>> wrote:
I am in the process of proposing a new URI scheme to identify user accounts [1]. This is part of the WebFinger protocol [2] effort.

This email is *not* an invitation to debate the merits of this new URI scheme (just yet). I am sure we will have many lively discussions about it shortly but I would like to present a proposal before we have a public debate about it here.

The new scheme has two components, a local identifier (username, screenname, handle, etc.) and a host (which can resolve and authenticate the local identifier). When looking at the URI specification (RFC 3986) and at the new URI guidelines (BCP 35), it is hard to figure out what is an appropriate use of // in new schemes.

In this case, we have a requirement to keep the URI (the part after the scheme:) looking as close as possible to an RFC 822 identifier (username@host), and that leaves two options:


The 'username@host' part seems to fit perfectly into the URI authority as defined by RFC 3986. However, since the URI does not have a path, it does not really contain a hierarchical structure (just the top level host).

The benefit of using // in this case is that existing URI parsing code can be used unmodified to process the new URI. It is a simple profile which only allows the userinfo and host subcomponents of the authority component, and no other URI components. Since the new scheme will often be used with URI templates and other facilities commonly used with http: URIs, it is very convenient to have a common structure (even if it is only a subset). I don't see any downside to using // other than defying expectations established by the mailto: URI scheme.

The benefit of not using // is that it makes the URI follow the well-established pattern in mailto: and saves two bytes. The downside is that it requires spelling out how to break the URI path into subcomponents specific to this scheme.
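
To make the trade-off concrete, here is a small Python sketch using the standard library's generic URI parser. The scheme name "x-user" and the helper split_account are hypothetical placeholders, not part of the proposal. With "//" the generic parser already yields userinfo and host; without it, everything lands in the path and the scheme has to define its own split.

```python
from urllib.parse import urlsplit

# Hypothetical scheme name, used only to contrast the two forms.
with_slashes = urlsplit("x-user://alice@example.com")
without_slashes = urlsplit("x-user:alice@example.com")

# With "//": the generic parser fills in the authority subcomponents.
print(with_slashes.username, with_slashes.hostname)

# Without "//": there is no authority; the whole identifier is the path.
print(repr(without_slashes.netloc), without_slashes.path)


def split_account(path: str) -> tuple[str, str]:
    """Scheme-specific rule that would have to be written into the spec.

    Hypothetical: splits on the last "@" so the local part may contain "@".
    """
    local, _, host = path.rpartition("@")
    return local, host


print(split_account(without_slashes.path))
```

Either form carries the same information; the difference is whether the split is inherited from the generic syntax or defined by the scheme.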

So far the feedback I have received has focused on style, which is perfectly valid, but I want to make sure I am not missing anything. My preference is to reuse as much as possible and therefore include the //.

Any suggestions?



Received on Sunday, 11 October 2009 19:10:11 UTC