
Re: Guidelines on usage of // in new URI schemes

From: Larry Masinter <masinter@adobe.com>
Date: Sun, 11 Oct 2009 12:09:32 -0700
To: "PUBLIC-IRI@W3.ORG" <PUBLIC-IRI@w3.org>
Message-ID: <8B62A039C620904E92F1233570534C9B0118DC469C8A@nambx04.corp.adobe.com>
This was discussed on apps-discuss and the URI list a while back, so I have bcc'd those lists, but I want to focus the discussion on the public-iri@w3.org list, so please only reply there.

In order to handle IDNs appropriately, I would like to make the rule that any scheme that allows non-ASCII or pct-encoded values in the "host" field in the generic syntax MUST allow or mandate that IRI -> URI processing follow IDNA rules. That is, no matter what the scheme, if you have

scheme://nonascii.name/path/here

as an IRI, and want to translate it to a URI, you MUST use IDNA to turn it into

scheme://alabel.for.nonascii.name/ascii.for.path/ascii.for.here

no matter what the scheme. This is what you have to do for almost all URI schemes now anyway in order to function properly.
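As a rough illustration of the translation described above, here is a minimal Python sketch. The function name iri_to_uri and the example host are made up for illustration. It assumes Python's built-in "idna" codec (which implements IDNA 2003) is an acceptable stand-in for "IDNA rules", and it ignores any userinfo in the authority.

```python
from urllib.parse import quote, urlsplit, urlunsplit

# Characters left untouched in path/query/fragment; "%" is kept so that
# existing pct-encoded octets are not encoded a second time.
_SAFE = "/?%:@!$&'()*+,;="


def iri_to_uri(iri: str) -> str:
    """Hypothetical helper: map an IRI with a "//" authority to a URI."""
    parts = urlsplit(iri)
    host = parts.hostname or ""
    # Host: non-ASCII labels become A-labels (xn--...) via the idna codec.
    ascii_host = host.encode("idna").decode("ascii") if host else ""
    netloc = ascii_host
    if parts.port is not None:
        netloc = f"{netloc}:{parts.port}"
    # Everything after the authority: UTF-8, then pct-encode non-ASCII.
    path = quote(parts.path, safe=_SAFE)
    query = quote(parts.query, safe=_SAFE)
    fragment = quote(parts.fragment, safe=_SAFE)
    return urlunsplit((parts.scheme, netloc, path, query, fragment))


print(iri_to_uri("scheme://bücher.example/päth"))
```

The point of the sketch is that the host is handled differently from the rest of the IRI: it goes through IDNA rather than through UTF-8 pct-encoding, regardless of the scheme name.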

This would change the guidelines on use of "//" for new schemes, but are there any URI schemes in use for which this would actually be a problem in practice?

Larry
--
http://larry.masinter.net

From: apps-discuss-bounces@ietf.org [mailto:apps-discuss-bounces@ietf.org] On Behalf Of Timur Shemsedinov
Sent: Thursday, August 20, 2009 7:44 AM
To: Eran Hammer-Lahav
Cc: URI; apps-discuss@ietf.org
Subject: [Moderator Action] Re: Guidelines on usage of // in new URI schemes

Hello

See RFC 2718 - Guidelines for new URL Schemes
http://www.ietf.org/rfc/rfc2718.txt

2.1.2 Improper use of "//" following "<scheme>:"

Contrary to some examples set in past years, the use of double
slashes as the first component of the <scheme-specific-part> of a URL
is not simply an artistic indicator that what follows is a URL:
Double slashes are used ONLY when the syntax of the URL's <scheme-
specific-part> contains a hierarchical structure as described in RFC
2396. In URLs from such schemes, the use of double slashes indicates
that what follows is the top hierarchical element for a naming
authority. (See section 3 of RFC 2396 for more details.) URL
schemes which do not contain a conformant hierarchical structure in
their <scheme-specific-part> should not use double slashes following
the "<scheme>:" string.

On Thu, Aug 20, 2009 at 8:48 AM, Eran Hammer-Lahav <eran@hueniverse.com> wrote:
I am in the process of proposing a new URI scheme to identify user accounts [1]. This is part of the WebFinger protocol [2] effort.

This email is *not* an invitation to debate the merits of this new URI scheme (just yet). I am sure we will have many lively discussions about it shortly but I would like to present a proposal before we have a public debate about it here.

The new scheme has two components, a local identifier (username, screenname, handle, etc.) and a host (which can resolve and authenticate the local identifier). When looking at the URI specification (RFC 3986) and at the new URI guidelines (BCP 35), it is hard to figure out what is an appropriate use of // in new schemes.

In this case, we have a requirement to keep the URI (the part after the scheme:) looking as close as possible to an RFC-822 identifier (username@host), which leaves two options:

acct:username@host
acct://username@host

The 'username@host' part seems to fit perfectly into the URI authority as defined by RFC 3986. However, since the URI does not have a path, it does not really contain a hierarchical structure (just the top level host).

The benefit of using // in this case is that existing URI parsing code can be used unmodified to process the new URI. It is a simple profile which only allows the userinfo and host subcomponents of the authority component, and no other URI components. Since the new scheme will often be used with URI templates and other facilities commonly used with http: URIs, it is very convenient to have a common structure (even if it is only a subset). I don't see any downside to using // other than defying expectations established by the mailto: URI scheme.
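To make the "existing parsing code" point concrete, here is a small sketch using Python's standard urllib.parse as one example of a generic URI parser. The choice of library is an assumption for illustration, not something the proposal names.

```python
from urllib.parse import urlsplit

# With "//", a generic parser treats username@host as the authority
# and splits it into userinfo and host with no scheme-specific code.
with_slashes = urlsplit("acct://username@host")

print(with_slashes.scheme)    # acct
print(with_slashes.username)  # username
print(with_slashes.hostname)  # host
print(repr(with_slashes.path))  # '' (no path component)
```

The empty path reflects the profile described above: only the userinfo and host subcomponents of the authority are used.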

The benefit of not using // is that it makes the URI follow the well-established pattern in mailto: and saves two bytes. The downside is that it requires spelling out how to break the URI path into subcomponents specific to this scheme.
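For comparison, a generic parser sees the form without // as an opaque path, so the split has to be defined by the scheme. The sketch below again uses Python's urllib.parse. The helper parse_acct and its rule of splitting on the last "@" are hypothetical, not part of any specification.

```python
from urllib.parse import urlsplit

# Without "//", there is no authority; everything after "acct:" is the path.
no_slashes = urlsplit("acct:username@host")
print(no_slashes.hostname)  # None
print(no_slashes.path)      # username@host


def parse_acct(uri: str) -> tuple[str, str]:
    """Hypothetical scheme-specific rule: split the path on the last '@'."""
    parts = urlsplit(uri)
    if parts.scheme != "acct":
        raise ValueError("not an acct: URI")
    user, sep, host = parts.path.rpartition("@")
    if not sep or not user or not host:
        raise ValueError("expected acct:username@host")
    return user, host


print(parse_acct("acct:username@host"))  # ('username', 'host')
```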

So far the feedback I have received has focused on style, which is perfectly valid, but I want to make sure I am not missing anything. My preference is to reuse as much as possible and therefore include the //.

Any suggestions?

EHL

[1] http://www.hueniverse.com/hueniverse/2009/08/making-the-case-for-a-new-acct-uri-scheme-for-accounts.html
[2] http://code.google.com/p/webfinger
Received on Sunday, 11 October 2009 19:10:11 GMT
