Re: Guidelines on usage of // in new URI schemes

Hello Larry,

On 2009/10/12 4:09, Larry Masinter wrote:
> This was discussed on apps-discuss and the URI list a while back, so I have bcc'd those lists, but I want to focus the discussion on the public-iri@w3.org list, so please only reply there.
>
> In order to handle IDNs appropriately, I would like to make the rule that any scheme that allows non-ASCII or pct-encoded values in the "host" field in the generic syntax MUST allow or mandate that IRI -> URI processing follow IDNA rules. That is, no matter what the scheme, if you have
>
> scheme://nonascii.name/path/here     as an IRI, and want to translate it to a URI, you MUST use IDNA to turn it into
>
> scheme://alabel.for.nonascii.name/ascii.for.path/ascii.for.here
>
> no matter what the scheme. This is what you have to do for almost all URI schemes now anyway in order to function properly.

It is indeed currently advisable, but it's not the best long-term
solution. And it's not necessary for new schemes. Any new scheme has to
do some work to get resolved correctly anyway, so it may as well make
sure that its implementations support %-escaping in the domain name
part.
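
To make the difference concrete, here is a minimal Python sketch of the
two ways a non-ASCII host could be carried in a URI. The host name and
the helper names host_to_alabel / host_to_pct are made up for
illustration, and Python's built-in "idna" codec implements IDNA 2003
(RFC 3490), so please read this as a sketch under those assumptions
rather than as a reference implementation.

```python
from urllib.parse import quote


def host_to_alabel(host: str) -> str:
    """IDNA route: every non-ASCII label becomes an "xn--" A-label.

    Hypothetical helper; uses Python's built-in IDNA 2003 codec.
    """
    return host.encode("idna").decode("ascii")


def host_to_pct(host: str) -> str:
    """%-escaping route: percent-encode the UTF-8 octets of the host.

    Hypothetical helper; the result fits the reg-name syntax of RFC 3986.
    """
    return quote(host, safe="")


host = "b\u00fccher.example"   # "bücher.example", an invented host name
print(host_to_alabel(host))    # xn--bcher-kva.example
print(host_to_pct(host))       # b%C3%BCcher.example
```

A scheme that goes the second way only has to make sure that its
resolvers undo the %-escaping before they look at the domain name.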

> This would change the guidelines on use of "//" for new schemes, but are there any URI schemes in use for which this would actually be a problem in practice?

One example is email (EAI, mailto: or a separate scheme), where UTF-8
is used at the envelope and header level. It would be weird if we first
had to convert to punycode, then go back to UTF-8, and then convert
back to punycode again for the MX lookup.
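
The chain of conversions I mean can be sketched in a few lines of
Python. The domain and the helper names to_alabel / to_ulabel are
invented for illustration, and the built-in "idna" codec (IDNA 2003)
stands in for whatever an actual implementation would use.

```python
def to_alabel(domain: str) -> str:
    # UTF-8 / Unicode form -> punycode ("xn--") form
    return domain.encode("idna").decode("ascii")


def to_ulabel(domain: str) -> str:
    # punycode ("xn--") form -> Unicode form
    return domain.encode("ascii").decode("idna")


domain = "b\u00fccher.example"      # invented mail domain, "bücher.example"

in_uri = to_alabel(domain)          # 1. IRI -> URI: forced to punycode
in_header = to_ulabel(in_uri)       # 2. envelope/header: back to UTF-8
for_mx = to_alabel(in_header)       # 3. MX lookup: punycode once more

# Steps 1 and 2 cancel out; only step 3 is needed for the lookup.
assert in_header == domain
assert for_mx == in_uri
```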

Regards,   Martin.

> Larry
> --
> http://larry.masinter.net
>
> From: apps-discuss-bounces@ietf.org [mailto:apps-discuss-bounces@ietf.org] On Behalf Of Timur Shemsedinov
> Sent: Thursday, August 20, 2009 7:44 AM
> To: Eran Hammer-Lahav
> Cc: URI; apps-discuss@ietf.org
> Subject: [Moderator Action] Re: Guidelines on usage of // in new URI schemes
>
> Hello
>
> See RFC 2718 - Guidelines for new URL Schemes
> http://www.ietf.org/rfc/rfc2718.txt
>
> 2.1.2 Improper use of "//" following "<scheme>:"
>
> Contrary to some examples set in past years, the use of double
> slashes as the first component of the <scheme-specific-part> of a URL
> is not simply an artistic indicator that what follows is a URL:
> Double slashes are used ONLY when the syntax of the URL's <scheme-
> specific-part> contains a hierarchical structure as described in RFC
> 2396. In URLs from such schemes, the use of double slashes indicates
> that what follows is the top hierarchical element for a naming
> authority. (See section 3 of RFC 2396 for more details.) URL
> schemes which do not contain a conformant hierarchical structure in
> their <scheme-specific-part> should not use double slashes following
> the "<scheme>:" string.
>
> On Thu, Aug 20, 2009 at 8:48 AM, Eran Hammer-Lahav <eran@hueniverse.com> wrote:
> I am in the process of proposing a new URI scheme to identify user accounts [1]. This is part of the WebFinger protocol [2] effort.
>
> This email is *not* an invitation to debate the merits of this new URI scheme (just yet). I am sure we will have many lively discussions about it shortly but I would like to present a proposal before we have a public debate about it here.
>
> The new scheme has two components, a local identifier (username, screenname, handle, etc.) and a host (which can resolve and authenticate the local identifier). When looking at the URI specification (RFC 3986) and at the new URI guidelines (BCP 35), it is hard to figure out what is an appropriate use of // in new schemes.
>
> In this case, we have a requirement to keep the URI (the part after the scheme:) looking as close to an RFC-822 identifier (username@host) and that means two options:
>
> acct:username@host
> acct://username@host
>
> The 'username@host' part seems to fit perfectly into the URI authority as defined by RFC 3986. However, since the URI does not have a path, it does not really contain a hierarchical structure (just the top level host).
>
> The benefit of using // in this case is that existing URI parsing code can be used unmodified to process the new URI. It is a simple profile which only allows the userinfo and host subcomponents of the authority component, and no other URI components. Since the new scheme will be often used with URI templates and other facilities often used with http: URIs, it is very convenient to have a common structure (even if it is only a subset). I don't see any down side to using // other than defying expectations established by the mailto: URI scheme.
>
> The benefit of not using // is that it makes the URI follow the well-established pattern in mailto: and saves two bytes. The downside is that it requires spelling out how to break the URI path into subcomponents specific to this scheme.
>
> So far the feedback I have received has focused on style, which is perfectly valid, but I want to make sure I am not missing anything. My preference is to reuse as much as possible and therefore include the //.
>
> Any suggestions?
>
> EHL
>
> [1] http://www.hueniverse.com/hueniverse/2009/08/making-the-case-for-a-new-acct-uri-scheme-for-accounts.html
> [2] http://code.google.com/p/webfinger
> _______________________________________________
> Apps-Discuss mailing list
> Apps-Discuss@ietf.org
> https://www.ietf.org/mailman/listinfo/apps-discuss
>
>

-- 
#-# Martin J. Dürst, Professor, Aoyama Gakuin University
#-# http://www.sw.it.aoyama.ac.jp   mailto:duerst@it.aoyama.ac.jp

Received on Monday, 12 October 2009 05:49:59 UTC