Comments on draft-hansen-iri-4395bis-irireg-00.txt

Hi,

  http://tools.ietf.org/html/draft-hansen-iri-4395bis-irireg-00 notes
"Previously, those who wish to describe resource identifiers that are
useful as IRIs were encouraged to define the corresponding URI syntax,
and note that the IRI usage follows the rules and transformations
defined in [6]. This document changes that advice to encourage explicit
definition of the scheme and allowable syntax elements within the larger
character repertoire of IRIs, as defined by [7]."

I am concerned that this would further draw a distinction between
characters that occur literally in an identifier and characters that
are percent-encoded. In fact, I am not entirely sure how to read RFC
3987 on this: it starts out saying IRIs are just like URIs, except that
there are more unreserved characters, but then excludes private use
code points from the set of unreserved characters.

Let's say I make a scheme where the scheme-specific part can only be
"ö". Since "ö" is an unreserved character, I might be inclined to say

  def = "example:" %x00F6;

but that would not work, as "example:%c3%b6" is essentially defined as
equivalent to "example:ö". The definition would have to account for a
level of indirection at some point to remove percent-encoding, so I
think you cannot quite distinguish between defining a URI scheme and
an IRI scheme. So far the only difference could be in percent-encoded
private use characters. I'd rather remove that difference, and I am not
sure what the actual change there would be.
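
To illustrate the indirection, here is a minimal sketch in Python. The
"example:" scheme is the made-up one from above, and both function names
are hypothetical; urllib.parse.unquote stands in for whatever step a real
definition would use to remove percent-encoding. It ignores details such
as case-insensitive scheme names.

```python
from urllib.parse import unquote

PREFIX = "example:"   # the hypothetical scheme from the text
ALLOWED = "\u00f6"    # "ö", the only permitted scheme-specific part


def matches_literal(identifier: str) -> bool:
    """Naive check: compare the identifier character by character."""
    return identifier == PREFIX + ALLOWED


def matches_after_decoding(identifier: str) -> bool:
    """Check the scheme-specific part after removing percent-encoding.

    unquote() decodes %XX sequences as UTF-8 by default.
    """
    if not identifier.startswith(PREFIX):
        return False
    return unquote(identifier[len(PREFIX):]) == ALLOWED


for candidate in ("example:\u00f6", "example:%c3%b6"):
    print(candidate, matches_literal(candidate), matches_after_decoding(candidate))
```

The literal comparison accepts only the first spelling, while the decoding
comparison accepts both, which is the sense in which the two forms end up
equivalent.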

As an unrelated point, a common confusion is that people think the
fragment identifier is scheme-specific. It is common for proposed
registrations to define the fragment as part of the scheme, and it is
unfortunately common that implementations in fact treat fragment
identifiers as data, as in "javascript:open('#example')" or
"data:,#example". However, fragment identifiers are part of the generic
framework; the scheme-specific part ends where the fragment begins.
I think 4395bis should discuss this problem in some detail.
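
As a rough illustration of the generic reading, here is a small Python
sketch. It assumes urllib.parse.urlsplit as a stand-in for a generic
parser (it is not a strict RFC 3986 implementation, but by default it
splits off the fragment regardless of scheme); the helper name is
hypothetical.

```python
from urllib.parse import urlsplit


def split_generic(identifier: str) -> tuple[str, str, str]:
    """Return (scheme, scheme-specific part, fragment) under generic rules.

    The scheme-specific part is taken to be everything between the
    scheme's colon and the first '#'.
    """
    parts = urlsplit(identifier)
    specific = identifier[len(parts.scheme) + 1:].split("#", 1)[0]
    return parts.scheme, specific, parts.fragment


for candidate in ("data:,#example", "javascript:open('#example')"):
    print(split_generic(candidate))
# ('data', ',', 'example')
# ('javascript', "open('", "example')")
```

Under the generic rules, the "javascript:" example is cut in the middle
of the string literal, which is presumably not what an implementation
treating the whole thing as script source intends.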

(Finally, please do use named references and not "[7]".)

regards,
-- 
Björn Höhrmann · mailto:bjoern@hoehrmann.de · http://bjoern.hoehrmann.de
Am Badedeich 7 · Telefon: +49(0)160/4415681 · http://www.bjoernsworld.de
25899 Dagebüll · PGP Pub. KeyID: 0xA4357E78 · http://www.websitedev.de/ 

Received on Tuesday, 5 October 2010 05:44:25 UTC