Re: An Internet-Draft on literal scoped addresses with accompanying zone IDs in URIs from Roy T. Fielding on 2004-11-20 (uri@w3.org from November 2004)

From: Roy T. Fielding <fielding@gbiv.com>
Date: Fri, 19 Nov 2004 19:03:51 -0800
To: Bill Fenner <fenner@research.att.com>
Cc: ipv6@ietf.org, uri@w3.org, bob.hinden@nokia.com
Message-Id: <CE3DB8D8-3AA0-11D9-BB50-000393753936@gbiv.com>

On Nov 19, 2004, at 12:57 PM, Bill Fenner wrote:
>> I think losing the ability to cut and paste these addresses is a
>> problem.  The % is in widespread usage today.
>
> Indeed, that's why this whole thing is a sticky issue and there's
> no obvious answer.  My FreeBSD and MacOS machines all use the % too,
> and have for years.

I don't think that they are actively used for significant operations.
Yes, they are implemented (inconsistently) on multiple platforms
(some allow names to occur after the '%', while others assume that
the zone ID will be a small integer), but I do not think that adding
a new delimiter would have an adverse effect on working software.
In particular, it would give all of the implementations a single
standard definition to implement towards.

>> My dumb question (that exposes my lack of knowledge about URIs, etc.)
>> is: since the literal IPv6 addresses are enclosed in "[" "]" to allow
>> for the ":" in the literal IPv6 address, why can't the "%" be used in
>> the same way?  For example:
>>
>>   http://[fe80::20d:60ff:fe2f:8df5%4]
>>
>> Please excuse my ignorance on this, but it would be good to explain
>> this (and include this information in the draft).
>
> You're right, we probably distilled the discussion a little too
> much.  We should add a third entry to the list and list its pros
> and cons for a bare %.
>
> The basic issue is how special % is in URLs, because of
> percent-encoding.  Section 2.4 of draft-fielding-uri-rfc2396bis
> (the full Standard URI spec, currently in the RFC-Editor's queue)
> says:
>
>    Because the percent ("%") character serves as the indicator for
>    percent-encoded octets, it must be percent-encoded as "%25" in order
>    for that octet to be used as data within a URI.
>
> The newer IRI spec (in IESG Evaluation; draft-duerst-iri-10.txt)
> specifies an encoding of URIs to IRIs that assumes that any percent
> anywhere in the URI begins a percent-encoded octet.  Allowing a
> bare "%" would complicate these rules quite a bit.  There would be
> no way to know without parsing the URI further whether the % began
> a %-encoded octet or not.  (An accidental example of how ambiguous
> this can be is one of the link-local addresses of my home system:
> fe80::240:5ff:fe42:d6de%de1 - %de could be a legal percent-encoded
> octet, or the % could introduce the zone ID "de1".)

More importantly, use of percent in the host portion of a URI
is known to be non-interoperable.  Most compliant implementations
do not expect percent-encodings in the IP literals and thus will
attempt to pass the '%de1' on to the system library for conversion
to an address.  Other implementations will see it as an error.
Some implementations will look for and decode anything that looks
like a percent-encoded octet as soon as the component is extracted
from the URI, regardless of the specification.  Thus, the only
interoperable URIs are those in which percent is not used
in the host component, which is why the URI syntax does not allow
percent-encodings in host except as required for IRIs.
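
To make the divergence concrete, here is a minimal Python sketch of two of
the readings described above, applied to the link-local example quoted
earlier.  The helper names (split_zone, eager_decode, escape_percent) are
hypothetical and exist only for illustration; they are not taken from any
deployed implementation, and real parsers differ in the details.

```python
from urllib.parse import unquote_to_bytes

# Host text from the quoted example: a link-local address with zone ID "de1".
HOST = "fe80::240:5ff:fe42:d6de%de1"


def split_zone(host):
    """Reading 1: treat the first '%' as the zone ID separator."""
    address, _, zone = host.partition("%")
    return address, zone


def eager_decode(host):
    """Reading 2: decode anything that looks like a percent-encoded octet."""
    return unquote_to_bytes(host)


def escape_percent(host):
    """Encode each '%' as '%25' so that it survives percent-decoding as data."""
    return host.replace("%", "%25")


print(split_zone(HOST))      # address and zone ID kept apart
print(eager_decode(HOST))    # "%de" collapses into the single octet 0xDE
print(escape_percent(HOST))  # the form that decodes back to the original
```

The first reading keeps "de1" as the zone ID, while the second silently
turns "%de" into one octet and leaves a trailing "1".  Only the "%25" form
comes back unchanged from a percent-decoder, which is also the form that
can no longer be cut and pasted directly.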

There is nothing we can do in the specifications to change the
fact that using a % as a zone ID separator in URIs will result in
interoperability failures.  Those applications are already deployed.
It would be easier to change all of the deployed operating systems
that contain IPv6 to allow an additional delimiter character if
cut-and-paste is really important.

Cheers,

Roy T. Fielding                            <http://roy.gbiv.com/>
Chief Scientist, Day Software              <http://www.day.com/>

Received on Saturday, 20 November 2004 03:04:36 UTC