Re: [Minutes] 27 Jan 2003 TAG teleconf (httpRange-14, arch doc, IRIEverywhere-27, binaryXML-30, xmlProfiles-29)

On Monday, February 3, 2003, 8:54:32 PM, Martin wrote:


MD> At 20:20 03/01/27 -0500, Ian B. Jacobs wrote:

>>Minutes of the 27 Jan 2003 TAG teleconf available as
>>HTML [1] and as text below.

>>   2.3 IRIEverywhere-27

>>      [25] http://www.w3.org/2001/tag/ilist#IRIEverywhere-27

>>    [Zakim]
>>    DanCon, you wanted to suggest the value of having %7E specified to be
>>           equivalent to %7e is purely aesthetic, and not *nearly* worth
>>           the cost.

OK, so let's look at the knock-on effects here. Dan was not, I claim,
looking at these effects when he made his comment; taken in isolation,
his position seems very reasonable. After all, these escapes are used
very infrequently.

But leaving %7E and %7e unequal is very damaging. It would scupper IRIs.

Suppose there is some Unicode character FOO and it maps to %ab%cd%ef
in UTF-8 (it won't map to those precise values; there is no such
character, this is just an example).

It would be highly desirable for FOO used in an IRI and the hexified
version of FOO used in a URI to compare the same when comparing two
URIs. If this is not done, then IRI-URI is a one-way street.

For this to work in any sensible manner, it is clearly not enough
for FOO to compare the same as %ab%cd%ef. It also has to compare the
same as %AB%CD%EF and %Ab%cd%eF and every other mix of hex-digit case.
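
To make the size of the problem concrete, here is a minimal sketch in
Python. It uses the euro sign (U+20AC) as a stand-in for FOO, since FOO
itself is hypothetical, and the helper name case_variants is invented
for this illustration. It assumes its input consists only of hex
escapes.

```python
from itertools import product
from urllib.parse import quote


def case_variants(escapes):
    """Return every spelling of a run of %XX escapes that differs only
    in the case of the hex digits a..f.

    Assumes the input consists solely of percent-escapes.
    """
    options = [
        (ch.lower(), ch.upper()) if ch in "abcdefABCDEF" else (ch,)
        for ch in escapes
    ]
    return {"".join(combo) for combo in product(*options)}


foo = "\u20ac"              # stand-in for FOO
hexified = quote(foo)       # UTF-8 octets as escapes: '%E2%82%AC'
variants = case_variants(hexified)

# Under plain string comparison, exactly one spelling matches.
matches = sum(1 for v in variants if v == hexified)
print(len(variants), matches)               # 8 1
print(len(case_variants("%ab%cd%ef")))      # 64
```

The made-up %ab%cd%ef has six hex letters, hence 64 spellings, and a
plain string comparison treats every one of them as a different URI.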

There are two ways to do this: one is to forbid one of the cases of
hexified a..f, and the other is to define them as the same in a hex
escape.
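
The two options could be sketched roughly as follows in Python. This is
an illustration only: the function names are invented here, and the
choice of uppercase as the permitted case in the first option is an
assumption, since nothing above says which case would be forbidden.

```python
import re

# Matches one percent-escape: '%' followed by two hex digits.
ESCAPE = re.compile(r"%([0-9A-Fa-f]{2})")


def uses_only_uppercase_escapes(uri):
    """Option 1: forbid one case. Here lowercase a..f inside an escape
    is treated as invalid (the choice of uppercase is an assumption)."""
    return all(m.group(1) == m.group(1).upper()
               for m in ESCAPE.finditer(uri))


def escapes_equivalent(a, b):
    """Option 2: define both cases as the same inside a hex escape.
    Only the hex digits of escapes are folded; the rest of the string
    is still compared character by character."""
    def fold(s):
        return ESCAPE.sub(lambda m: m.group(0).upper(), s)
    return fold(a) == fold(b)


print(uses_only_uppercase_escapes("/%7Euser"))    # True
print(uses_only_uppercase_escapes("/%7euser"))    # False
print(escapes_equivalent("/%7euser", "/%7Euser")) # True
print(escapes_equivalent("/%7Euser", "/~user"))   # False
```

Neither option makes an escaped character equal to its unescaped form;
they only settle which spelling of the escape itself is canonical.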

The third way, the way where %ab is not equal to %AB, means that we
can just give up on making FOO compare equal to %ab%cd%ef, and thus
give up on any roundtripping from IRI to URI. IRI then becomes merely
a theoretical possibility: something that exists in a spec, while
actual XML files contain a bunch of illegible hexified nonsense.

Thus, to get to this desirable goal, for URIs %ab and %AB and %Ab
and %aB have to compare the same. This isn't "merely aesthetic"; it is
what IRI needs to build on.

MD> Currently, Namespaces in XML 1.1 (Candidate Rec) specifies that for
MD> purposes of namespace equivalence, '%7e', '%7E', and '~' are different
MD> (see http://www.w3.org/TR/xml-names11/#IRIComparison).

Yes. This should change.


-- 
 Chris                            mailto:chris@w3.org

Received on Monday, 3 February 2003 16:51:25 UTC