
Re: Grinding to a halt on Issue 27.

From: Rick Jelliffe <ricko@topologi.com>
Date: Wed, 30 Apr 2003 23:54:56 +1000
Message-ID: <046b01c30f20$15742850$4bc8a8c0@AlletteSystems.com>
To: <www-tag@w3.org>

Elliotte Rusty Harold writes:

> No, not at all. It just needs to move forward in the right order. 
> First internationalized domain names, then IRIs, and then the various 
> specs such as  Namespaces in XML 1.1 that depend on IRIs. It's 
> ridiculous to publish a spec with a normative dependence on another 
> spec that's not yet complete. 

I think that sequence is spurious. A single, definite, implemented, working 
IRI-ish convention already permeates XML and related specs.

Namespace URIs are now the odd man out.  As for waiting for RFCs for 
IRIs, let the dead bury the dead: bringing URIs in XML namespaces into
line with anyURIs everywhere else in XML (i.e. delimiting is the task of the
implementation that uses the URI, not of every flipping application that
writes URLs) may help the IRI spec see the light of day, in which case
(if it fits) W3C specs can normatively reference it. 

Elliotte is not giving an argument against allowing non-ASCII characters in
namespaces; his is an argument against normative references to the IRI draft.
The original issue was whether to endorse IRIs everywhere[0][0a].  (If TAG had
just endorsed Charmod, we could have escaped this: Charmod gives a mapping and
says "this may be what IRIs give" as a notice.  Charmod's issues (and its
solutions) will not go away by pretending it doesn't exist.)

In [0a], Julian's posting [0b] is summarized as 'feedback that this "upgrade" is 
not without some cost'.  However, that is not what Julian said, AFAICS. His comment
is not anti-non-ASCII per se but is concerned with namespace comparisons. 
I think this is the gist of Roy Fielding's comments: he thinks that going all-ASCII
will solve things.

I think there is a missing category here: an anyURI[1] with no escapes of
non-ASCII characters.   Let's call them "Literal-IRIs".    I think it should be
considered best practice to use Literal-IRIs in W3C specs, with URIs also allowed,
while letting people know that anyURIs that are neither fish (Literal-IRI) nor
fowl (URI) are used at their peril, especially if there is some round-tripping or
signing aspect. 
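
To make the three categories concrete, here is a rough sketch in Python.
The function name classify and its return labels are my own illustration,
not terms from any spec, and it assumes that "escape of a non-ASCII
character" means a %XX escape whose byte value is 0x80 or above.

```python
import re

# A %XX escape whose byte is 0x80..0xFF, i.e. part of an escaped
# non-ASCII character under a UTF-8 (or any 8-bit) escaping convention.
_HIGH_ESCAPE = re.compile(r"%[89A-Fa-f][0-9A-Fa-f]")


def classify(any_uri: str) -> str:
    """Label an anyURI as 'literal-iri', 'uri', or 'mixed' (hypothetical labels)."""
    has_raw = any(ord(ch) > 0x7F for ch in any_uri)
    has_escaped = _HIGH_ESCAPE.search(any_uri) is not None
    if has_raw and has_escaped:
        # Neither fish nor fowl: raw and escaped non-ASCII side by side.
        return "mixed"
    if has_raw:
        return "literal-iri"
    # All ASCII. A string with no non-ASCII at all is trivially both a URI
    # and a Literal-IRI; this sketch simply reports it as a URI.
    return "uri"
```

A writer could run such a check before signing or round-tripping and
refuse, or at least warn on, the "mixed" case.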

As for comparisons, surely it doesn't matter whether an application converts to
Literal-IRIs or to URLs (using Larry's UTF-8 convention) for comparison;
all that matters is that both strings being compared have been canonicalized
to the same spec. Remember, this really shouldn't involve much new code,
because XML and related specs already have URIs all over the place.
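
A minimal sketch of that point, in Python. The names to_uri,
to_literal_iri and same_name are hypothetical helpers, not anything from
a spec. The sketch assumes the UTF-8 %HH convention and does not attempt
Unicode normalization or case-folding of existing escapes.

```python
import re

# One or more consecutive %XX escapes with byte values 0x80..0xFF.
_HIGH_RUN = re.compile(r"(?:%[89A-Fa-f][0-9A-Fa-f])+")


def to_uri(any_uri: str) -> str:
    """Escape each non-ASCII character as UTF-8 %HH; leave ASCII untouched."""
    out = []
    for ch in any_uri:
        if ord(ch) < 0x80:
            out.append(ch)
        else:
            out.extend(f"%{b:02X}" for b in ch.encode("utf-8"))
    return "".join(out)


def to_literal_iri(any_uri: str) -> str:
    """Unescape runs of high %HH escapes that decode as valid UTF-8."""
    def unescape(match):
        raw = bytes.fromhex(match.group(0).replace("%", ""))
        try:
            return raw.decode("utf-8")
        except UnicodeDecodeError:
            # Not UTF-8: leave the escapes exactly as written.
            return match.group(0)
    return _HIGH_RUN.sub(unescape, any_uri)


def same_name(a: str, b: str, canonicalize) -> bool:
    """Compare two names after applying the same canonicalization to both."""
    return canonicalize(a) == canonicalize(b)
```

With raw = "http://example.org/caf\u00e9" and
esc = "http://example.org/caf%C3%A9", a plain string comparison says the
two differ, while same_name(raw, esc, to_uri) and
same_name(raw, esc, to_literal_iri) both say they match. Which direction
is chosen is immaterial so long as both sides get the same treatment.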

Rick Jelliffe


Currently Namespace URLs are the odd man out at W3C.

Non-ASCII characters are allowed by anyURI[1], with conversion to URIs
by the method in Charmod[2]. The same approach is already adopted by
XPointer[3], HTML[4] (under the guise of error correction), XLink[5], and
XML 1.0[6]. (Larry Masinter's 1999 RFC 2718[7] s. 2.2.5 gives UTF-8 for the escapes.)

[0a] http://lists.w3.org/Archives/Public/www-tag/2002Oct/0186
[0b] http://lists.w3.org/Archives/Public/xml-names-editor/2002Sep/0014.html
[1] http://www.w3.org/TR/xmlschema-2/#anyURI
[2] http://www.w3.org/TR/2001/WD-charmod-20010126/#sec-URIs
[3] http://www.w3.org/TR/2001/WD-xptr-20010108/#uri-escaping
[4] http://www.w3.org/TR/html401/appendix/notes.html#h-B.2.1
[5] http://www.w3.org/TR/xlink/#link-locators
[6] http://www.w3.org/TR/REC-xml#sec-external-ent
[7] http://www.ietf.org/rfc/rfc2718.txt
Received on Wednesday, 30 April 2003 09:51:09 UTC
