- From: David Carlisle <david@dcarlisle.demon.co.uk>
- Date: Wed, 21 Jun 2000 23:02:15 +0100 (BST)
- To: pgrosso@arbortext.com
- CC: xml-uri@w3.org
> Do we have to decide this issue?  I was hoping not.

It appears that the issues are undecidable, but I agree that it is
possible to move on and leave some things on which people agree to
disagree.

> I was hoping to reduce the problem to one of defining a bunch
> of string functions and so ignore all the angst-producing issues
> about meaning and resolution and application use and such.
>
> We need some function f that takes an ns-attrib string and
> a base URI (in the RFC 2396 sense) string and returns a
> namespace name:  f(ns-attrib,baseURI) -> nsn.  Possible
> function definitions include:
>
>  literal:    f(ns-attrib,baseURI) -> ns-attrib
>  forbid:     f(ns-attrib,baseURI) -> ns-attrib
>  absolutize: f is defined by the algorithm in 5.2 of RFC 2396
>              and the baseURI is determined by 5.1 of RFC 2396
>  fixed-base: f is defined by the algorithm in 5.2 of RFC 2396
>              but the baseURI argument is ignored and a
>              constant base URI is used instead
>
> Note that the "deprecate" option isn't really an option in
> this sense.

I suggest fixed-base. Actually I prefer literal, but I think fixed-base
does address some valid concerns with literal, and it has at least some
hope of consensus. literal would be OK. absolutize isn't a word, so I
couldn't vote for that. forbid should be ruled out: after this I
wouldn't trust the W3C to keep its recommendation, and forbid would just
hang around until it was next thought a good time to re-open the debate
and re-allow relative URIs.

Whatever happens, the namespace rec should be made secure so that it
can't again be changed incompatibly. Otherwise how is anyone to build
anything with any confidence?

> Having decided the above function that gives us nsn's, then
> for the purposes of the "unique attribute" issue, we need a
> function that takes two nsn's and two local names and returns
> a boolean "unique/non-unique" answer.

String "character-for-character" equality, as in the current rec.
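As a rough illustration of the candidate definitions of f and of the uniqueness test, here is a minimal Python sketch. It is not taken from the thread: the function names, the `FIXED_BASE` constant, and the choice to make `forbid` raise an error on a relative reference are all assumptions, and Python's `urljoin` merely stands in for the algorithm in 5.2 of RFC 2396 (it follows later URI RFCs and may differ in corner cases).

```python
from urllib.parse import urljoin, urlsplit

# Hypothetical constant base for "fixed-base"; the thread does not name one.
FIXED_BASE = "http://example.org/fixed-base/"


def literal(ns_attrib: str, base_uri: str) -> str:
    """The attribute value itself is the namespace name; base_uri is ignored."""
    return ns_attrib


def forbid(ns_attrib: str, base_uri: str) -> str:
    """Same mapping as literal. Rejecting relative references is an assumption."""
    if not urlsplit(ns_attrib).scheme:
        raise ValueError("relative URI reference used as namespace name")
    return ns_attrib


def absolutize(ns_attrib: str, base_uri: str) -> str:
    """Resolve the attribute value against the document's base URI."""
    return urljoin(base_uri, ns_attrib)


def fixed_base(ns_attrib: str, base_uri: str) -> str:
    """Resolve against a constant base; the base_uri argument is ignored."""
    return urljoin(FIXED_BASE, ns_attrib)


def is_unique(nsn1: str, local1: str, nsn2: str, local2: str) -> bool:
    """Two attributes clash only if namespace name and local name both match
    character for character (no case folding or other normalization)."""
    return (nsn1, local1) != (nsn2, local2)
```

With these definitions, `absolutize("foo", ...)` yields a different namespace name for each document location, whereas `literal` and `fixed_base` give the same name wherever the document lives, which is the property the reply above cares about.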
I say character for character, rather than byte for byte, as one issue
that probably should be clarified in the ns rec is what to do about
non-ASCII characters. Given that XML allows Unicode element names, it is
not really consistent to force ASCII namespace names. All the other recs
that involve URIs have some boilerplate text about UTF-8 encoding and
%-escaping the bytes.

What I'd _like_ to be the case is that even for fixed-base the ns-attrib
is allowed to have non-ASCII characters; the RFC 2396 5.2 algorithm is
run leaving those characters in place, leading to an
absolute-uri+frag-id-except-using-unicode-characters. I think this
should be the namespace name and used for equality testing. The
namespace spec can say that if you are using the namespace name as a URI
to reference a resource then you need to .... (utf-8 and % encode...)

An alternative would be to put the requirement to UTF-8 and %-encode the
ns-attribute into your function f above. But that would seriously
complicate namespace parsing; I know that in my own TeX-based XML
typesetter that would be prohibitively expensive to do. (I'd just
document that as an incompatibility.) Working with TeX puts unnatural
constraints on me that I don't expect to be typical, so I don't
necessarily mean that to be taken into consideration, but I think a
revised NS spec should say _something_ about characters that are not
allowed in URIs.

David
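The preferred treatment of non-ASCII characters described above (keep them in the namespace name, escape only when dereferencing) can be sketched as follows. This is an illustrative assumption, not text from any spec: the function name `to_dereferenceable_uri`, the example namespace, and the exact set of characters left unescaped are hypothetical choices.

```python
from urllib.parse import quote

# Characters left untouched: RFC 2396 reserved and mark characters, plus
# "#" and "%" so fragments and existing escapes survive. (Assumed set.)
_SAFE = ";/?:@&=+$,-_.!~*'()#%"


def to_dereferenceable_uri(namespace_name: str) -> str:
    """UTF-8 encode and %-escape non-ASCII characters, for use only when the
    namespace name is treated as a URI that references a resource."""
    return quote(namespace_name, safe=_SAFE, encoding="utf-8")


# Hypothetical namespace name containing a non-ASCII character.
name = "http://example.org/ns/caf\u00e9"
escaped = to_dereferenceable_uri(name)
```

Under this approach equality testing uses `name` as it stands; `escaped` is a different string character for character, so escaping inside f instead would change which names compare equal.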
Received on Wednesday, 21 June 2000 17:57:28 UTC