We need some function f

> Do we have to decide this issue?  I was hoping not.

It appears that the issues are undecidable, but I agree that
it is possible to move on and leave some things on which people agree
to disagree.

> 
> I was hoping to reduce the problem to one of defining a bunch
> of string functions and so ignore all the angst-producing issues
> about meaning and resolution and application use and such.
> 
> We need some function f that takes an ns-attrib string and
> a base URI (in the RFC 2396 sense) string and returns a
> namespace name:  f(ns-attrib,baseURI) -> nsn.  Possible
> function definitions include:
> 
>   literal:     f(ns-attrib,baseURI) -> ns-attrib
>   forbid:      f(ns-attrib,baseURI) -> ns-attrib
>   absolutize:  f is defined by the algorithm in 5.2 of RFC 2396
>                 and the baseURI is determined by 5.1 of RFC 2396
>   fixed-base:  f is defined by the algorithm in 5.2 of RFC 2396
>                 but the baseURI argument is ignored and a 
>                 constant base URI is used instead
>  
> Note that the "deprecate" option isn't really an option in
> this sense.
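
To make the four candidates concrete, here is a rough Python sketch. It rests on several assumptions: the names are just the option labels from the list above, FIXED_BASE is a made-up constant (no actual fixed base URI has been proposed here), and forbid is read as "literal, but a relative reference is an error", since the quoted definition is written the same as literal. urljoin follows RFC 3986, the successor to RFC 2396, so take it as an approximation of the 5.2 algorithm rather than the exact thing.

```python
from urllib.parse import urljoin, urlsplit

# Hypothetical constant; no actual fixed base URI is named in this thread.
FIXED_BASE = "http://example.org/fixed-base/"


def literal(ns_attrib: str, base_uri: str) -> str:
    """The namespace name is the attribute value, untouched."""
    return ns_attrib


def forbid(ns_attrib: str, base_uri: str) -> str:
    """Assumed reading: as literal, but a relative reference is an error."""
    if not urlsplit(ns_attrib).scheme:
        raise ValueError(f"relative namespace URI reference: {ns_attrib!r}")
    return ns_attrib


def absolutize(ns_attrib: str, base_uri: str) -> str:
    """Resolve the attribute value against the document's base URI."""
    return urljoin(base_uri, ns_attrib)


def fixed_base(ns_attrib: str, base_uri: str) -> str:
    """Resolve against a constant base; the base_uri argument is ignored."""
    return urljoin(FIXED_BASE, ns_attrib)
```

With fixed_base, the same relative ns-attrib yields the same namespace name wherever the document happens to live, which is the property absolutize lacks.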


I suggest fixed-base.

Actually I prefer literal, but I think fixed-base does address some
valid concerns with literal, and it has at least some hope of
consensus.

literal would be OK. absolutize isn't a word, so I couldn't vote for
that. forbid should be ruled out: after this I wouldn't trust
the W3C to keep to its recommendation, and forbid would just hang around
until it was next thought a good time to re-open the debate and
re-allow relative URIs. Whatever happens, the namespace rec should be
made secure so that it can't be incompatibly changed again. Otherwise
how is anyone to build anything with any confidence?

> Having decided the above function that gives us nsn's, then
> for the purposes of the "unique attribute" issue, we need a
> function that takes two nsn's and two local names and returns
> a boolean "unique/non-unique" answer. 

string "character-for-character" equality as in the current rec.

I say character for character, rather than byte for byte, as one
issue that probably should be clarified in the ns rec is what to do
about non-ASCII characters. Given that XML allows Unicode element
names, it is not really consistent to force ASCII namespace names.
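
As a sketch of what that comparison amounts to, assuming the namespace names have already come out of whichever f is chosen (the function name attributes_clash is mine, not from any spec):

```python
def attributes_clash(nsn1: str, local1: str, nsn2: str, local2: str) -> bool:
    """Return True when the two attributes are non-unique.

    Python compares str values code point by code point, so there is no
    case folding, no %-escape decoding and no Unicode normalization here.
    """
    return nsn1 == nsn2 and local1 == local2
```

Under this test "http://example.org/%7Euser" and "http://example.org/~user" are different namespace names, even though they would identify the same resource if used as URIs.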

All the other recs that involve URIs have some boilerplate text about
UTF-8 encoding and %-escaping the bytes.

What I'd _like_ to be the case is that even for fixed-base
the ns-attrib is allowed to have non-ASCII characters, and the
RFC 2396 section 5.2 algorithm is run leaving those characters in place,
leading to an absolute-uri+frag-id-except-using-unicode-characters.
I think this should be the namespace name and used for equality
testing.  The namespace spec can say that if you are using the
namespace name as a URI to reference a resource then you need
to .... (UTF-8 and %-encode...)
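
A sketch of that separation in Python, on the assumption that the escaping step is the usual "encode as UTF-8, then %-escape each byte" recipe. The function name and the exact set of characters left alone are my own choices for illustration, not wording from any rec:

```python
from urllib.parse import quote

# Characters passed through unchanged: the URI reserved set, plus "%" so that
# escapes already present are not escaped a second time (an assumption).
_KEEP = ":/?#[]@!$&'()*+,;=%"


def to_resource_uri(namespace_name: str) -> str:
    """Escape a namespace name only at the point it is used as a URI."""
    return quote(namespace_name, safe=_KEEP, encoding="utf-8")
```

The namespace name itself, non-ASCII characters and all, stays as the key for equality testing; to_resource_uri would only be called by an application that actually wants to fetch something.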

An alternative would be to put the requirement to UTF-8 and %-encode
the ns-attribute into your function f above. But that would seriously
complicate namespace parsing; I know that in my own TeX-based XML
typesetter it would be prohibitively expensive to do. (I'd just
document that as an incompatibility.) Working with TeX puts unnatural
constraints on me that I don't expect to be typical, so I don't
necessarily mean that to be taken into consideration, but I think
a revised NS spec should say _something_ about characters that are not
allowed in URIs.

David

Received on Wednesday, 21 June 2000 17:57:28 UTC