Re: CURIEs: A proposal from Harry Halpin on 2006-06-28 (www-tag@w3.org from June 2006)

From: Harry Halpin <hhalpin@ibiblio.org>
Date: Wed, 28 Jun 2006 03:01:30 +0100
To: mark.birbeck@x-port.net, www-tag@w3.org
Message-ID: <44A1E2FA.5000401@ibiblio.org>
Then, of my list of options, you're more behind:

2) CURIEs become general for XML, but say quite clearly how to
distinguish themselves from QNames. This could be done by using a
character other than ':', or by saying that *element and
attribute names that use ':'* are QNames, but CURIEs work for
*attribute values that use ':'*. Again, this might break something, not
sure.

Where else should CURIEs be at work in XML besides attribute values?
CURIEs are then a way of saying "What appears to be a QName in an attribute
value is really a URI, and you should concat the local name and
namespace prefix to get a URI" - that makes perfect sense and helps
solve the devilish QName in attribute values problem.
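
To make that reading concrete, here is a rough Python sketch of the
concat rule. It is only an illustration, not anything from the CURIE
note: the function name expand_curie, the in_scope dictionary (standing
in for the namespace declarations in scope on the element), and the
Dublin Core prefix binding are all my own assumptions.

```python
def expand_curie(value, prefixes):
    """Expand 'prefix:reference' into a URI by plain concatenation.

    `prefixes` is a hypothetical stand-in for the in-scope namespace
    declarations (prefix -> namespace URI).
    """
    prefix, sep, reference = value.partition(":")
    if not sep:
        raise ValueError(f"not a CURIE (no ':'): {value!r}")
    if prefix not in prefixes:
        raise KeyError(f"undeclared prefix: {prefix!r}")
    # The whole rule: look up the prefix, then append the rest.
    return prefixes[prefix] + reference


# Assumed example binding, as if xmlns:dc="..." were declared in the document.
in_scope = {"dc": "http://purl.org/dc/elements/1.1/"}
title_uri = expand_curie("dc:title", in_scope)
print(title_uri)
```

Note that the result is a full URI, which is exactly what a QName in an
element or attribute name never claims to be.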

    As for non-XML syntax (SPARQL, N3, etc.) - well, I would assume it
would then be up to the individual language spec to decide to implement
CURIEs or not and then update their standards to make it clear whether
they are using QNames or CURIEs. Although a W3C-wide consensus would
make life easier on everyone for non-XML syntax.

Then confusing things like:

"This note suggests that we overcome this by simply creating a new data
type whose purpose is specifically to allow for the abbreviation of URIs
in exactly this way. This type is called a "CURIE" or a "Compact URI",
and QNames are a subset of this." [1]

Can be deleted and replaced with a paragraph that explains exactly why
CURIEs fill in a hole in Web architecture that QNames were abused for,
and exactly where QNames are clearly needed (i.e. element and attribute
names, and people do know why they are needed there and it's pretty
clear) in XML.

As for the concat rules, I think in general trying to avoid
things like Steve's example of http://www.w3.org/1999/xhtmlindex is
going to be hard without going down Misha's proposed route. However, one
compromise could be that the namespace prefix
"http://www.w3.org/1999/xhtml" of a CURIE can be dereferenced, to get a
namespace document that then lists the element and attribute names. If
one was really ambitious one could then start using redirects for those
not-so-nice URIs in Steve's example :)
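
For anyone who has not followed the thread, the awkward URI falls
straight out of naive concatenation. A minimal Python sketch, where the
helper name concat and the "#"-terminated variant are purely my own
illustration (the real XHTML namespace does not end in a separator):

```python
XHTML_NS = "http://www.w3.org/1999/xhtml"


def concat(namespace_uri, local_name):
    """Naive rule: namespace URI immediately followed by the local name."""
    return namespace_uri + local_name


# With no trailing separator, the two parts run together.
print(concat(XHTML_NS, "index"))        # http://www.w3.org/1999/xhtmlindex

# Hypothetical variant: a namespace URI that happens to end in '#'.
print(concat(XHTML_NS + "#", "index"))  # http://www.w3.org/1999/xhtml#index
```

The rule itself is the same in both calls; whether the result looks
sensible depends entirely on how the namespace URI happens to end.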

    I think this sort of solution would keep CURIEs for their good uses,
clear up confusion between CURIEs and QNames without breaking XML specs
(hopefully), and not rely on prose descriptions of what will hopefully
be machine-implemented and rather simple syntax conversion rules.

          cheers,
                harry

[1]http://www.w3.org/2001/sw/BestPractices/HTML/2005-10-27-CURIE





Mark Birbeck wrote:
>
> Hi Harry,
>
> Useful comments, thanks.
>
>> This is exactly the point of my previous e-mail in this thread: It is
>> logically impossible for CURIEs to be in a superset of QNames, because
>> CURIEs are "abbreviations for URIs" and QNames are *not* abbreviations
>> for URIs in the general case. QNames merely use a URI to disambiguate
>> local names, and at this task they work fine. Even if one *assumes
>> concatenation* as a default rule for constructing URIs out of QNames
>> (which is not in the QName spec at all), then as abbreviations for all
>> possible URIs, they are poor - but that wasn't their intended purpose.
>> So, part of the problem is that CURIEs are advertising themselves as
>> replacements or fixes for QNames. I think this reveals a misreading of
>> QNames and this only detracts from the quite good case CURIEs can make
>> for themselves.
>
> The thing is that CURIEs don't (or perhaps that should read "shouldn't
> be claiming to" ;) set out to be a superset of QNames, which I will
> try to explain.
>
> As you say, QNames are very clearly defined in the XML namespaces
> specification, as essentially a way to create scoped names for
> elements and attributes. This is rather a nice trick, and if you are a
> language author it gives you a ready made token for creating scoped
> names in other places, such as XPath functions or XML Schema
> datatypes.
>
> Unfortunately, what has happened is that this rather useful snippet of
> BNF that defines 'a:b' has come to be used in all sorts of places, to
> define scoped names that sometimes don't even have any relationship to
> the original purpose of QNames. So XML Schema creates a datatype
> called "xsd:string", and says that this prefix/suffix combination
> should be interpreted using the definition of QNames. XPath functions
> does the same...as does SPARQL, and so on.
>
> As Henry pointed out in another discussion with me, the XML Schema
> authors were perfectly entitled to use the BNF for QNames to define
> their datatypes. But the tricky bit is that QNames only defines how to
> use the namespaces that are defined in prefixes, when the prefix
> occurs in an XML element or attribute. If the QName is used to define
> something else--even something inside an attribute, like
> xsi:type="xsd:string"--then you don't get the namespaces 'fix-up' for
> free, and you have to explain how it's done in your particular
> language. (For example, XPath uses namespaces from the evaluation
> context, XSLT uses namespaces in the document, etc.)
>
> So, out of the box, QNames gives you *only* a nice handy piece of BNF
> to describe the a:b syntax, and--if you are defining elements and
> attributes--it gives you a way to use a value from a namespace
> declaration instead of your prefix. To use it in XML content, like
> attributes, you have to define the namespace rules yourself. (And run
> the risk of losing the namespaces, as we discussed recently!)
>
> But there's more--QNames have not only been 'stretched' to cover any
> scoped names that want to use namespaces, but they've also been used
> to define URL creation rules. With this we get yet another layer on
> top of our original QName BNF.
>
> Which brings us to CURIEs; one of the reasons I wrote the CURIEs
> proposal was that it seemed to me that this situation could be
> clarified by making explicit the things that people had been (ab)using
> QNames for in the past. If people want to reuse the BNF to describe
> this convenient syntax, for use in other languages, why not take it
> out into a separate spec? Then we can define the behaviour when used
> anywhere, whether in attribute and element names, inside attributes,
> etc, and also define how URLs are created from these prefix/suffix
> pairs.
>
> But in the process, we should change the name so that it's not
> confusing, with the bonus that this would allow us to leave the term
> QName for what it currently is--a bit of BNF, with some rules for how
> namespaces are used in *element and attribute* names.
>
> I do favour keeping the same *syntax* though, so that current documents
> remain backwards compatible, should the language that defines them
> choose to use CURIEs.
>
> In summary, QNames have been stretched and pulled; there's nothing
> necessarily wrong with that, since it reflects a real need, but the
> idea of CURIEs is to try to 'tidy up' this overuse. If we do it right,
> we could end up with a term that we all agree on, rather than the very
> broad term 'QName', which we have at the moment, and as you rightly
> say Harry, no-one actually really knows what it means.
>
> Regards,
>
> Mark
>
>


-- 
		-harry

Harry Halpin,  University of Edinburgh 
http://www.ibiblio.org/hhalpin 6B522426
Received on Wednesday, 28 June 2006 02:01:51 UTC