Re: Various issues with using CURIEs in OWL

(Sean is my AC rep.)
On 10 Apr 2009, at 00:15, Shane McCarron wrote:

> My (personal) comments inline:
> Bijan Parsia wrote:
>> The OWL Working Group had intended to delegate our URI abbreviation  
>> mechanisms both for in-spec and in-concrete-syntax use. OWL has a  
>> number of different concrete serializations (including 2 XML based  
>> and 2 non-XML based), all of which use (or I would like to use)  
>> CURIEs.
>> Unfortunately, while trying to use the CURIE spec, I (and others)  
>> have found that the current CURIE spec does not meet the WG needs  
>> even putting aside concerns about the ultimate disposition of the  
>> document:
>> 1) For non-XML host language: The CURIE spec provides no mechanism  
>> (although it provides permission) for excluding characters from the  
>> syntax of the local part of CURIEs. This means that in host  
>> languages which use symbols like ")" or "[" as part of their  
>> syntax, we run into parsing ambiguities. Note that safe CURIES do  
>> not solve this problem as the safe CURIE delimiters are common host  
>> language delimiters.
>> PROPOSED FIX: Ideally, there would be a "minimalistic" CURIE  
>> profile, ideally something like SPARQL's abbreviation mechanism.  
>> Even QNames would be fine (though we'd recommend the spec point out  
>> that to cover all URIs there should be a non-abbreviated form).
> The lexical form of a CURIE is an optional prefix, separator, and a  
> reference.  Are you saying that the characters permitted in prefix  
> (NCName) or reference (irelative-ref as defined in the IRI spec) are  
> too rich a set of characters?

Reference, yes.
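
To illustrate the ambiguity, here is a minimal Python sketch. The two character classes are simplified assumptions of mine, not the CURIE or IRI grammars: the first admits the RFC 3987 sub-delims such as "(" , ")" and ",", the second is a restricted, roughly QName-like local part. The sample expression is a made-up functional-style fragment.

```python
import re

# Assumption: simplified stand-in for a permissive CURIE reference, admitting
# letters, digits, "/" and the sub-delims ! $ & ' ( ) * + , ; =
PERMISSIVE_REF = r"[A-Za-z0-9_.\-!$&'()*+,;=/]*"

# Assumption: a restricted local part, roughly in the spirit of a QName.
RESTRICTED_REF = r"[A-Za-z_][A-Za-z0-9_.\-]*"


def curies(text, ref_pattern):
    """Return every prefix:reference token found in text."""
    pattern = re.compile(r"[A-Za-z_][A-Za-z0-9_\-]*:" + ref_pattern)
    return pattern.findall(text)


# Hypothetical host-language expression using ")" as a delimiter.
expr = "SubClassOf(a:Child a:Person)"

print(curies(expr, PERMISSIVE_REF))  # ['a:Child', 'a:Person)']
print(curies(expr, RESTRICTED_REF))  # ['a:Child', 'a:Person']
```

With the permissive reference, the closing parenthesis of the host syntax is swallowed into the second CURIE; with the restricted one it is not.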

>  And that in your use you needed to make this collection of  
> characters less rich?


>  If so, I agree that this is permitted by the specification.

But this gives me no reason to use the spec, esp. with a normative  

Without a specific subsetting mechanism (e.g., for the datatype, one  
could define by restriction) I think adopting a different set of  
CURIEs just means not adopting the CURIE spec.

Contrast our use of the IRI and SPARQL specs:

fullIRI := an IRI as defined in [RFC3987], enclosed in a pair of < (U+3C)
and > (U+3E) characters
prefixName := a finite sequence of characters matching the PNAME_NS
production of [SPARQL]
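
As a rough sketch of how a tokenizer might tell these two forms apart (in Python; the regular expressions are ASCII-only approximations that I am assuming for illustration, and the function name `classify` is hypothetical):

```python
import re

# Assumption: ASCII-only approximation of SPARQL's PNAME_NS, i.e. an optional
# prefix followed by ":". The real production admits many non-ASCII characters.
PNAME_NS = re.compile(r"(?:[A-Za-z](?:[A-Za-z0-9_.\-]*[A-Za-z0-9_\-])?)?:")

# Assumption: only the < and > delimiters are checked here; the content is
# not validated against the RFC 3987 grammar.
FULL_IRI = re.compile(r"<[^<>\s]*>")


def classify(token):
    """Label a token as 'fullIRI', 'prefixName', or 'neither'."""
    if FULL_IRI.fullmatch(token):
        return "fullIRI"
    if PNAME_NS.fullmatch(token):
        return "prefixName"
    return "neither"


for token in ["<http://www.w3.org/2002/07/owl#>", "owl:", ":", "owl.:"]:
    print(token, "->", classify(token))
```

The point is only that both productions are defined by reference to an existing grammar, so a checker like this has something precise to approximate.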

I think there are three reasonable categories of CURIE:

	Exactly QName
	What SPARQL currently does
	Full irelative-ref for reference

There are a couple of others I could imagine (i.e., with %encoding for  
strict ASCII). But without at least these I don't think the CURIE spec  
is something SPARQL or OWL should use.
>> Note that *permission* to make a subset isn't all that helpful. I  
>> mean, then we're
>> just doing our own thing, yeah?
> Not really - it means you are defining a subset or profile of a  
> common mechanism,

We disagree strongly. Without a defined subsetting mechanism, it's  
just not helpful. It *might* have been helpful with defined processing  
models...but we don't have that.

Thus, you've not convinced me. At the moment I am better off ignoring  
the CURIE spec.

> and that a CURIE expressed in that subset would be semantically  
> still a CURIE.  One reason for using a common datatype is that it  
> helps with comprehension.

? Comprehension support is not a goal. Specification factoring or  
implementation interop are.

I find it very hard to believe that having to read another spec  
improves comprehension.

>> EDITORIAL NOTE: Many of us found the organization of the spec, and  
>> especially of the normative parts, very confusing. See:
>>   < 
>> >
>> I suggest that "Usage" and "Examples" be consolidated, and that  
>> there are two normative sections, "Syntax" and "Incorporating  
>> CURIEs into Host Languages" which contain the respective  
>> constraints. The second section could usefully be broken down into  
>> "XML host languages" and "Non-XML host languages".
> Thanks for this.  We are already done with CR more or less, but I  
> will see what I can do.

I don't see how you can get out of CR to PR, looking at your  
implementation report. At this stage, I'm now asking Sean, my AC rep,  
to oppose such a transition.

Speaking as a spec implementor who sincerely tried to use the CURIE  
spec, I think there are problems that merit serious changes to the  
design of the language. This means another LC, if I'm not mistaken.

>> 2) For XML host languages: The requirement to support the XML  
>> namespace based prefix declaration mechanism, even when an  
>> alternative mechanism is supplied, is simply a non-starter. Many in  
>> the XML world are hostile to the namespace based overloading (even  
>> for proper QNames! see RELAX NG and Schematron). But being forced  
>> to support *two* mechanisms, especially when one of them isn't  
>> desired, is unnecessarily restrictive and leads to the second  
>> mechanism not being used:
>>   <>
> The XHTML 2 Working Group has already agreed to remove this  
> restriction.

Great. That seems to trigger another LC.

>  In fact, what we agreed was that it was the host language's  
> responsibility to define its prefix mapping mechanism(s).

Well...if that means that we all reinvent ours, then I don't think  
it's a good idea. For me, this means that the CURIE spec is not a  
sensible rec track document, but would be better as a note.

>> 3) For XML host languages: There's no reason not to have a standard  
>> prefix declaration mechanism in the XML namespace. What value is  
>> there in letting XML host languages coin a bunch of different ones?
>> For example, <xml:Prefix name="" IRI=""/> is (basically) the syntax  
>> we're adopting, except with Prefix in the OWL namespace.
> Perhaps.  The XHTML 2 Working Group does not have authority to mess  
> in the xml space.

Ok, use your own namespace. The xml namespace would be better.

>  I am sure the group will discuss your suggestion.


>> 4) Processing: In some languages, multiple declarations of a prefix  
>> have an overriding behavior. In OWL we chose to make that a syntax  
>> error. The CURIE spec should make clear the processing model.
> We believe the processing model is completely host-language specific.

I don't think that's helpful. There are at least 2 sensible, fairly  
common, processing models:
	Error on redefinition
	Lexically nearest wins

Both are in common use. Define them. Provide a way to reference them.
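
A minimal Python sketch of the two models, to show how little it takes to pin them down. The function names and the example IRIs are hypothetical, and I am assuming that declarations arrive as (prefix, IRI) pairs and that nested scopes are listed from outermost to innermost.

```python
from collections import ChainMap


class PrefixRedefinitionError(ValueError):
    """Raised when a prefix is declared more than once."""


def error_on_redefinition(declarations):
    """Model 1: a second declaration of the same prefix is an error."""
    mapping = {}
    for prefix, iri in declarations:
        if prefix in mapping:
            raise PrefixRedefinitionError(f"prefix {prefix!r} declared twice")
        mapping[prefix] = iri
    return mapping


def nearest_wins(scopes):
    """Model 2: scopes run outermost to innermost; the innermost wins."""
    # ChainMap searches its first mapping first, so put the innermost first.
    return ChainMap(*reversed(scopes))


def expand(curie, prefixes):
    """Expand prefix:reference by concatenating the mapped IRI and reference."""
    prefix, _, reference = curie.partition(":")
    return prefixes[prefix] + reference


outer = {"ex": "http://example.org/a#"}
inner = {"ex": "http://example.org/b#"}
print(expand("ex:Thing", nearest_wins([outer, inner])))
# http://example.org/b#Thing
```

Under the first model the same pair of declarations, given in sequence, would raise PrefixRedefinitionError rather than silently override.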

>  The concept of a CURIE, that is an abbreviation that maps to an  
> IRI, is general.  The expression of that concept in a host language  
> is necessarily going to be related to that host language.  For  
> example, were you to use CURIEs in HTML you would not want to use  
> some "xml" mechanism to map a prefix.

Sure, but, uhm, HTML is not an XML host language. And I'm confused as  
to why we're talking syntax rather than processing.

>> To sum, I, personally, don't think the CURIE spec helps either with  
>> implementation interop or with spec factoring, though I think it  
>> could be made to. Thus, in its current form, there's no point in  
>> citing it and, thus, no real point in it being a recommendation.  
>> The minimal necessary changes from my pov are:
>>    A) A proper XML mechanism with no requirement to support xmlns
>>    B) Sensible profiles (I suggest, QName/RDF, SPARQL, and ALL)
>>    C) A processing model
>> C could maybe be dropped. A is totally required. I just won't  
>> adhere, or recommend anyone adhere, to the requirement to use  
>> xmlns. It's a nonstarter. Thus I won't use or recommend people use  
>> the CURIE spec (in its current form) for XML based host languages.
> I think we have already addressed this requirement.  Thanks for  
> reinforcing it though.

Great! I look forward to the next LC.

>> I won't use or recommend citing the CURIE spec without B for non- 
>> XML host languages. If you are happy with this being "using curies"  
>> then ok :)
>> Hope this helps.
> I think it did!  I really appreciate your taking the time to send  
> this.  The working group will get you a formal response in due course.



Received on Friday, 10 April 2009 10:35:16 UTC