Detection, where? How? (was Re: Allowed types of punning (ISSUE-114))

On 10 Jul 2008, at 12:40, Bijan Parsia wrote:
> On 10 Jul 2008, at 12:25, Rob Shearer wrote:
>
>> I believe I might have been the person who raised the initial  
>> issue about object-property/data-property punning,
>
> I don't think so, but we'd have to go back to the archives. See  
> Peter's prior message.
>
>> but was not a party the process of fixing the problem. I continue  
>> to feel that such punning is a) not useful, and b) extremely  
>> likely to cause (and exacerbate existing) user error. User  
>> confusion between object and data properties is quite common in  
>> practice, and right now misuse can be detected as a syntax error.  
>> Allowing such punning would make these common errors completely  
>> undetectable
>
> This isn't true at all. It's perfectly detectable, just not a  
> syntax error. You can detect it syntactically and a good lint tool  
> should do exactly that.
[snip]

I'd like to make perfectly clear that I did not intend this as a  
"bravo!" point. There are all sorts of ways to handle certain aspects  
of the language that we think, for a variety of reasons, are  
undesirable. We have to balance education, usability, utility, and,  
yes, marketing, mindshare, and outreach considerations.

A syntactic error is generally, in the current ecosystem of RDF, XML,  
and OWL (unlike, say, HTML, CSS, or JavaScript), a catastrophic error.  
That is, it generally prevents you from doing anything with the file  
until you fix the error. Indeed, currently, it can prevent you from  
opening the document in an ontology development environment.

An advantage of this is that you're forced to deal with errors.  
Sometimes, at least, figuring out the problem is a bit easier (than  
what? well, certainly than silent repairs that alter the meaning). A  
big disadvantage is that (catastrophic) syntax errors are often hard  
to deal with and, in some sense, trivial. They certainly reduce the  
tools available to you (often you can't run the file through the  
reasoner; if you don't care about the "erroneous" bits, that's a real  
pain).

Let's call a strict semantic error something that "crashes" a class  
or the ontology, i.e., an unsatisfiability or contradiction. Again,  
easy detection, and you can't not fix them (if you want to use that  
bit). But it's not always the case that contradictory information is  
an error (hence paraconsistent logics, argument systems, etc.), and  
it's certainly not the case that it's always best to deal with it  
right away. Unsatisfiabilities are nicer because they don't  
obliterate your whole ontology (the tool infrastructure is going to  
get better on contradictions, of course). But these are often much  
harder to grasp than (already tricky) syntax errors. So, for some  
things (e.g., having "rob"^^xsd:int) we might well, overall, prefer  
to make it a syntax error since it's easy to detect, report, and  
repair (though, I still may not want to repair it because I don't  
know what the author intended).
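
To illustrate why something like "rob"^^xsd:int is cheap to detect and
report, here is a minimal sketch of a lexical check. It assumes the
usual XML Schema rules for xsd:int (optional sign, decimal digits,
32-bit range, surrounding whitespace collapsed); the function name
is_valid_xsd_int is hypothetical and not taken from any OWL tool.

```python
import re

# Lexical space of xsd:int: an optional sign followed by ASCII digits.
_INT_LEXICAL = re.compile(r"[+-]?[0-9]+")


def is_valid_xsd_int(lexical: str) -> bool:
    """Return True if `lexical` is a legal lexical form for xsd:int.

    Hypothetical helper for illustration only.
    """
    # xsd:int collapses whitespace, so leading/trailing XML whitespace is ignored.
    candidate = lexical.strip(" \t\r\n")
    if not _INT_LEXICAL.fullmatch(candidate):
        return False
    # The value space of xsd:int is the signed 32-bit range.
    return -2**31 <= int(candidate) <= 2**31 - 1


if __name__ == "__main__":
    for form in ["42", "rob", "2147483648"]:
        status = "ok" if is_valid_xsd_int(form) else "ill-typed literal"
        print(f'"{form}"^^xsd:int: {status}')
```

Note that the check only says the literal is ill-formed; it cannot say
what the author meant, which is why a tool might report it without
attempting a repair.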

Finally, we have "soft" semantic errors. That is, things which  
typically don't mean what people intend them to mean, or which people  
otherwise get wrong, and which may or may not cause problems for the  
ontology. I.e., they are "legal" but typically not useful. This is  
where lint tools and warnings play a role. (These also are on the rise  
in the OWL infrastructure, for good reason.)

Finally, we have to consider that we're in a very complex stack of  
technology. There can be XML syntax errors, RDF syntax errors, and OWL  
DL syntax (species) errors. Few tools integrate all of them, and they  
often have a very different flavor (e.g., a non-simple role in a  
cardinality restriction). I do hope that tools get better and make  
checking an OWL file closer to a "one pass" experience (even if it  
has two aspects, errors and warnings) rather than the multiple passes  
it generally is now. Furthermore, we have a flavor of OWL, OWL Full,  
which takes all RDF graphs as syntactically legal. (That, itself, is  
hugely confusing :()

I don't think it's ideal to make any sort of punning either a  
syntactic or a semantic error in the current, or near foreseeable/ 
achievable, ecosystem. This means that we need to handle certain  
problematic constructs at different levels. I think encouraging  
"strict" vs. "lax" modes of warning, for example, will result overall  
in a better experience for people coming from RDF land.
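
As a rough sketch of what "strict" vs. "lax" could look like for
object/data property punning: the check below is purely syntactic and
only changes how a finding is reported. The input format (a list of
(IRI, kind) pairs), the function names, and the ex: IRIs are all
assumptions made for illustration, not the API of any existing tool.

```python
from collections import defaultdict


def find_punned_properties(declarations):
    """Return IRIs declared as both an object and a data property.

    `declarations` is an iterable of (iri, kind) pairs, where kind is a
    string such as "ObjectProperty" or "DataProperty" (assumed format).
    """
    kinds = defaultdict(set)
    for iri, kind in declarations:
        kinds[iri].add(kind)
    return sorted(
        iri
        for iri, seen in kinds.items()
        if {"ObjectProperty", "DataProperty"} <= seen
    )


def lint(declarations, strict=False):
    """Report punned properties as warnings (lax) or errors (strict)."""
    severity = "error" if strict else "warning"
    return [
        (severity, f"{iri} is used as both an object and a data property")
        for iri in find_punned_properties(declarations)
    ]


if __name__ == "__main__":
    decls = [
        ("ex:hasAge", "DataProperty"),
        ("ex:hasAge", "ObjectProperty"),
        ("ex:knows", "ObjectProperty"),
    ]
    for severity, message in lint(decls):
        print(f"{severity}: {message}")
```

The detection is the same in both modes; only the severity differs, so
the file can still be loaded and reasoned over in lax mode.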

(Forbidding object/data, class/type punning seems reasonable to me  
because the given solutions to the technical problems of context in  
RDF were heavyweight for RDF users (lots of new vocabulary). Even  
lighter weight solutions (like urn:owl:data disambiguation) are fairly  
costly. Whether they are, in the end, more costly than they are  
worth (considering things like representation evolution and Dublin  
Core-like practices) is an open question to me.)

Cheers,
Bijan.

Received on Thursday, 10 July 2008 12:24:16 UTC