Re: WSDL WG request for adding multiple version extensibility into Schema 1.1 from Bijan Parsia on 2004-02-15 (www-ws-desc@w3.org from February 2004)

From: Bijan Parsia <bparsia@isr.umd.edu>
Date: Sun, 15 Feb 2004 11:59:05 -0500
To: "David Orchard" <dorchard@bea.com>
Cc: <www-ws-desc@w3.org>, "'Mark Baker'" <distobj@acm.org>
Message-Id: <434DDB38-5FD8-11D8-BFA2-0003936A0B26@isr.umd.edu>
On Feb 15, 2004, at 1:31 AM, David Orchard wrote:

> Bijan,
>
> muchos thanks for the response.  Comments inline.

You're welcome. I wish I could say "RDF and OWL solve this problem out 
of the box", but they really don't.

>> On Feb 13, 2004, at 8:54 PM, Mark Baker wrote:
>>
>>> On Fri, Feb 13, 2004 at 12:06:01PM -0800, David Orchard wrote:
[snip]
>>
>> Actually no. He requires "ignore unknowns" extensibility *with*
>> validation. If you try to validate a specific profile of RDF/XML, you
>> could have similar problems.
>>
>> Presumably, you want *RDF*, not RDF/XML per se.
>
> Well, the reason that I want "ignore unknowns" is because I know that
> "ignore unknowns" has been deployed on the web for >10 years and it
> works for versioning.  If there's another solution, I'm really really
> really interested in it.

The extra bit, perhaps, is the validation. Although required known 
fields are ubiquitous where you have ignore unknowns :)

>>> I understand that there's pushback against RDF/XML in WS circles,
>>
>> Not from me, semantic web person that I am :)
>
> and that raises my opinion of you significantly.

I'm in research, not marketing.

>>> but
>>> really, solving this problem is *exactly* what RDF was designed for.
>>
>> Actually only sort of. XML was, in part, similarly designed.
>> Insofar as both are coming from the semistructured data community
>> (which is more true for XML, actually), they tend to have been built
>> to handle such problems. XML Schema much less so. And OWL and RDF(S)
>> are 1) not *really* aiming at this and 2) have deep difficulties with
>> *data* *validation* (see current threads on public-sws-ig).
>
> Bijan, could you provide some of the examples of the difficulties?

Easily, and I did so in a reply to Mark, but here's another:

<owl:Class rdf:ID="Parent">
	<rdfs:subClassOf>
		<owl:Restriction>
			<owl:onProperty>
				<owl:ObjectProperty rdf:ID="hasChild"/>
			</owl:onProperty>
			<owl:someValuesFrom>
				<owl:Class rdf:ID="Person"/>
			</owl:someValuesFrom>
		</owl:Restriction>
	</rdfs:subClassOf>
</owl:Class>

The above definition says that Parents have at least one child. But 
what does this mean? It means if you know someone is a parent, then you 
know they have at least one child who is a Person. And you know that if 
they don't have any such children, then they aren't a parent. (Because 
this is subClassOf rather than equivalentClass, you do not get the 
converse: having a child does not by itself make someone a Parent.) But 
all that is consistent with the kb:

<Parent rdf:ID="Bob"/>

Suppose that were the only assertion in your document (i.e., that Bob 
rdf:type Parent). You can infer that there is *something* that Bob is 
related to by hasChild. That is, you can infer:

<rdf:Description rdf:about="#Bob">
	<hasChild>
		<Person/> <!--Note the blank node!!!-->
...

But how does this help you? Typically, when you validate a data record, 
you want mandatory fields to be *present* *with* values and to reject 
records that lack those fields or don't have values for those fields 
(which is, I guess, the same thing). OWL and RDF don't support that 
(except in *really* limited ways, cf. hasValue).
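
A minimal sketch of that contrast, in Python rather than any real OWL 
tooling. The dict-based kb and both function names are illustrative 
assumptions, not part of any library; the point is only that the same 
data can satisfy the entailment reading while failing the record 
validation reading.

```python
# Hypothetical toy model: a kb maps an individual to its asserted classes
# and its explicitly asserted property values. This is not an OWL reasoner.

def entailed_has_child(kb, individual):
    """Open-world reading of: Parent subClassOf (hasChild some Person).

    Being asserted a Parent entails that *some* child exists,
    even when no child is named anywhere in the kb.
    """
    return "Parent" in kb[individual]["types"]


def validates_as_record(kb, individual, required=("hasChild",)):
    """Closed-world reading: each required field must be present with a value."""
    props = kb[individual]["props"]
    return all(props.get(name) for name in required)


# The only assertion is that Bob is a Parent; no hasChild triple is given.
kb = {"Bob": {"types": {"Parent"}, "props": {}}}

print(entailed_has_child(kb, "Bob"))   # True: a child is inferred to exist
print(validates_as_record(kb, "Bob"))  # False: no explicit hasChild value
```

The first check corresponds to the blank-node inference above; the 
second is what a record validator would typically want.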

>>> If you want to give me a detailed example and the
>>> versioning/extensibility requirements, I'd be happy to do the
>>> conversion to RDF/XML.
>>
>> Won't help. And wouldn't meet the requirements anyway. I mean, if you
>> want to leave it merely wellformed XML, you solve the problem too.
>
> wellformed ain't right.  I'd like the type information for valid types.

That's my point. First, by offering RDF *alone*, Mark wasn't offering 
much more, if anything, than well formed XML alone. RDF isn't *better* 
at this (much) than well-formed XML, just different (ok, it's probably 
slightly better in some respects, but that's a bit moot; it doesn't add 
validation). So the problem of ignoring unknowns in a validation 
context is completely ignored by a cry of "Use RDF".

Adding OWL as the "schema" language might sorta seem to help, except 
that OWL is about consistency and entailment, not validation, so it 
still fails to meet the validation part.

Schematron, on the other hand, I think does the job, mostly. Though it, 
by itself, doesn't provide typing *per se*. I'm not a Schematron 
expert.

> So, what are the requirements:
> 1. Types that are valid have type information
> 2. Types that are not known do not break validation
> 3. Types allow for arbitrary extensibility in ways not predicted by the
> Version N schema author.
> 4. Types that are not known and optional can be added without breaking
> compatibility (same as #2?)
> 5. Types that are known and not allowed break validation.
>
> Assuming that these are roughly the requirements for doing compatible
> versioning, Bijan, what would the RDF/XML look like to express these
> assurances?

Can't. Not even with OWL.

Well, actually, it's tricky. But for most intents and purposes, I think 
my blunt answer is correct.

The hard bit is what "valid" means. You can check the consistency of an 
RDF knowledge base/document/graph, but what does it mean to validate 
it? *All* RDF/RDFS documents are, by default, consistent. So that 
doesn't help rule out "bad" documents. (Indeed, the results can be 
rather surprising.) There's nothing preventing an individual fluffy 
from being both a cat and a dog and a wisp of fresh air. In OWL, you 
could declare these classes disjoint which would mean that a kb in 
which fluffy were declared to be a member of all three classes would be 
inconsistent...but that doesn't tell you much about fluffy.
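
As a rough illustration of how little a consistency check tells you, 
here is a toy sketch in Python. The `is_consistent` function and the 
set-based kb are assumptions made up for this example; a real OWL 
reasoner does far more, but the shape of the answer (a bare yes or no) 
is the same.

```python
from itertools import combinations


def is_consistent(asserted_types, disjoint_pairs):
    """Return False only if some individual is asserted into two disjoint classes."""
    for types in asserted_types.values():
        for a, b in combinations(sorted(types), 2):
            if frozenset((a, b)) in disjoint_pairs:
                return False
    return True


kb = {"fluffy": {"Cat", "Dog", "WispOfFreshAir"}}

# With no disjointness axioms (the plain RDF/RDFS situation), nothing can clash.
no_axioms = set()

# Declare all three classes pairwise disjoint, as one could in OWL.
disjoint = {frozenset(p) for p in combinations(["Cat", "Dog", "WispOfFreshAir"], 2)}

print(is_consistent(kb, no_axioms))  # True
print(is_consistent(kb, disjoint))   # False
```

Even in the second case the check only reports that the kb as a whole 
is inconsistent; it does not say which assertion about fluffy is the 
wrong one.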

Aside from that (and that's a HUGE aside), I think OWL meets most of 
these criteria. But really, it's a different world. If you were to add 
transitive closure and well-foundedness (as some description logics do) 
then you would be equipped to say quite a few interesting things about 
XML documents and schemas (there is good work in this area, at least 
wrt DTDs: http://citeseer.nj.nec.com/225538.html). But OWL doesn't have 
those at the moment.

>  How about taking the V1 (name(first,last)) and V2
> (name(first,last,middle)) examples.

You can make a class "Name" whose members must have a first and last 
property. That doesn't preclude Name from also having members that have 
a middle property. You can even synthesize that class with a class 
expression, i.e., intersectionOf(Name, Restriction onProperty=middle, 
cardinality=1). So, you do have the flexibility in one direction that 
you want, but this doesn't ensure that a consistent document with a 
member of class Name will have an explicit triple with that member as 
subject, first as the property, and a specific value as the object.

This is just a sketch. I'd happily work it out in detail, if you'd like.
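
For comparison, the validation behaviour being asked for (required 
known fields must be present with values, unknown fields are ignored) 
might look roughly like the following in Python. This is a hedged 
sketch only: `V1_SCHEMA`, `validate_ignoring_unknowns`, and the sample 
values are hypothetical names invented here, and it says nothing about 
how XML Schema or OWL would express the same thing.

```python
# V1 of the schema: name(first, last). Both fields are required strings.
V1_SCHEMA = {"first": str, "last": str}


def validate_ignoring_unknowns(record, schema):
    """Check required known fields; ignore (but report) unknown ones.

    Returns (errors, ignored): a list of error strings and a sorted list
    of field names that the schema does not know about.
    """
    errors = []
    for name, expected in schema.items():
        if name not in record or record[name] in ("", None):
            errors.append(f"missing required field: {name}")
        elif not isinstance(record[name], expected):
            errors.append(f"wrong type for field: {name}")
    ignored = sorted(set(record) - set(schema))
    return errors, ignored


# A V2 document, name(first, last, middle), checked by a V1 consumer.
v2_record = {"first": "A", "last": "B", "middle": "C"}
print(validate_ignoring_unknowns(v2_record, V1_SCHEMA))
# ([], ['middle'])

# A record lacking a required field is rejected outright.
print(validate_ignoring_unknowns({"first": "A"}, V1_SCHEMA))
# (['missing required field: last'], [])
```

The second call is the case a consistency check cannot catch: a Name 
with no explicit last value is still consistent, but it is not a valid 
record.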

> And thanks for the time to educate a SW-philistine like myself.  It's
> so rare to encounter a SW person who doesn't say "drink the kool-aid"
> whenever possible that I see this as an opportunity to get educated.

I recommend the pointers to public-sws-ig that I posted in my response 
to Mark. There will be more discussions over the next couple of weeks 
as I try to figure out how to most usefully connect OWL with WSDL.

Cheers,
Bijan Parsia.
Received on Sunday, 15 February 2004 11:59:10 UTC