- From: Bijan Parsia <bparsia@isr.umd.edu>
- Date: Sun, 15 Feb 2004 11:59:05 -0500
- To: "David Orchard" <dorchard@bea.com>
- Cc: <www-ws-desc@w3.org>, "'Mark Baker'" <distobj@acm.org>
On Feb 15, 2004, at 1:31 AM, David Orchard wrote:

> Bijan,
>
> muchos thanks for the response. Comments inline.

You're welcome. I wish I could say "RDF and OWL solve this problem out of the box", but they really don't.

>> On Feb 13, 2004, at 8:54 PM, Mark Baker wrote:
>>
>>> On Fri, Feb 13, 2004 at 12:06:01PM -0800, David Orchard wrote:
[snip]
>>
>> Actually no. He requires "ignore unknowns" extensibility *with*
>> validation. If you try to validate a specific profile of RDF/XML, you
>> could have similar problems.
>>
>> Presumably, you want *RDF*, not RDF/XML per se.
>
> Well, the reason that I want "ignore unknowns" is because I know that
> "ignore unknowns" has been deployed on the web for >10 years and it
> works for versioning. If there's another solution, I'm really really
> really interested in it.

The extra bit, perhaps, is the validation. Although required known fields are ubiquitous where you have ignore unknowns :)

>>> I understand that there's pushback against RDF/XML in WS circles,
>>
>> Not from me, semantic web person that I am :)
>
> and that raises my opinion of you significantly.

I'm in research, not marketing.

>>> but
>>> really, solving this problem is *exactly* what RDF was designed for.
>>
>> Actually only sort of. XML was, in part, similarly designed. Insofar as
>> both are coming from the semistructured data community (which is more
>> true for XML, actually), they tend to have been built to handle such
>> problems. XML Schema much less so. And OWL and RDF(S) are 1) not
>> *really* aiming at this and 2) have deep difficulties with *data*
>> *validation* (see current threads on public-sws-ig).
>
> Bijan, could you provide some of the examples of the difficulties?
Easily, and I did so in a reply to Mark, but here's another:

    <owl:Class rdf:ID="Parent">
      <rdfs:subClassOf>
        <owl:Restriction>
          <owl:onProperty>
            <owl:ObjectProperty rdf:ID="hasChild"/>
          </owl:onProperty>
          <owl:someValuesFrom>
            <owl:Class rdf:ID="Person"/>
          </owl:someValuesFrom>
        </owl:Restriction>
      </rdfs:subClassOf>
    </owl:Class>

The above definition says that Parents have at least one child. But what does this mean? It means if you know someone is a parent, then you know they have at least one child. You also know that if someone has a child, then they are a parent. And you know that if they don't have any children, then they aren't a parent. But all that is consistent with the kb:

    <Parent rdf:ID="Bob"/>

Suppose that were the only assertion in your document (i.e., that Bob rdf:type Parent). You can infer that there is *something* that Bob is related to by hasChild. That is, you can infer:

    <rdf:Description rdf:about="#Bob">
      <hasChild>
        <Person/> <!--Note the blank node!!!-->
    ...

But how does this help you? Typically, when you validate a data record, you want mandatory fields to be *present* *with* values and to reject records that lack those fields or don't have values for those fields (which is, I guess, the same thing). OWL and RDF don't support that (except in *really* limited ways, cf. hasValue).

>>> If you want to give me a detailed example and the
>>> versioning/extensibility requirements, I'd be happy to do the
>>> conversion to RDF/XML.
>>
>> Won't help. And wouldn't meet the requirements anyway. I mean, if you
>> want to leave it merely wellformed XML, you solve the problem too.
>
> wellformed ain't right. I'd like the type information for valid types.

That's my point. First, by offering RDF *alone*, Mark wasn't offering much more, if anything, than well-formed XML alone. RDF isn't *better* at this (much) than well-formed XML, just different (ok, it's probably slightly better in some respects, but that's a bit moot; it doesn't add validation).
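The gap between entailment and validation in the Parent/Bob example can be sketched in a few lines of Python. This is only an illustration, not an OWL reasoner: triples are plain tuples, and the helper names (`parents`, `has_asserted_child`, `entail`, `missing_children`) are hypothetical. `entail` mimics the open-world reading by inventing a blank node, while `missing_children` applies the closed-world, record-style check that a validator would want.

```python
from itertools import count

# Triples are (subject, predicate, object). The only asserted fact is that
# Bob is a Parent, mirroring the one-line kb in the example.
KB = {("Bob", "rdf:type", "Parent")}


def parents(kb):
    """Return every subject explicitly typed as Parent."""
    return {s for (s, p, o) in kb if p == "rdf:type" and o == "Parent"}


def has_asserted_child(kb, subject):
    """True if the kb contains an explicit hasChild triple for subject."""
    return any(s == subject and p == "hasChild" for (s, p, o) in kb)


def entail(kb):
    """Open-world reading: each Parent has *some* child, so add a blank
    node for any Parent without an asserted one. Nothing is rejected."""
    fresh = count()
    inferred = set(kb)
    for person in sorted(parents(kb)):
        if not has_asserted_child(kb, person):
            blank = f"_:b{next(fresh)}"
            inferred.add((person, "hasChild", blank))
            inferred.add((blank, "rdf:type", "Person"))
    return inferred


def missing_children(kb):
    """Closed-world check: list Parents with no explicit hasChild value."""
    return sorted(p for p in parents(kb) if not has_asserted_child(kb, p))


if __name__ == "__main__":
    print(sorted(entail(KB)))      # the kb plus an inferred blank-node child
    print(missing_children(KB))    # the record-style check flags Bob
```

The same kb is acceptable under the first reading and incomplete under the second, which is the distinction the example turns on.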
So the problem of ignoring unknowns in a validation context is completely ignored by a cry of "Use RDF". Adding OWL as the "schema" language might sorta seem to help, except that OWL is about consistency and entailment, not validation, so it still fails to meet the validation part.

Schematron, on the other hand, I think does the job, mostly. Though it, by itself, doesn't provide typing *per se*. I'm not a Schematron expert.

> So, what are the requirements:
> 1. Types that are valid have type information
> 2. Types that are not known do not break validation
> 3. Types allow for arbitrary extensibility in ways not predicted by the
>    Version N schema author.
> 4. Types that are not known and optional can be added without breaking
>    compatibility (same as #2?)
> 5. Types that are known and not allowed break validation.
>
> Assuming that these are roughly the requirements for doing compatible
> versioning, Bijan, what would the RDF/XML look like to express these
> assurances?

Can't. Not even with OWL. Well, actually, it's tricky. But for most intents and purposes, I think my blunt answer is correct.

The hard bit is what "valid" means. You can check the consistency of an RDF knowledge base/document/graph, but what does it mean to validate it? *All* RDF/RDFS documents are, by default, consistent. So that doesn't help rule out "bad" documents. (Indeed, the results can be rather surprising.) There's nothing preventing an individual fluffy from being both a cat and a dog and a wisp of fresh air. In OWL, you could declare these classes disjoint, which would mean that a kb in which fluffy were declared to be a member of all three classes would be inconsistent...but that doesn't tell you much about fluffy.

Aside from that (and that's a HUGE aside), I think OWL meets most of these criteria. But really, it's a different world.
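For contrast, the record-style validation that the quoted requirements describe is easy to state outside RDF/OWL. The following is a minimal Python sketch under stated assumptions: the field names in `REQUIRED` and `FORBIDDEN` and the `validate` function are hypothetical, chosen only for illustration. Known required fields must be present with a value, known disallowed fields are rejected, and anything unknown is ignored.

```python
# Hypothetical version-N schema: these field names are assumptions made
# for illustration, not taken from any real schema.
REQUIRED = {"first", "last"}   # known and mandatory
FORBIDDEN = {"password"}       # known and not allowed


def validate(record):
    """Return a list of problems; an empty list means the record is valid.

    Fields that are neither required nor forbidden are unknown to this
    schema version and are ignored, so later versions may add them.
    """
    problems = []
    for field in sorted(REQUIRED):
        # A required field must be present *and* carry a value.
        if record.get(field) in (None, ""):
            problems.append(f"missing required field: {field}")
    for field in sorted(FORBIDDEN.intersection(record)):
        problems.append(f"field not allowed: {field}")
    return problems


if __name__ == "__main__":
    print(validate({"first": "Bob", "last": "Smith"}))
    print(validate({"first": "Bob", "last": "Smith", "middle": "Q"}))
    print(validate({"first": "Bob"}))
```

A record carrying an extra unknown field passes exactly as the plain one does, while a record lacking a required value is rejected rather than completed by inference.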
If you were to add transitive closure and well-foundedness (as some description logics do) then you would be equipped to say quite a few interesting things about XML documents and schemas (there is good work in this area, at least wrt DTDs: http://citeseer.nj.nec.com/225538.html). But OWL doesn't have those at the moment.

> How about taking the V1 (name(first,last)) and V2
> (name(first,last,middle)) examples.

You can make a class "Name" whose members must have a first and last property. That doesn't preclude Name from also having members that have a middle property. You can even synthesize that class with a class expression, i.e., intersectionOf(Name, Restriction onProperty=middle, cardinality=1). So, you do have the flexibility in one direction that you want, but this doesn't ensure that a consistent document with a member of class Name will have an explicit triple with that member as subject, first as the property, and a specific value as the object.

This is just a sketch. I'd happily work it out in detail, if you'd like.

> And thanks for the time to educate a SW-philistine like myself. It's so
> rare to encounter a SW person who doesn't say "drink the kool-aid"
> whenever possible that I see this as an opportunity to get educated.

I recommend the pointers to public-sws-ig that I posted in my response to Mark. There will be more discussions over the next couple of weeks as I try to figure out how to most usefully connect OWL with WSDL.

Cheers,
Bijan Parsia.
Received on Sunday, 15 February 2004 11:59:10 UTC