RE: data smushing

David,

> The dc:title properties are literals, but the things identified as
> http://foo.com/xxx and http://bar.com/yyy are entities (what RDF calls
> "resources").  Are they both trying to describe the same physical
> person, or not?

Ah, those two. Not knowable without a mapping, either explicit or implicit.
Some good heuristics sure, but heuristics aren't good enough for some
domains and actions. Even a person in a Chinese room would expect you to
specify that, or at least give a hint :) There isn't a general solution to
this (speak up now if you know otherwise) since there isn't a general
solution for dereferencing URIs. In this case, if the user story is to
identify the people being described by resources, then I'd say the data and
processing models need tuning.


> From my (moderate) experience, this seems to be
> problemo numero uno for data companies, and it will get even trickier
> when we move from closed databases to the open Web.

I don't dispute this at all. But I do believe good data models and
conforming processors will take you a long way. I think a lot of the
perceived problems with RDF in this area are a result of the fact that RDF
is a data model constantly discussed without reference to a procedural model
(though sometimes with eloquent handwaving). You clearly need both ("A Critique of
Pure Reason" by Drew McDermott is pretty much the last word on this). But
since you came up with SAX, I guess you appreciate this already :) It's
interesting and good that XML Topic Maps is specifying a processing model
(http://www.topicmaps.org/xtm/1.0/xtmp1.html). 

In the medium run, I think you'll need and want directories that bind alias
resources to names in order to establish these equivalences. It's not exactly
cool but it'll work for the most part. I can't see any of this stuff
(syntactic, semantic or pragmatic web) working without directory services. 
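
To make the directory idea a bit more concrete, here is a minimal sketch in
Python. It assumes the directory is nothing more than a table of explicit
alias bindings kept as a union-find structure. The AliasDirectory class, its
method names and the http://baz.com/zzz URI are hypothetical, made up for
illustration, and not taken from any RDF toolkit.

```python
class AliasDirectory:
    """Hypothetical directory that binds alias URIs to one canonical name."""

    def __init__(self):
        # Maps each known URI to its parent; a root maps to itself.
        self._parent = {}

    def _find(self, uri):
        # Register unseen URIs as their own canonical name.
        self._parent.setdefault(uri, uri)
        root = uri
        while self._parent[root] != root:
            root = self._parent[root]
        # Path compression: point every URI on the way directly at the root.
        while uri != root:
            next_uri = self._parent[uri]
            self._parent[uri] = root
            uri = next_uri
        return root

    def bind(self, uri_a, uri_b):
        """Record an explicit statement that both URIs name the same thing."""
        root_a, root_b = self._find(uri_a), self._find(uri_b)
        if root_a != root_b:
            self._parent[root_b] = root_a

    def same(self, uri_a, uri_b):
        """True only if some chain of explicit bindings links the two URIs."""
        return self._find(uri_a) == self._find(uri_b)


directory = AliasDirectory()
directory.bind("http://foo.com/xxx", "http://bar.com/yyy")
print(directory.same("http://foo.com/xxx", "http://bar.com/yyy"))  # True
print(directory.same("http://foo.com/xxx", "http://baz.com/zzz"))  # False
```

Nothing is treated as equivalent unless somebody has said so: the processor
never guesses, it only follows bindings that were put into the directory.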


> Even if I used resource properties, you couldn't be sure: more than
> one person can have the same mother (for example).

How so? That's just two resources bound by a relationship (which is probably
also a resource with constraints I hope :). If some resource has two
mothers, or your processor disallows one-to-many relationships, well... This
stuff can be done for simpler domains and relationships. If you have time,
take a look at the proposed conceptual graphs ANSI standard
<http://www.bestweb.net/~sowa/cg/cgstand.htm> from John Sowa (the ontology
stuff on his site is also interesting). CGs are good for this kind of work
and can then be serialized into RDF ...
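
As a rough sketch of what a constraint on the relationship could look like
(in Python, and assuming statements are held as plain subject/predicate/object
tuples): two resources sharing a mother is fine, while one resource with two
mothers gets flagged. The "mother" property name, the example.org URIs and
the cardinality_violations helper are all hypothetical.

```python
from collections import defaultdict

# Hypothetical statements: two resources that share the same mother.
TRIPLES = [
    ("http://foo.com/xxx", "mother", "http://example.org/alice"),
    ("http://bar.com/yyy", "mother", "http://example.org/alice"),
]


def cardinality_violations(triples, prop, max_values=1):
    """Return subjects with more than max_values distinct values for prop."""
    values = defaultdict(set)
    for subject, predicate, obj in triples:
        if predicate == prop:
            values[subject].add(obj)
    return {s: sorted(v) for s, v in values.items() if len(v) > max_values}


# Sharing a mother breaks no constraint, and merges nothing either.
print(cardinality_violations(TRIPLES, "mother"))  # {}

# A second, different mother for the same resource is reported.
extra = TRIPLES + [("http://foo.com/xxx", "mother", "http://example.org/carol")]
print(cardinality_violations(extra, "mother"))
```

The check only looks at how many values a subject has for the property. It
says nothing about whether the two subjects are the same person, which is
the point: the shared mother is just a relationship between resources.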

-Bill

-----
Bill de hÓra  :  InterX  :  bdehora@interx.com

 

Received on Wednesday, 3 January 2001 10:06:45 UTC