- From: Ted Thibodeau Jr <tthibodeau@openlinksw.com>
- Date: Tue, 8 Mar 2011 10:20:49 -0500
- To: RDB2RDF WG <public-rdb2rdf-wg@w3.org>
- Cc: Kingsley Idehen <kidehen@openlinksw.com>
All --

As we're talking about the most basic, the most common, the ultimate in "I know nothing" mapping in the Direct/Default Mapping, we must remember that this will be used against RDB schema that are not fully developed, that have not been fully considered, mapped out, planned, etc. -- as well as fully-fledged, heavily-vetted enterprise application-serving schema.

Thus -- they will change. (Even the long-lived ones will change, given enough time -- or just inconvenient timing of interactions.)

This means that *all* generated URIs, whether for Classes or Relationships or Attributes or Entities, will change -- Cool URIs notwithstanding. Even RowID-based URIs may change over time, due to DBMS migration from hardware to hardware, depending on the methods used.

Thus, any time an RDB2RDF Ontology (i.e., a Direct Mapping Graph, the T-box) is generated from an RDB Schema -- and further, any time instance data is generated (that is, triples involving Entities described by that RDB, i.e., an Instance Data Graph -- the A-box) -- these Graphs and URIs *must* be treated as temporary.

(In the end, the transformation here is not so much RDB to RDF, as it is RDB to DDB -- Relational Database to Deductive Database.)

However -- once transformed to RDF, Cool URIs are strongly to be desired. Making everything a bNode is *not* helpful to Linked Data, nor any other long-term use of EAV+CR or RDF.

It is useful to be able to say "this thing was once described thus, and now is described so." It is useful to be able to say *when* the original description was retrieved, or accurate, or asserted -- and the same about the *new* description.

When this is done well, and as SPARQL and the Linked Data Web mature -- you will be able to say "DESCRIBE <URI> AS OF <date>" -- and get whatever was "known" about that thing as of that moment.

You will also be able to ask things like "Who has lived at <address>?" or "Where has <person> lived?" or "who has owned <property>?"
or "How have the Top 25 Shareholders and Board Members of the Fortune (10, 100, 500, 1000) been interconnected over the past (10, 25, 50, 100) years?"

These are not the sorts of questions you can easily ask of RDBMS through SQL -- and this is part of why we want to transform the data which is now found (and should remain, for many purposes!) in those RDBMS, or at least how we interact with it.

Ontologies are like source code, in many ways. Versions happen. Instance data, likewise. Everything is naturally found within some context -- and that context must be taken into consideration when you change your observational perspective.

Imagine you look at an apple, and describe it today. (Red, 137 grams, 63 cubic centimeters, etc.) Now wait a month ... or a year. Describe it again. (Brown, 35 grams, 38 cubic centimeters, etc.)

It's the same apple. RFID tag proves it. It's the same entity; it should be referred to by the same name/URI. But the information about it has changed. Neither description is *wrong*, *if* those AV pairs (or the graphs holding them) have time-stamp data somehow associated with them.

The same hypothetical applies to RDB T-box and A-box information, and likewise to RDF T-box and A-box information.

Many things last a long time, and need to be described several times at different points in their "lives." We don't always know what things have such long lives, and what things don't -- so it's best to be able to always refer to the same Entity by a definite Identifier (URI) -- even if that URI has no real meaning when it's originally minted, and even if at some point you come up with an inherently meaningful URI -- because owl:sameAs and similar special Relationships can be used to draw necessary connections over time. But these connections *cannot* be drawn when entirely ephemeral bNodes are used for those Entities, or Attributes, etc.
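
To make the apple example concrete, here is a minimal sketch in plain Python (not RDF or SPARQL). It mints one UUID-based identifier, attaches two time-stamped descriptions to it, and answers an "as of" question. The namespace `http://example.org/id/`, the `History` class, the attribute names, and the dates are all assumptions made up for illustration; nothing here is part of any RDB2RDF draft, and "DESCRIBE ... AS OF" is only emulated, not implemented.

```python
import uuid
from bisect import bisect_right
from datetime import date

BASE = "http://example.org/id/"  # hypothetical namespace, for illustration only


def mint_uri() -> str:
    """Mint a UUID-based identifier that need not carry any meaning."""
    return BASE + str(uuid.uuid4())


class History:
    """Keeps every time-stamped description of an entity, keyed by its URI."""

    def __init__(self):
        self._snapshots = {}  # uri -> list of (date, description), kept sorted by date

    def assert_description(self, uri, as_of, description):
        snaps = self._snapshots.setdefault(uri, [])
        snaps.append((as_of, dict(description)))
        snaps.sort(key=lambda snap: snap[0])

    def describe(self, uri, as_of):
        """Return the latest description asserted on or before `as_of`, or None."""
        snaps = self._snapshots.get(uri, [])
        dates = [d for d, _ in snaps]
        i = bisect_right(dates, as_of)
        return snaps[i - 1][1] if i else None


# Same apple, same identifier, two descriptions a month apart (dates are made up).
apple = mint_uri()
history = History()
history.assert_description(
    apple, date(2011, 3, 8), {"color": "red", "mass_g": 137, "volume_cm3": 63}
)
history.assert_description(
    apple, date(2011, 4, 8), {"color": "brown", "mass_g": 35, "volume_cm3": 38}
)
```

Calling `history.describe(apple, date(2011, 3, 20))` returns the "red" description, and a later date returns the "brown" one. Neither overwrites the other, which only works because the apple keeps a single stable identifier across both assertions.
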

UUID-based dereferenceable URIs are fine for such purposes as bNodes have often been used -- because UUIDs are persistent over time, and each UUID can be forced to only ever refer to a single entity. bNodes cannot have such restrictions placed on them ... and therein lies their doom.

I hope this starts to clarify what I've been talking about in our concalls. But please feel free to ask for more, or raise objections to anything you don't agree with. Discussion is usually helpful.

Regards,

Ted

--
A: Yes.                      http://www.guckes.net/faq/attribution.html
| Q: Are you sure?
| | A: Because it reverses the logical flow of conversation.
| | | Q: Why is top posting frowned upon?

Ted Thibodeau, Jr.           //               voice +1-781-273-0900 x32
Evangelism & Support         //        mailto:tthibodeau@openlinksw.com
                             //              http://twitter.com/TallTed
OpenLink Software, Inc.      //              http://www.openlinksw.com/
         10 Burlington Mall Road, Suite 265, Burlington MA 01803
                                 http://www.openlinksw.com/weblogs/uda/
OpenLink Blogs              http://www.openlinksw.com/weblogs/virtuoso/
                               http://www.openlinksw.com/blog/~kidehen/
    Universal Data Access and Virtual Database Technology Providers
Received on Tuesday, 8 March 2011 15:21:19 UTC