- From: naudts guido <naudts_vannoten@yahoo.com>
- Date: Wed, 3 Nov 2004 08:28:29 -0800 (PST)
- To: "Jones, David H" <david.h.jones@boeing.com>, public-cwm-talk@w3.org
- Cc: naudts guido <naudts_vannoten@yahoo.com>
Hello,

I've put some remarks after your lines. I actually made two
proposals: one for existing databases and one for specific triple
stores. Triple stores receive triples and respond to queries by
sending triples. In a triple store, schemas, tables and columns are
not important: all resources are identified uniquely by their URI,
as stored in the db.

Guido.

--- "Jones, David H" <david.h.jones@boeing.com> wrote:

> This email gives additional background and
> information on a proposal for an interface between
> cwm and RDBMSs. In addition I will compare this
> proposal with ideas contributed by Guido Naudts,
> which I believe suggest a tighter integration.
>
> Motivation:
>
> The motivation for the proposal for a cwm module to
> load RDBMS data is an interoperability scenario
> where there are many heterogeneous data sources
> storing related information. This information is
> used in a variety of business processes that need to
> combine data from different sources, perform some
> reasoning/calculations, make some decision, and
> possibly update one or more of the data sources.
>
> The goal of the RDB proposal is to support the
> loading of rdb records into the cwm triple store.
> Once loaded into the store, various things could be
> done:
> - Save as n3/rdf for publishing purposes
> - Translate portions of the store to conform to one
>   or more external ontologies.
> - Do general reasoning to support semi-automated
>   task execution
> - Explicitly update the database by doing sql
>   insert/update operations.
>
> The data loaded in cwm is essentially a snapshot of
> the data source, and there is no effort to
> synchronize data between loads.
>
> There are obvious limitations in the size of the
> data that could be loaded into an in-memory triple
> store. It is assumed that it is the user's
> responsibility to load data within the constraints
> of their computer.
>
> In the next section I try to contrast differences
> between what Guido and I are envisaging:
>
> - I am assuming that a person using this builtin
> would want to see the rdb data as instances of one
> or more classes. The user provides the class name to
> handle cases where the query has a join.

I'm not sure in what form you expect the data to be returned. Do you
expect to get back a list of resources or a list of triples? Since
you speak of classes, I assume you want to receive a list of
resources?

>
> - My proposal creates property names by
> concatenating class name and column name. This
> handles collisions where two tables may have the
> same column name (a rather common occurrence). It is
> also possible to have identical table name/column
> name in different schemas of the same database.
> This could be handled in 2 ways:
>   - Prepend the schema name to the class and column
>     name
>   - Create a different connection with a
>     different base uri.
> Since this duplication in different schemas would
> be the exception rather than the rule, I would
> suggest the 2nd choice.

When handling existing db's I propose a different URI for each
table. The retrieved resources (if not constants) are prefixed with
that URI (uri:something).

>
> - My proposal is intended to support loading of the
> current triple store from rdb sources and (possibly)
> explicitly updating rdb sources from the triple
> store. I believe Guido is suggesting having an
> alternative rdb implementation of the RDFStore,
> similar to Jena.
> - In my proposal a URI is generated for each
> instance, based on the PK for the query. This
> approach is somewhat restrictive, but produces
> stable URIs which can be used for graph
> superposition and classEquivalence statements. I
> believe Guido is suggesting creating anonymous
> triples when triples are loaded from the rdb.
>
> - I am proposing using rdf/rdfs constructs to
> make results processible by a wider range of tools,
> and because owl constructs don't seem to be
> required. Guido is proposing to use owl constructs.
>

This is a good argument. On the other hand, using only rdf(s), how
will you check, e.g., whether something is really a
DatatypeProperty? I mean, control over the input is limited.
Of course, different engines can (and probably will) exist for
different purposes.

> I actually am not sure if database update is a
> reasonable goal. Ideally this could be done with
> transaction management, so consistency could be
> guaranteed when updating multiple databases. This
> seems like an unnecessarily complex feature in an
> experimental tool like cwm. As an alternative, we
> could consider an update with no transaction
> management, or simply defer implementation of any
> update until a more compelling case is made for it.
>

I certainly want the possibility of writing my triples from memory
to a db. Updating remote existing db's is maybe too much for cwm;
for a full-blown semantic web, however, it will be an absolute
necessity. Think of the example where an automatic payment is made
by your semantic web agent: you want to be sure that the payment was
really registered by the banks involved.

> In summary, my proposal has a limited scope with
> rather specific - and limited - use cases. I am
> assuming that no changes would be necessary to the
> internals of cwm.

I agree that you can achieve your proposal using only builtins,
which is not possible with mine.

> The proposal of Guido would
> implement a rdb triple store and support reasoning
> across triple stores. This would be a fairly tight
> integration of cwm and RDBMS. It is unclear to me
> if his proposal includes dynamic queries to a
> separate database.
>

I did not speak of dynamic queries but I see no problem with them.
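
To make the naming scheme from David's proposal concrete, here is a
minimal Python sketch of how one rdb row might be turned into
triples: the instance URI is derived from the primary key, and each
property name is the class name concatenated with the column name.
The function name row_to_triples, the "_" separator, the
"<base><class>/<key>" URI layout and the example.org base URI are
all assumptions made for illustration; they are not taken from the
attached rdb.n3 or from cwm itself.

```python
from urllib.parse import quote

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def row_to_triples(base_uri, class_name, pk_columns, row):
    """Map one result row (dict of column -> value) to (s, p, o) triples.

    Hypothetical helper: the URI layout and separator are assumptions.
    """
    # Build the instance URI from the primary-key values, so that
    # loading the same row twice yields the same (stable) URI.
    key = "_".join(quote(str(row[c]), safe="") for c in pk_columns)
    subject = f"{base_uri}{class_name}/{key}"

    # Type the instance with the user-supplied class name.
    triples = [(subject, RDF_TYPE, f"{base_uri}{class_name}")]

    for column, value in row.items():
        if value is None:
            continue  # SQL NULL: emit no triple for this column
        # Property name = class name + column name, which keeps equal
        # column names in different tables from colliding.
        predicate = f"{base_uri}{class_name}_{column}"
        triples.append((subject, predicate, value))
    return triples


if __name__ == "__main__":
    row = {"id": 7, "name": "Ada", "dept": None}
    for t in row_to_triples("http://example.org/hr/", "Employee", ["id"], row):
        print(t)
```

Passing a different base_uri per connection corresponds to David's
second option for identical table and column names in different
schemas; passing a different base_uri per table would approximate
the per-table URI prefix suggested for existing db's.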
> -----------------------------------------------------------------------------
> Example (with slight modification from previous
> email):
> Command line:
> Cwm rdb.n3 rdb-test.n3 --think > rdb-results.n3
>
>
> <<rdb.n3>> <<rdb-test.n3>> <<rdb-results.n3>>
>
> Regards,
>
> David H. Jones
> Boeing Phantom Works,
> Mathematics & Computing Technology
> 425-865-6924
> 425-865-2964 (FAX)
> david.h.jones@boeing.com
>
>
>
> ATTACHMENT part 2 application/octet-stream name=rdb.n3
> ATTACHMENT part 3 application/octet-stream name=rdb-test.n3
> ATTACHMENT part 4 application/octet-stream name=rdb-results.n3
Received on Wednesday, 3 November 2004 16:29:00 UTC