- From: Juan Sequeda <juanfederico@gmail.com>
- Date: Thu, 4 Nov 2010 08:06:03 -0500
- To: "Eric Prud'hommeaux" <eric@w3.org>
- Cc: Richard Cyganiak <richard@cyganiak.de>, Marcelo Arenas <marcelo.arenas1@gmail.com>, RDB2RDF WG <public-rdb2rdf-wg@w3.org>
- Message-ID: <AANLkTikQecBJi=SqHux+qXfxhNwR8_Cm6VPKJ5xUipf0@mail.gmail.com>
Eric, http://www.w3.org/2001/sw/rdb2rdf/directGraph/ is not working Juan Sequeda +1-575-SEQ-UEDA www.juansequeda.com On Thu, Nov 4, 2010 at 12:44 AM, Eric Prud'hommeaux <eric@w3.org> wrote: > Per Richard's suggestion, I've incorporated most of Richard's comments > into Rev 1.45 of http://www.w3.org/2001/sw/rdb2rdf/directGraph/ . > > > * Juan Sequeda <juanfederico@gmail.com> [2010-11-02 14:19-0500] > > Richard, > > > > Great comments! Thanks a lot. I'm glad you took the time to read the > whole > > thing. You even found typos, which means that you really took a > magnifying > > glass with you. > > > > Looking forward to other comments. > > > > Marcelo and I just had a long meeting and we discussed each of your > points. > > Comments are inline. > > > > > > On Tue, Nov 2, 2010 at 9:10 AM, Richard Cyganiak <richard@cyganiak.de > >wrote: > > > > > Marcelo, Eric, > > > > > > I am commenting on Section 2 of > > > http://www.w3.org/2001/sw/rdb2rdf/directGraph/alt > > > $Id: alt.xml,v 1.2 2010/10/30 03:39:01 marenas Exp $ > > > > > > First of all, great work! Looks like we almost have an FPWD here. > > > > > > Detailed comments are below. A lot of it is editorial, but there are > some > > > substantial comments too, as well as some pointers to oversights. I > would > > > appreciate if each comment could be a) addressed in the text, or b) > > > reflected as an @@Issue in the text, or c) replied to in a response to > this > > > email, or d) turned into an Issue in the W3C tracker. > > > > > > > I'm commenting everything inline in this email. Some things would need to > be > > turned into Issues in the W3C tracker (how to do this?) > > > > > > > > > > > Stem URI this should be called base URI , because that's a > commonly > > > understood term, and it enables the explanation of URI generation as > > > resolution of a relative URI against a base URI. > > > > > > > Ok. We will change this. > > I prefer to keep stem URI distinct from the relative URI, at least for > FPWD. base, per 2396 <http://tools.ietf.org/html/rfc2396#section-1.4> > has a specific behavoir, for example, a relative URI <People/ID.7#_> > resolved against a base URI of <http://foo.example/DB> yields > <http://foo.example/People/ID.7#_>. If we remedy that with a leading > slash, </People/ID.7#_>, we get something resolved against the root. > > > > > The document should use SQL terminology throughout. Relation, attribute > and > > > tuple should be table, column and row, etc. > > > > > > > Ok. We will change this. > > I think this is addressed §1-3. §4 uses a more traditional > terminology, but relates it to SQL terms with the following text: > [[ > There are many models for databases in SQL literature; because the > Direct Mapping does not rely on column position, we use a model which > assumes a 1:1 correspondance between attribute (column name) and > value, i.e. a map. > Starting with a traditional model of a relational database we define a > Relation (a table) which has a name, a Header, Body and > primary/foreign key details. > The Body contains maps from attribute names to values and the Header > provides the datatypes to interpret those values. > ]] > > editorial suggestions? > > > > > The approach in Section 2 defines URIs for columns and rows, but not > for > > > tables. This means one has to use hacks to do a SPARQL query for all > records > > > in a given table. The approach needs to define URIs for tables as well, > and > > > associate each row with the table it is from. > > > > > > > If we understand correctly, we would need to create IRIs for Tables. > Hence, > > there would be now three types of IRIs: Tuple, Columns and Tables. > However, > > if we are to create Table IRIs, then we also need to create a new type of > > triples: Table Triples: > > > > <TupleIRI, rdf:type, Table IRI> > > > > Do we agree? > > @@ will take a bit of work @@ > > > > > From the current description, it is impossible to work out how URIs for > > > rows with multi-column primary keys would look like. What order? What > > > separator characters? > > > > > > > The order is the same order of the columns in the table. Marcelo and I > have > > the following proposal for creating IRIs: > > > > Table IRI > > > > baseURI/table i.e baseURI/person > > > > Column IRI > > > > baseURI/table/column i.e baseURI/person/name > > > > Multicolumn IRI > > > > baseURI/table/column1#column2#... i.e. baseURI/person/fname#lname > > > > Tuple IRI > > > > baseURI/table/column1:value i.e baseURI/person/id:12 > > > > Multicolumn Tuple IRI > > > > baseURI/table/column1:value1#column2:value2#... i.e > > baseURI/person/fname:Juan#lname:Sequeda > > > > > > This is our proposal. However, we are not aware of the best practices for > > IRIs. I propose that we open an Issue on "how to generate Table, Tuple > and > > Column IRIs" > > http://www.w3.org/2001/sw/rdb2rdf/directGraph/#rules > > now has text to specify the construction of IRIs in the predicate > position and the subject/object position. Editorial suggestions > welcome! (I'm not super-confident that there isn't a clearer way to > express this.) > > http://www.w3.org/2001/sw/rdb2rdf/directGraph/#multi-key reinforces > that with: > [[ > • The predicate for this key is formed from the stem and > "deptName_deptCity", > reflecting the order of the column names in the foreign key. > ]] > > > > I am uncomfortable with the use of the dot character as a separator in > > > generated URIs. The character typically used in URIs to indicate a > > > hierarchical relationship is "/". The character typically used to > indicate > > > key-value pairs is "=". > > > > > > > See previous comment. We should open Issue on creating IRIs and discuss > this > > in group. > > accepted s/=/./ > > > > > > > > I am uncomfortable with the use of "#_" at the end of row URIs. I > cannot > > > see any precedent for that, so I cannot call it good practice. It is > also > > > unnecessary because the URI identifies a row in a database table and > never a > > > person/address/organization or whatever other real-world object. Rows > in > > > database tables are information resources and thus there is no problem > at > > > all with identifying them using a plain fragment-less URI. > > > > > > This is what Eric had originally. So he can comment on his decision. > Again, > > we should create an Issue on this. > > I was following <http://www.w3.org/DesignIssues/RDB-RDF>, but decided to > shorten "personnel/employees/1234#item" to "personnel/employees/1234#_". > The choice of hash vs. slash is a real issue. I added an issue > [[ > hash-vs-slash: This edition of this document presumes slash > identifiers. LOD data identifiers tend to use slash, but that slightly > increases implementation burden and round trips. > ]] > http://www.w3.org/2001/sw/rdb2rdf/directGraph/#hash-vs-slash > > > > > Special characters in table names, column names and PK values need to > be > > > handled in the URI generation. > > > > > > > Again... create an issue on this ( I sound like a broken record) > > http://www.w3.org/2001/sw/rdb2rdf/directGraph/#rules states that table > names, attribute names and attribute lexical values are url-encoded > per http://www.w3.org/TR/wsdl#_http:urlEncoded . > > > > 2.2 says: with an XML Schema datatype corresponding to the SQL > datatype of > > > that column . That obviously needs to be spelled out. There should be > an > > > extra section to this, Mapping SQL Datatypes to RDF Literals or > something > > > like that. The section can be a placeholder for FPWD, but should exist. > > > > > > > Yes. There needs to be a table with the corresponding mapping > > added [[ > ...the pairs of url-encoded column name '=' url-encoded column value, > separated by a '_'. The column value is the lexical per SQL99 [SQL99]. > ]] > > > > I do not find the visual notation for unique keys and foreign keys > > > particularly clear. How about simply listing them underneath the table? > > > Foreign key: addr -> Addresses.ID > > > > > > > Could we consider taking away the visual notation for keys, and just have > > the table with data. We would also put in the SQL DDL and I'm wondering > if > > this would be enough? > > I added an bit of a key before the first example. The "empty primary > key" example was, I believe, the worst of the lot. After struggling a > bit with a notation, I gave up and copied the XSD from > > https://dvcs.w3.org/hg/FeDeRate/file/060df0861705/directmapping/src/test/scala/DirectMappingTest.scala#l105 > into > http://www.w3.org/2001/sw/rdb2rdf/directGraph/#ref-no-pk > > > > > > > > You write foreign keys as if they reference another *key*. I believe > that > > > doesn't reflect SQL. Foreign keys reference other *columns*. That's the > > > mental model that a reader is going to have in their head, and that's > how it > > > should be presented in the spec. > > > > > > > I agree. To make sure, what we currently have for example Address.PK, and > we > > know that the PK of Address is ID, it should then be Address.ID (or > > something like that). Is that what you mean? > > Actually, I disagree; foreign keys specifically reference candidate > keys in other tables. If the system does not enforce that, and the > data in foreign keys matches more than 1 row in the referenced table, > then we have have a pretty different graph to represent. My temptation > is to start out conservative, and if we have energy and mandate, > represent these cases which I believe are non-compliant. > > [[ > The columns in the referencing table must be the primary key or other > candidate key in the referenced table. > ]] — http://en.wikipedia.org/wiki/Foreign_key ¶1 > > > > > The use of "_" as a separator between the column/value pairs in > > > multi-column PK row URIs is a bad idea, because the underscore > character is > > > ubiquitous in table and column names. An obvious replacement would be > ";". > > > > > > > > > Again.. we need to create an Issue about generating IRIs :P > > The prob here is that we don't want to step on either a valid fragment > identifier (for e.g. turtle) or xml local name (for RDF/XML). I've left > the "_" until we have a new idea. > > Note that url-encoding the column names and lexical values protects us > from seing the "_"s in e.g. f_name Bob_Smith. > > > > > http://foo.example/DB/Department#Manager -- why is Manager uppercase? > > > > > > > typo > > ditto > > > > > > > I object to the representation of simple string literals as > > > "Cambridge"^^xsd:string. This should simply be "Cambridge". They are > > > equivalent under datatype semantics, so the simple form should be used. > > > > > > > We should create an issue on this: "Should a literal include xsd?" Should > be > > discussed in group and come to a consensus. > > http://www.w3.org/2001/sw/rdb2rdf/directGraph/#literaltriples > now says > [[ > Per XML Datatypes for SQL Datatypes, string datatypes are expressed as > an RDF plain literal. > ]] > > > > > Again, please drop the concept of a stem URI and explain that the > mapping > > > uses relative URIs which are resolved against an environment-provided > base > > > URI. Instead of this: > > > > > > <http://foo.example/DB/Addresses/ID.18#_> < > > > http://foo.example/DB/Addresses#ID> 18^^xsd:integer . > > > <http://foo.example/DB/Addresses/ID.18#_> < > > > http://foo.example/DB/Addresses#city> "Cambridge"^^xsd:string . > > > <http://foo.example/DB/Addresses/ID.18#_> < > > > http://foo.example/DB/Addresses#state> "MA"^^xsd:string . > > > > > > I'd like to see this: > > > > > > <Addresses/ID=18> <Addresses#ID> 18 . > > > <Addresses/ID=18> <Addresses#city> "Cambridge" . > > > <Addresses/ID=18> <Addresses#state> "MA" . > > Hmm, it's possible that this might all be do-able with relative URIs. > But we'd better think about this carefully. > For now, I've used turtle's @base attribute. > > > > Do you mean that we should define a prefix: > > > > @prefix base: <http://foo.example/DB/> . > > > > and then everywhere have > > > > <base:Addresses/ID=18> <base:Addresses#ID> 18 . > > <base:Addresses/ID=18> <base:Addresses#city> "Cambridge" . > > <base:Addresses/ID=18> <base:Addresses#state> "MA" . > > > > > > > > > > > > > If you do it right, RDF can be simple ;-) > > > > > > > > :) > > > > > > > > > > Again, great work, and I'm very happy to see this spec moving forward > and > > > like the direction it is taking. > > > > > > > Thanks for you very insightful and direct comments. > > > > Marcelo and I will be working on this in the next couple of days and let > > everybody know when we have an update. Please keep the comments > coming!!!!! > > > > > > > Richard > > > > > > > > > > > > > > >> All the best, > > >> > > >> Marcelo > > >> > > >> > > > > > > > -- > -ericP >
Received on Thursday, 4 November 2010 13:06:59 UTC