- From: Richard Cyganiak <richard@cyganiak.de>
- Date: Tue, 2 Nov 2010 20:35:54 +0000
- To: Juan Sequeda <juanfederico@gmail.com>
- Cc: Marcelo Arenas <marcelo.arenas1@gmail.com>, "Eric Prud'hommeaux" <eric@w3.org>, RDB2RDF WG <public-rdb2rdf-wg@w3.org>
Hi Juan,

Thanks for the reply! Some comments inline.

On 2 Nov 2010, at 19:19, Juan Sequeda wrote:

>> The approach in Section 2 defines URIs for columns and rows, but not for
>> tables. This means one has to use hacks to do a SPARQL query for all records
>> in a given table. The approach needs to define URIs for tables as well, and
>> associate each row with the table it is from.
>
> If we understand correctly, we would need to create IRIs for Tables. Hence,
> there would be now three types of IRIs: Tuple, Columns and Tables.

Yes.

> However, if we are to create Table IRIs, then we also need to create a new
> type of triples: Table Triples:
>
> <TupleIRI, rdf:type, Table IRI>
>
> Do we agree?

I discussed this a bit with Eric at some point, and he had some
reservations about using rdf:type here because it could have undesirable
implications. I don't really have a strong opinion on the choice of
property. It could be rdf:type or some other property especially defined
for this task (xxx:table?).

The important thing for me: There should be a triple that relates a row
to its table, to make queries for all rows of a table easier. And all the
important components of a schema should have URIs, and tables are
certainly important, so they deserve a URI of their own.

> Column IRI
>
> baseURI/table/column i.e baseURI/person/name

Here, the current approach (baseURI/person#name) sort of makes sense to
me, because it slightly simplifies HTTP deployment.

> Multicolumn IRI
>
> baseURI/table/column1#column2#... i.e. baseURI/person/fname#lname

Hashes have a very special meaning in URI syntax, so I wouldn't use them
as generic separators. Having multiple hashes in a URI is almost
certainly a bad idea.

> Tuple IRI
>
> baseURI/table/column1:value i.e baseURI/person/id:12

The colon character is also quite special in URI syntax, and is generally
only used after the protocol part of the URI (http:...).

> Multicolumn Tuple IRI
>
> baseURI/table/column1:value1#column2:value2#...
> i.e baseURI/person/fname:Juan#lname:Sequeda
>
> This is our proposal. However, we are not aware of the best practices for
> IRIs. I propose that we open an Issue on "how to generate Table, Tuple and
> Column IRIs"

+1, we probably need to create an issue and first think about the
conditions that the solution has to satisfy. We need to take URI syntax
(RFC 3986), URI design best practices, and the requirements of linked
data deployment into account.

I haven't thought deeply about this, but spontaneously I would like to
see “=” for connecting column names to values, and “;” or “,” to
enumerate multiple items.

>> 2.2 is largely redundant as it only summarizes information that follows in
>> more detail later. Thus the focus should be on giving a quick intro to the
>> general idea, using simple language. The example is repeated twice for no
>> reason.
>
> We think it is important to state the different types of triples in the
> beginning. What is important is that somebody can initially figure out what
> the outcome is before diving into the whole document.

I'm ok with stating the different kinds of triples in 2.2.

>> The verbose textual rendering of the schema is unnecessary and should be
>> removed. It says nothing that cannot be seen from the visual representation.
>> Rather use that space for writing the table definition in SQL. Same for
>> other places in the document where table schemas are spelled out verbally.
>
> This is fine by me. But Marcelo would like to keep the verbose text. What do
> others think? But we should definitely have the SQL DDL

I would like to hear Marcelo's reasoning. If you have SQL DDL and a
visual rendering, then what does the text add?

>> I do not find the visual notation for unique keys and foreign keys
>> particularly clear. How about simply listing them underneath the table?
>> “Foreign key: addr -> Addresses.ID”
>
> Could we consider taking away the visual notation for keys, and just have
> the table with data. We would also put in the SQL DDL and I'm wondering if
> this would be enough?

I think that would work for me, although I'd still have a slight
preference for *somehow* having the FKs and UKs present in the visual
rendering. Please keep the special color for the PK column(s), it is
helpful.

>> You write foreign keys as if they reference another *key*. I believe that
>> doesn't reflect SQL. Foreign keys reference other *columns*. That's the
>> mental model that a reader is going to have in their head, and that's how it
>> should be presented in the spec.
>
> I agree. To make sure, what we currently have for example Address.PK, and we
> know that the PK of Address is ID, it should then be Address.ID (or
> something like that). Is that what you mean?

Exactly!

>> The content of 2.3.1 actually doesn't really match its title. The title
>> talks about “information in PKs”. What follows is not only about information
>> in PK columns.
>
> We will change the title. How about "Generating Triples from Primary Keys".
> Consequently, 2.3.2 could be "Generating Triples from Foreign Keys"

Well but 2.3.1 is not just about generating stuff from PKs! It also deals
with all the columns that are not involved in any key. That's my
complaint -- from the title you wouldn't be able to guess that this is
the section that handles the translation of normal columns to literals.

>> 2.3.2: The rules for referencing tables without PKs state that the object
>> is the target row's Tuple IRI. Earlier you said that such tables don't have
>> Tuple IRIs but blank nodes.
>
> When we describe a Tuple IRI, we give the case if a table doesn't have a
> primary key, then a blank node should be created. So in a way, it may be
> understood that a blank node is a Tuple IRI, which I know is incorrect.
> Can you suggest how we should go about this?

In 2.2 you could introduce the concept of a “row RDF node”, which is
either a “row IRI” (what you now call tuple IRI) or a blank node. Then
you'd just have to state that the object of a reference triple is the
“row RDF node” of the target row, and refer to section 2.2 for figuring
out what the specific node would be.

>> I object to the representation of simple string literals as
>> "Cambridge"^^xsd:string. This should simply be "Cambridge". They are
>> equivalent under datatype semantics, so the simple form should be used.
>
> We should create an issue on this: "Should a literal include xsd?" Should be
> discussed in group and come to a consensus.

+1

>> 18^^xsd:integer is not valid Turtle. This must either be "18"^^xsd:integer,
>> or simply 18, which is just Turtle syntactic sugar for the former. I would
>> highly prefer if the simple form was used throughout.
>
> Yes, our mistake. However using simply 18 instead of having xsd:integer
> should be part of a group discussion. See previous comment about creating
> Issue

It's a different case from the previous one. "Foo" vs. "Foo"^^xsd:string
is actually a difference on the RDF graph level (although in RDF
semantics they are equivalent). 18 vs. "18"^^xsd:integer are identical on
the RDF graph level, it's just syntactic sugar in Turtle.

>> I'd like to see this:
>>
>> <Addresses/ID=18> <Addresses#ID> 18 .
>> <Addresses/ID=18> <Addresses#city> "Cambridge" .
>> <Addresses/ID=18> <Addresses#state> "MA" .
>
> Do you mean that we should define a prefix:
>
> @prefix base: <http://foo.example/DB/> .
>
> and then everywhere have
>
> <base:Addresses/ID=18> <base:Addresses#ID> 18 .
> <base:Addresses/ID=18> <base:Addresses#city> "Cambridge" .
> <base:Addresses/ID=18> <base:Addresses#state> "MA" .

No -- I mean just write relative URIs instead of absolute URIs. When
there is a well-defined base URI, then this makes a lot of sense.
See also: http://www.w3.org/TeamSubmission/turtle/#uris

So you could write this:

@base <http://foo.example/DB/> .
<Addresses/ID=18> <Addresses#ID> 18 .
<Addresses/ID=18> <Addresses#city> "Cambridge" .
<Addresses/ID=18> <Addresses#state> "MA" .

That's just a shorter form for using full absolute URIs like
<http://foo.example/DB/Addresses/ID=18>.

Given that the base URI is just defined once as an input to the default
mapping, you wouldn't have to repeat it for each example, but just
explain in the beginning where you currently introduce the “stem URI”
that throughout the document, the examples will contain relative URIs,
and these are to be understood as relative to the base URI.

Keep up the good work! Looking forward to an updated version!

Richard

>> If you do it right, RDF can be simple ;-)
>
> :)
>
>> Again, great work, and I'm very happy to see this spec moving forward and
>> like the direction it is taking.
>
> Thanks for your very insightful and direct comments.
>
> Marcelo and I will be working on this in the next couple of days and let
> everybody know when we have an update. Please keep the comments coming!!!!!
>
>> Richard
>>
>>> All the best,
>>>
>>> Marcelo
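
To make the conventions floated in this thread concrete, here is a minimal Python sketch, not an agreed design. It assumes “=” between column name and value, “;” between key columns, relative IRIs resolved against the base URI, plain literals without ^^xsd:string, bare integers, a “row RDF node” that falls back to a blank node when there is no primary key, and one triple linking each row to its table. The property IRI http://example.org/ns#table, the blank node label scheme, and all function names are hypothetical placeholders; the thread leaves the choice of property (rdf:type or something else) open.

```python
from urllib.parse import quote, urljoin

BASE = "http://foo.example/DB/"  # base URI used in the thread's examples
TABLE_PROPERTY = "http://example.org/ns#table"  # hypothetical; property is undecided


def row_iri(table, key):
    """Relative row IRI such as Addresses/ID=18; key maps PK column -> value."""
    # Percent-encode names and values so "=", ";", "#" and ":" in data stay harmless.
    pairs = ";".join(
        f"{quote(str(col), safe='')}={quote(str(val), safe='')}"
        for col, val in key.items()
    )
    return f"{quote(table, safe='')}/{pairs}"


def row_node(table, key, row_number):
    """Row RDF node: a row IRI if the table has a primary key, else a blank node."""
    if key:
        return f"<{row_iri(table, key)}>"
    # Hypothetical label scheme; assumes the table name is a valid label.
    return f"_:{table}_{row_number}"


def literal(value):
    """Turtle literal: bare integers, plain strings without ^^xsd:string."""
    if isinstance(value, bool):
        return "true" if value else "false"
    if isinstance(value, int):
        return str(value)
    escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def row_triples(table, row, key_columns, row_number=1):
    """Turtle lines for one row: a row-to-table triple plus one triple per column."""
    key = {col: row[col] for col in key_columns}
    subject = row_node(table, key, row_number)
    table_iri = quote(table, safe="")
    lines = [f"{subject} <{TABLE_PROPERTY}> <{table_iri}> ."]
    for col, val in row.items():
        lines.append(f"{subject} <{table_iri}#{quote(col, safe='')}> {literal(val)} .")
    return lines


if __name__ == "__main__":
    print(f"@base <{BASE}> .")
    for line in row_triples("Addresses", {"ID": 18, "city": "Cambridge", "state": "MA"}, ["ID"]):
        print(line)
    # Resolving the relative row IRI against the base gives the absolute form.
    print(urljoin(BASE, row_iri("Addresses", {"ID": 18})))
```

With the row-to-table triple in place, "all rows of a table" becomes a single triple pattern on the table IRI instead of a string hack on row IRIs.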
Received on Tuesday, 2 November 2010 20:36:31 UTC