- From: Marcelo Arenas <marcelo.arenas1@gmail.com>
- Date: Wed, 3 Nov 2010 09:53:01 -0300
- To: Richard Cyganiak <richard@cyganiak.de>
- Cc: Juan Sequeda <juanfederico@gmail.com>, "Eric Prud'hommeaux" <eric@w3.org>, RDB2RDF WG <public-rdb2rdf-wg@w3.org>
Hi Richard,

Thanks for the comments! Some more comments inline.

On Tue, Nov 2, 2010 at 5:35 PM, Richard Cyganiak <richard@cyganiak.de> wrote:
> Hi Juan,
>
> Thanks for the reply! Some comments inline.
>
> On 2 Nov 2010, at 19:19, Juan Sequeda wrote:
>>>
>>> The approach in Section 2 defines URIs for columns and rows, but not for
>>> tables. This means one has to use hacks to do a SPARQL query for all
>>> records in a given table. The approach needs to define URIs for tables as
>>> well, and associate each row with the table it is from.
>>
>> If we understand correctly, we would need to create IRIs for Tables. Hence,
>> there would now be three types of IRIs: Tuple, Columns and Tables.
>
> Yes.
>
>> However, if we are to create Table IRIs, then we also need to create a new
>> type of triples: Table Triples:
>>
>> <TupleIRI, rdf:type, Table IRI>
>>
>> Do we agree?
>
> I discussed this a bit with Eric at some point, and he had some reservations
> about using rdf:type here because it could have undesirable implications. I
> don't really have a strong opinion on the choice of property. It could be
> rdf:type or some other property especially defined for this task
> (xxx:table?). The important thing for me: There should be a triple that
> relates a row to its table, to make queries for all rows of a table easier.
> And all the important components of a schema should have URIs, and tables
> are certainly important, so they deserve a URI of their own.

I agree with this.

>> Column IRI
>>
>> baseURI/table/column i.e. baseURI/person/name
>
> Here, the current approach (baseURI/person#name) sort of makes sense to me,
> because it slightly simplifies HTTP deployment.
>
>> Multicolumn IRI
>>
>> baseURI/table/column1#column2#... i.e. baseURI/person/fname#lname
>
> Hashes have a very special meaning in URI syntax, so I wouldn't use them as
> generic separators. Having multiple hashes in a URI is almost certainly a
> bad idea.
>
>> Tuple IRI
>>
>> baseURI/table/column1:value i.e. baseURI/person/id:12
>
> The colon character is also quite special in URI syntax, and is generally
> only used after the protocol part of the URI (http:...).
>
>> Multicolumn Tuple IRI
>>
>> baseURI/table/column1:value1#column2:value2#... i.e.
>> baseURI/person/fname:Juan#lname:Sequeda
>>
>> This is our proposal. However, we are not aware of the best practices for
>> IRIs. I propose that we open an Issue on "how to generate Table, Tuple and
>> Column IRIs"
>
> +1, we probably need to create an issue and first think about the conditions
> that the solution has to satisfy. We need to take URI syntax (RFC 3986), URI
> design best practices, and the requirements of linked data deployment into
> account.
>
> I haven't thought deeply about this, but spontaneously I would like to see
> “=” for connecting column names to values, and “;” or “,” to enumerate
> multiple items.

For the time being, we can use "=" and "," in the document. Then we will
have to decide what a good notation is.

>>> 2.2 is largely redundant as it only summarizes information that follows in
>>> more detail later. Thus the focus should be on giving a quick intro to the
>>> general idea, using simple language. The example is repeated twice for no
>>> reason.
>>
>> We think it is important to state the different types of triples in the
>> beginning. What is important is that somebody can initially figure out what
>> the outcome is before diving into the whole document.
>
> I'm ok with stating the different kinds of triples in 2.2.
>
>>> The verbose textual rendering of the schema is unnecessary and should be
>>> removed. It says nothing that cannot be seen from the visual
>>> representation. Rather use that space for writing the table definition in
>>> SQL. Same for other places in the document where table schemas are spelled
>>> out verbally.
>>
>> This is fine by me. But Marcelo would like to keep the verbose text. What do
>> others think? But we should definitely have the SQL DDL.
>
> I would like to hear Marcelo's reasoning. If you have SQL DDL and a visual
> rendering, then what does the text add?

This is a matter of taste. I personally dislike examples without a text
explanation (I tend to think that they were not carefully written). But I
have to recognize that the example in the document is quite simple, so we
could have a shorter text about the example (not mentioning, for example,
the columns of the tables). Would that be OK with you?

>>> I do not find the visual notation for unique keys and foreign keys
>>> particularly clear. How about simply listing them underneath the table?
>>> “Foreign key: addr -> Addresses.ID”
>>
>> Could we consider taking away the visual notation for keys, and just have
>> the table with data. We would also put in the SQL DDL and I'm wondering if
>> this would be enough?
>
> I think that would work for me, although I'd still have a slight preference
> for *somehow* having the FKs and UKs present in the visual rendering. Please
> keep the special color for the PK column(s), it is helpful.
>
>>> You write foreign keys as if they reference another *key*. I believe that
>>> doesn't reflect SQL. Foreign keys reference other *columns*. That's the
>>> mental model that a reader is going to have in their head, and that's how
>>> it should be presented in the spec.
>>
>> I agree. To make sure, what we currently have for example Address.PK, and
>> we know that the PK of Address is ID, it should then be Address.ID (or
>> something like that). Is that what you mean?
>
> Exactly!
>
>>> The content of 2.3.1 actually doesn't really match its title. The title
>>> talks about “information in PKs”. What follows is not only about
>>> information in PK columns.
>>
>> We will change the title. How about "Generating Triples from Primary Keys".
>> Consequently, 2.3.2 could be "Generating Triples from Foreign Keys"
>
> Well but 2.3.1 is not just about generating stuff from PKs! It also deals
> with all the columns that are not involved in any key. That's my complaint
> -- from the title you wouldn't be able to guess that this is the section
> that handles the translation of normal columns to literals.

Now I understand your point. What about the title "The first step of the
translation process: Generating literal triples"?

All the best,

Marcelo
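For illustration only, the interim notation mentioned above ("=" between a column name and its value, "," between multiple items, and the existing baseURI/person#name form for columns) could be sketched roughly as follows. This is a minimal sketch under stated assumptions, not a settled design: the base URI, the function names, the "/" between table and key, and the percent-encoding of names and values are all assumptions made here, and the property linking a row to its table is still undecided (rdf:type or something like xxx:table).

```python
from urllib.parse import quote

# Hypothetical base URI, used only for this illustration.
BASE = "http://example.com/base/"


def table_iri(base, table):
    # Table IRI: baseURI/table
    return base + quote(table, safe="")


def column_iri(base, table, column):
    # Column IRI in the current form: baseURI/table#column
    return table_iri(base, table) + "#" + quote(column, safe="")


def tuple_iri(base, table, key):
    # Tuple IRI: baseURI/table/col1=val1,col2=val2
    # Names and values are percent-encoded (an assumption of this sketch)
    # so that "=" and "," inside the data cannot be read as separators.
    pairs = ",".join(
        f"{quote(col, safe='')}={quote(str(val), safe='')}"
        for col, val in key.items()
    )
    return table_iri(base, table) + "/" + pairs


def table_triple(base, table, key, prop="rdf:type"):
    # Triple relating a row to its table; `prop` is a placeholder because
    # the choice of property has not been made.
    return (tuple_iri(base, table, key), prop, table_iri(base, table))


print(tuple_iri(BASE, "person", {"fname": "Juan", "lname": "Sequeda"}))
print(table_triple(BASE, "person", {"id": 12}))
```

With a triple like the last one in place, asking for all rows of a table becomes a single triple pattern on the table IRI, whichever property is finally chosen.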
Received on Wednesday, 3 November 2010 12:53:35 UTC