Re: Re: Detailed comments on new default mapping draft

* Juan Sequeda <juanfederico@gmail.com> [2010-11-04 08:06-0500]
> Eric,
> 
> http://www.w3.org/2001/sw/rdb2rdf/directGraph/ is not working

could you describe the problem? it may be from when I attempted to
merge our documents (and added you as authors). do you see anything?
do you get a 404?..

> Juan Sequeda
> +1-575-SEQ-UEDA
> www.juansequeda.com
> 
> 
> On Thu, Nov 4, 2010 at 12:44 AM, Eric Prud'hommeaux <eric@w3.org> wrote:
> 
> > Per Richard's suggestion, I've incorporated most of Richard's comments
> > into Rev 1.45 of http://www.w3.org/2001/sw/rdb2rdf/directGraph/ .
> >
> >
> > * Juan Sequeda <juanfederico@gmail.com> [2010-11-02 14:19-0500]
> > > Richard,
> > >
> > > Great comments! Thanks a lot. I'm glad you took the time to read the
> > whole
> > > thing. You even found typos, which means that you really took a
> > magnifying
> > > glass with you.
> > >
> > > Looking forward to other comments.
> > >
> > > Marcelo and I just had a long meeting and we discussed each of your
> > points.
> > > Comments are inline.
> > >
> > >
> > > On Tue, Nov 2, 2010 at 9:10 AM, Richard Cyganiak <richard@cyganiak.de
> > >wrote:
> > >
> > > > Marcelo, Eric,
> > > >
> > > > I am commenting on Section 2 of
> > > > http://www.w3.org/2001/sw/rdb2rdf/directGraph/alt
> > > > $Id: alt.xml,v 1.2 2010/10/30 03:39:01 marenas Exp $
> > > >
> > > > First of all, great work! Looks like we almost have an FPWD here.
> > > >
> > > > Detailed comments are below. A lot of it is editorial, but there are
> > some
> > > > substantial comments too, as well as some pointers to oversights. I
> > would
> > > > appreciate if each comment could be a) addressed in the text, or b)
> > > > reflected as an @@Issue in the text, or c) replied to in a response to
> > this
> > > > email, or d) turned into an Issue in the W3C tracker.
> > > >
> > >
> > > I'm commenting everything inline in this email. Some things would need to
> > be
> > > turned into Issues in the W3C tracker (how to do this?)
> > >
> > > >
> > > >
> > > >  Stem URI    this should be called  base URI , because that's a
> > commonly
> > > > understood term, and it enables the explanation of URI generation as
> > > > resolution of a relative URI against a base URI.
> > > >
> > >
> > > Ok. We will change this.
> >
> > I prefer to keep stem URI distinct from the relative URI, at least for
> > FPWD. base, per 2396 <http://tools.ietf.org/html/rfc2396#section-1.4>
> > has a specific behavoir, for example, a relative URI <People/ID.7#_>
> > resolved against a base URI of <http://foo.example/DB> yields
> > <http://foo.example/People/ID.7#_>. If we remedy that with a leading
> > slash, </People/ID.7#_>, we get something resolved against the root.
> >
> >
> > > > The document should use SQL terminology throughout. Relation, attribute
> > and
> > > > tuple should be table, column and row, etc.
> > > >
> > >
> > > Ok. We will change this.
> >
> > I think this is addressed §1-3. §4 uses a more traditional
> > terminology, but relates it to SQL terms with the following text:
> > [[
> > There are many models for databases in SQL literature; because the
> > Direct Mapping does not rely on column position, we use a model which
> > assumes a 1:1 correspondance between attribute (column name) and
> > value, i.e. a map.
> > Starting with a traditional model of a relational database we define a
> > Relation (a table) which has a name, a Header, Body and
> > primary/foreign key details.
> > The Body contains maps from attribute names to values and the Header
> > provides the datatypes to interpret those values.
> > ]]
> >
> > editorial suggestions?
> >
> >
> > > > The approach in Section 2 defines URIs for columns and rows, but not
> > for
> > > > tables. This means one has to use hacks to do a SPARQL query for all
> > records
> > > > in a given table. The approach needs to define URIs for tables as well,
> > and
> > > > associate each row with the table it is from.
> > > >
> > >
> > > If we understand correctly, we would need to create IRIs for Tables.
> > Hence,
> > > there would be now three types of IRIs: Tuple, Columns and Tables.
> > However,
> > > if we are to create Table IRIs, then we also need to create a new type of
> > > triples: Table Triples:
> > >
> > > <TupleIRI, rdf:type, Table IRI>
> > >
> > > Do we agree?
> >
> > @@ will take a bit of work @@
> >
> >
> > > > From the current description, it is impossible to work out how URIs for
> > > > rows with multi-column primary keys would look like. What order? What
> > > > separator characters?
> > > >
> > >
> > > The order is the same order of the columns in the table.  Marcelo and I
> > have
> > > the following proposal for creating IRIs:
> > >
> > > Table IRI
> > >
> > > baseURI/table i.e baseURI/person
> > >
> > > Column IRI
> > >
> > > baseURI/table/column i.e baseURI/person/name
> > >
> > > Multicolumn IRI
> > >
> > > baseURI/table/column1#column2#... i.e. baseURI/person/fname#lname
> > >
> > > Tuple IRI
> > >
> > > baseURI/table/column1:value  i.e baseURI/person/id:12
> > >
> > > Multicolumn Tuple IRI
> > >
> > > baseURI/table/column1:value1#column2:value2#... i.e
> > > baseURI/person/fname:Juan#lname:Sequeda
> > >
> > >
> > > This is our proposal. However, we are not aware of the best practices for
> > > IRIs. I propose that we open an Issue on "how to generate Table, Tuple
> > and
> > > Column IRIs"
> >
> > http://www.w3.org/2001/sw/rdb2rdf/directGraph/#rules
> >
> > now has text to specify the construction of IRIs in the predicate
> > position and the subject/object position. Editorial suggestions
> > welcome! (I'm not super-confident that there isn't a clearer way to
> > express this.)
> >
> > http://www.w3.org/2001/sw/rdb2rdf/directGraph/#multi-key reinforces
> > that with:
> > [[
> > • The predicate for this key is formed from the stem and
> > "deptName_deptCity",
> >  reflecting the order of the column names in the foreign key.
> > ]]
> >
> > > > I am uncomfortable with the use of the dot character as a separator in
> > > > generated URIs. The character typically used in URIs to indicate a
> > > > hierarchical relationship is "/". The character typically used to
> > indicate
> > > > key-value pairs is "=".
> > > >
> > >
> > > See previous comment. We should open Issue on creating IRIs and discuss
> > this
> > > in group.
> >
> > accepted s/=/./
> >
> >
> > > >
> > > > I am uncomfortable with the use of "#_" at the end of row URIs. I
> > cannot
> > > > see any precedent for that, so I cannot call it good practice. It is
> > also
> > > > unnecessary because the URI identifies a row in a database table and
> > never a
> > > > person/address/organization or whatever other real-world object. Rows
> > in
> > > > database tables are information resources and thus there is no problem
> > at
> > > > all with identifying them using a plain fragment-less URI.
> > > >
> > > > This is what Eric had originally. So he can comment on his decision.
> > Again,
> > > we should create an Issue on this.
> >
> > I was following <http://www.w3.org/DesignIssues/RDB-RDF>, but decided to
> > shorten "personnel/employees/1234#item" to "personnel/employees/1234#_".
> > The choice of hash vs. slash is a real issue. I added an issue
> > [[
> > hash-vs-slash: This edition of this document presumes slash
> > identifiers. LOD data identifiers tend to use slash, but that slightly
> > increases implementation burden and round trips.
> > ]]
> > http://www.w3.org/2001/sw/rdb2rdf/directGraph/#hash-vs-slash
> >
> >
> > > > Special characters in table names, column names and PK values need to
> > be
> > > > handled in the URI generation.
> > > >
> > >
> > > Again... create an issue on this ( I sound like a broken record)
> >
> > http://www.w3.org/2001/sw/rdb2rdf/directGraph/#rules states that table
> > names, attribute names and attribute lexical values are url-encoded
> > per http://www.w3.org/TR/wsdl#_http:urlEncoded .
> >
> > > > 2.2 says:  with an XML Schema datatype corresponding to the SQL
> > datatype of
> > > > that column . That obviously needs to be spelled out. There should be
> > an
> > > > extra section to this,  Mapping SQL Datatypes to RDF Literals  or
> > something
> > > > like that. The section can be a placeholder for FPWD, but should exist.
> > > >
> > >
> > > Yes. There needs to be a table with the corresponding mapping
> >
> > added [[
> > ...the pairs of url-encoded column name '=' url-encoded column value,
> > separated by a '_'. The column value is the lexical per SQL99 [SQL99].
> > ]]
> >
> > > > I do not find the visual notation for unique keys and foreign keys
> > > > particularly clear. How about simply listing them underneath the table?
> > > >  Foreign key: addr -> Addresses.ID
> > > >
> > >
> > > Could we consider taking away the visual notation for keys, and just have
> > > the table with data. We would also put in the SQL DDL and I'm wondering
> > if
> > > this would be enough?
> >
> > I added an bit of a key before the first example. The "empty primary
> > key" example was, I believe, the worst of the lot. After struggling a
> > bit with a notation, I gave up and copied the XSD from
> >
> > https://dvcs.w3.org/hg/FeDeRate/file/060df0861705/directmapping/src/test/scala/DirectMappingTest.scala#l105
> > into
> >  http://www.w3.org/2001/sw/rdb2rdf/directGraph/#ref-no-pk
> >
> >
> > > >
> > > > You write foreign keys as if they reference another *key*. I believe
> > that
> > > > doesn't reflect SQL. Foreign keys reference other *columns*. That's the
> > > > mental model that a reader is going to have in their head, and that's
> > how it
> > > > should be presented in the spec.
> > > >
> > >
> > > I agree. To make sure, what we currently have for example Address.PK, and
> > we
> > > know that the PK of Address is ID, it should then be Address.ID (or
> > > something like that). Is that what you mean?
> >
> > Actually, I disagree; foreign keys specifically reference candidate
> > keys in other tables. If the system does not enforce that, and the
> > data in foreign keys matches more than 1 row in the referenced table,
> > then we have have a pretty different graph to represent. My temptation
> > is to start out conservative, and if we have energy and mandate,
> > represent these cases which I believe are non-compliant.
> >
> > [[
> > The columns in the referencing table must be the primary key or other
> > candidate key in the referenced table.
> > ]] — http://en.wikipedia.org/wiki/Foreign_key ¶1
> >
> >
> > > > The use of "_" as a separator between the column/value pairs in
> > > > multi-column PK row URIs is a bad idea, because the underscore
> > character is
> > > > ubiquitous in table and column names. An obvious replacement would be
> > ";".
> > > >
> > >
> > >
> > > Again.. we need to create an Issue about generating IRIs :P
> >
> > The prob here is that we don't want to step on either a valid fragment
> > identifier (for e.g. turtle) or xml local name (for RDF/XML). I've left
> > the "_" until we have a new idea.
> >
> > Note that url-encoding the column names and lexical values protects us
> > from seing the "_"s in e.g. f_name Bob_Smith.
> >
> >
> > > > http://foo.example/DB/Department#Manager -- why is Manager uppercase?
> > > >
> > >
> > > typo
> >
> > ditto
> >
> > > >
> > > > I object to the representation of simple string literals as
> > > > "Cambridge"^^xsd:string. This should simply be "Cambridge". They are
> > > > equivalent under datatype semantics, so the simple form should be used.
> > > >
> > >
> > > We should create an issue on this: "Should a literal include xsd?" Should
> > be
> > > discussed in group and come to a consensus.
> >
> > http://www.w3.org/2001/sw/rdb2rdf/directGraph/#literaltriples
> > now says
> > [[
> > Per XML Datatypes for SQL Datatypes, string datatypes are expressed as
> > an RDF plain literal.
> > ]]
> >
> >
> > > > Again, please drop the concept of a stem URI and explain that the
> > mapping
> > > > uses relative URIs which are resolved against an environment-provided
> > base
> > > > URI. Instead of this:
> > > >
> > > > <http://foo.example/DB/Addresses/ID.18#_> <
> > > > http://foo.example/DB/Addresses#ID> 18^^xsd:integer .
> > > > <http://foo.example/DB/Addresses/ID.18#_> <
> > > > http://foo.example/DB/Addresses#city> "Cambridge"^^xsd:string .
> > > > <http://foo.example/DB/Addresses/ID.18#_> <
> > > > http://foo.example/DB/Addresses#state> "MA"^^xsd:string .
> > > >
> > > > I'd like to see this:
> > > >
> > > > <Addresses/ID=18> <Addresses#ID> 18 .
> > > > <Addresses/ID=18> <Addresses#city> "Cambridge" .
> > > > <Addresses/ID=18> <Addresses#state> "MA" .
> >
> > Hmm, it's possible that this might all be do-able with relative URIs.
> > But we'd better think about this carefully.
> > For now, I've used turtle's @base attribute.
> >
> >
> > > Do you mean that we should define a prefix:
> > >
> > > @prefix base: <http://foo.example/DB/> .
> > >
> > > and then everywhere have
> > >
> > > <base:Addresses/ID=18> <base:Addresses#ID> 18 .
> > > <base:Addresses/ID=18> <base:Addresses#city> "Cambridge" .
> > > <base:Addresses/ID=18> <base:Addresses#state> "MA" .
> > >
> > >
> > >
> > >
> > >
> > > > If you do it right, RDF can be simple ;-)
> > > >
> > > >
> > > :)
> > >
> > >
> > > >
> > > > Again, great work, and I'm very happy to see this spec moving forward
> > and
> > > > like the direction it is taking.
> > > >
> > >
> > > Thanks for you very insightful and direct comments.
> > >
> > > Marcelo and I will be working on this in the next couple of days and let
> > > everybody know when we have an update. Please keep the comments
> > coming!!!!!
> > >
> > >
> > > > Richard
> > > >
> > > >
> > > >
> > > >
> > > >> All the best,
> > > >>
> > > >> Marcelo
> > > >>
> > > >>
> > > >
> > > >
> > --
> > -ericP
> >

-- 
-ericP

Received on Thursday, 4 November 2010 13:20:42 UTC