- From: Eric Prud'hommeaux <eric@w3.org>
- Date: Tue, 2 Aug 2011 16:19:33 +0200
- To: Juan Sequeda <juanfederico@gmail.com>
- Cc: Michael Hausenblas <michael.hausenblas@deri.org>, rdb2RDF WG <public-rdb2rdf-wg@w3.org>
* Juan Sequeda <juanfederico@gmail.com> [2011-08-02 08:05-0500]
> Eric, all
>
> This is my proposal. Just a few changes, and added subsections.
>
> PROPOSAL: that the English definition of the direct mapping be defined as:
>
> [[
>
> Section 3: The Direct Mapping
>
> The Direct Mapping is a formula for creating an RDF graph from the rows of
> all the tables in a database.
>
> A base IRI defines a web space for the labels in this graph; all labels are
> generated by appending to the base.
>
> The functions scalar and reference extract the scalar and reference
> attributes (those participating in a foreign key) respectively:
>
> dfn scalars: the attributes in a table which are NOT in any foreign key.
>
> dfn references: the attributes in a table's foreign keys.
>
> dfn non-unary references: the references for which the table's foreign key
> is NOT composed of a single attribute.
>
> Section 3.1: Generating Row Identifiers
>
> SQL table and attribute identifiers compose RDF IRIs in the direct graph.
> These identifiers are separated by the punctuation characters '#', ',', '/'
> and '='. All SQL identifiers are escaped following URL-encoding
> <http://www.w3.org/TR/html5/association-of-controls-and-forms.html#url-encoded-form-data>
> except that only the above punctuation and the characters not permitted in
> RDF IRIs are escaped.
>
> In the direct graph, there is an identifier for each row in a database
> table. If the row is in a table with a primary key, this is formed from the
> table name and the attribute names and values of each attribute in the
> primary key. If there is no primary key for the table, the row identifier is
> a fresh blank node:
>
> dfn row identifier:
>
>   if the table has a primary key with attributes, the relative IRI for
>   the row identifier is the concatenation of the table name, '/', and
>   a ','-separated concatenation of each attribute name, '=', and the
>   attribute value.
>
>   if the table has no primary key, the row identifier is a fresh blank
>   node.
>
> A (potentially unary) list of attribute names in a table form a
> property IRI:
>
> dfn property IRI: the concationation of the table name, '/', and a
>   ','-separated concatonation of each attribute name, and a '#' at
>   the end of the property IRI.
>
> Section 3.2: Mapping database values to RDF Literals
>
> The values in a row are mapped to RDF literals:
>
> dfn literal map: a mapping from a SQL value with a datatype to an RDF
>   literal with and XML Schema datatype where the RDF literal has a
>   lexical value equivalent to the SQL lexical value and the datatype
>   mapping is found in this table:
>
>   SQL         XSD datatype
>   ___         ____________
>   INT         http://www.w3.org/TR/xmlschema-2/#integer
>   FLOAT       http://www.w3.org/TR/xmlschema-2/#float
>   DATE        http://www.w3.org/TR/xmlschema-2/#date
>   TIME        http://www.w3.org/TR/xmlschema-2/#time
>   TIMESTAMP   http://www.w3.org/TR/xmlschema-2/#dateTime
>   CHAR        plain literal
>   VARCHAR     plain literal
>   STRING      plain literal
>
> Section 3.3: Generating RDF Triples
>
> The Direct Mapping is defined by a set of mapping functions from table
> rows to RDF triples:
>
> dfn direct mapping: the set of RDF triples produced by invoking the
> <table mapping> on each table in a database.
>
> dfn table mapping: the set of RDF triples created by invoking the
> <row mapping> on each row in a table.
>
> dfn row mapping: using a Row Identifier S for each row,
>   the type triple:
>     (S, rdf:type, <table type>)
>   plus the scalar triples:
>     for each attribute in the list of <scalars> where the attribute value is
>     non-NULL:
>       (S, the <property IRI> for the attribute, the <literal map> for the
>       attribute value).
>   plus the reference triples:
>     for each list of attributes in the <non-unary references> where none of
>     the attribute values are NULL:
>       (S, the <property IRI> for the attributes, the <row identifier> for the
>       referenced triple)
>
> ]]

Thank you for the careful review and for correcting typos. Ignoring
whitespace, I see:

  • added numbered section headings:
      I propose that we first agree on the definition and do markup
      separately.

  • my precious typos were corrected.
      I can live without them.

  • re-ordered dfn references: and dfn scalars:
      sure.

  • DM is for "all the tables in a database"
      I debated this; I didn't want to alarm folks who would think
      they'd have to expose everything if they didn't want to. The
      alternative is to parametrize; neither is terribly attractive.
      I guess "all tables" is fine.

  • s/an SQL/a SQL/
      This depends on whether you call it "S Q L" or "sequel". The SQL
      spec uses "an", e.g. "Effects of SQL-statements in an
      SQL-transaction".

  • row mapping defined to be over each row.
      The calling function <table mapping> already "invokes the
      <row mapping> on each row in a table" so the row mapping should
      just be for a single row.

  • capitalize "Row Identifier" in dfn row mapping.
      I suspect this wasn't an intended change proposal.

Below are the diffs and an incorporated proposal.

== white-space-normalized diffs ==

@@ -1,18 +1,23 @@
+Section 3: The Direct Mapping
+
 The Direct Mapping is a formula for creating an RDF graph from the
-rows in a table. A base IRI defines a web space for the labels in
+rows of all the tables in a database. A base IRI defines a web space for the labels in
 this graph; all labels are generated by appending to the base.

 The functions scalar and reference extract the scalar and reference
 attributes (those participating in a foreign key) respectively:

-dfn references: the attributes in a table's foreign keys.
-
 dfn scalars: the attributes in a table which are NOT in any foreign
 key.
+dfn references: the attributes in a table's foreign keys.
+
 dfn non-unary references: the references for which the table's
 foreign key is NOT composed of a single attribute.
+
+Section 3.1: Generating Row Identifiers
+
 SQL table and attribute identifiers compose RDF IRIs in the direct
 graph. These identifiers are separated by the punctuation characters
 '#', ',', '/' and '='. All SQL identifiers are escaped following URL-
@@ -30,8 +35,8 @@
 dfn row identifier:

   if the table has a primary key with attributes, the relative IRI for
-  the row identifier is the concationation of the table name, '/', and
-  a ','-separated concatonation of each attribute name, '=', and the
+  the row identifier is the concatenation of the table name, '/', and
+  a ','-separated concatenation of each attribute name, '=', and the
   attribute value.

   if the table has no primary key, the row identifier is a fresh blank
@@ -44,9 +49,11 @@
   ','-separated concatonation of each attribute name, and a '#' at
   the end of the property IRI.

+Section 3.2: Mapping database values to RDF Literals
+
 The values in a row are mapped to RDF literals:

-dfn litaral map: a mapping from an SQL value with a datatype to an RDF
+dfn literal map: a mapping from a SQL value with a datatype to an RDF
   literal with and XML Schema datatype where the RDF literal has a
   lexical value equivalent to the SQL lexical value and the datatype
   mapping is found in this table:
@@ -62,16 +69,18 @@
   VARCHAR     plain literal
   STRING      plain literal

-The Direct Maping is defined by a set of mapping functions from table
+Section 3.3: Generating RDF Triples
+
+The Direct Mapping is defined by a set of mapping functions from table
 rows to RDF triples:

-dfn direct mapping: the set of triples produced by invoking the
+dfn direct mapping: the set of RDF triples produced by invoking the
 <table mapping> on each table in a database.

 dfn table mapping: the set of RDF triples created by invoking the
 <row mapping> on each row in a table.

-dfn row mapping: using a row identifier S for the row,
+dfn row mapping: using a Row Identifier S for each row,
   the type triple:
     (S, rdf:type, <table type>)
   plus the scalar triples:

== incorporated proposal ==

PROPOSAL: that the English definition of the direct mapping be defined as:
[[
The Direct Mapping is a formula for creating an RDF graph from the
rows of each table in a database. A base IRI defines a web space for
the labels in this graph; all labels are generated by appending to the
base.

The functions scalar and reference extract the scalar and reference
attributes (those participating in a foreign key) respectively:

dfn scalars: the attributes in a table which are NOT in any foreign
key.

dfn references: the attributes in a table's foreign keys.

dfn non-unary references: the references for which the table's
foreign key is NOT composed of a single attribute.

SQL table and attribute identifiers compose RDF IRIs in the direct
graph. These identifiers are separated by the punctuation characters
'#', ',', '/' and '='. All SQL identifiers are escaped following
URL-encoding
<http://www.w3.org/TR/html5/association-of-controls-and-forms.html#url-encoded-form-data>
except that only the above punctuation and the characters not
permitted in RDF IRIs are escaped.

In the direct graph, there is an identifier for each row in a database
table. If the row is in a table with a primary key, this is formed
from the table name and the attribute names and values of each attribute
in the primary key. If there is no primary key for the table, the row
identifier is a fresh blank node:

dfn row identifier:

  if the table has a primary key with attributes, the relative IRI for
  the row identifier is the concatenation of the table name, '/', and
  a ','-separated concatenation of each attribute name, '=', and the
  attribute value.

  if the table has no primary key, the row identifier is a fresh blank
  node.

A (potentially unary) list of attribute names in a table forms a
property IRI:

dfn property IRI: the concatenation of the table name, '/', and a
  ','-separated concatenation of each attribute name, and a '#' at
  the end of the property IRI.

The values in a row are mapped to RDF literals:

dfn literal map: a mapping from an SQL value with a datatype to an RDF
  literal with an XML Schema datatype where the RDF literal has a
  lexical value equivalent to the SQL lexical value and the datatype
  mapping is found in this table:

  SQL         XSD datatype
  ___         ____________
  INT         http://www.w3.org/TR/xmlschema-2/#integer
  FLOAT       http://www.w3.org/TR/xmlschema-2/#float
  DATE        http://www.w3.org/TR/xmlschema-2/#date
  TIME        http://www.w3.org/TR/xmlschema-2/#time
  TIMESTAMP   http://www.w3.org/TR/xmlschema-2/#dateTime
  CHAR        plain literal
  VARCHAR     plain literal
  STRING      plain literal

The Direct Mapping is defined by a set of mapping functions from table
rows to RDF triples:

dfn direct mapping: the set of RDF triples produced by invoking the
<table mapping> on each table in a database.

dfn table mapping: the set of RDF triples created by invoking the
<row mapping> on each row in a table.

dfn row mapping: using a row identifier S for the row,
  the type triple:
    (S, rdf:type, <table type>)
  plus the scalar triples:
    for each attribute in the list of <scalars> where the attribute
    value is non-NULL:
      (S,
       the <property IRI> for the attribute,
       the <literal map> for the attribute value).
  plus the reference triples:
    for each list of attributes in the <non-unary references> where none
    of the attribute values are NULL:
      (S,
       the <property IRI> for the attributes,
       the <row identifier> for the referenced triple)
]]

> Juan Sequeda
> +1-575-SEQ-UEDA
> www.juansequeda.com
>
>
> On Tue, Aug 2, 2011 at 6:44 AM, Juan Sequeda <juanfederico@gmail.com> wrote:
>
> > Eric,
> >
> > This is great. I was planning to write up a proposal myself, but you saved
> > my time. I do have some comments and suggestions.
> > I'm writing up a new proposal based on what you have. I should have it
> > done before the meeting
> >
> > Juan Sequeda
> > www.juansequeda.com
> >
> > On Aug 2, 2011, at 2:01 AM, Michael Hausenblas <michael.hausenblas@deri.org> wrote:
> >
> > >
> > > Eric,
> > >
> > >> PROPOSAL: that the English definition of the direct mapping be defined as:
> > >> [[
> > >> The Direct Mapping is a formula for creating an RDF graph from the
> > >> rows in a table. A base IRI defines a web space for the labels in
> > >
> > > ...
> > >
> > > Thanks a lot for this proposal, Eric! I'm wondering if we're ready to
> > > resolve this today or if the WG feels that we need to discuss a bit more.
> > > In any case I'm flexible to change today's agenda [1] if the WG thinks it
> > > makes sense ...
> > >
> > > Cheers,
> > > Michael
> > >
> > > [1] http://lists.w3.org/Archives/Public/public-rdb2rdf-wg/2011Jul/0183.html
> > > --
> > > Dr. Michael Hausenblas, Research Fellow
> > > LiDRC - Linked Data Research Centre
> > > DERI - Digital Enterprise Research Institute
> > > NUIG - National University of Ireland, Galway
> > > Ireland, Europe
> > > Tel. +353 91 495730
> > > http://linkeddata.deri.ie/
> > > http://sw-app.org/about.html
> > >
> > > On 2 Aug 2011, at 00:22, Eric Prud'hommeaux wrote:
> > >
> > >> * Richard Cyganiak <richard@cyganiak.de> [2011-07-26 19:41+0100]
> > >>> Hi all,
> > >>>
> > >>> The Direct Mapping document is stuck because we have a stalemate
> > >>> between the editors. With Last Call approaching, we need *some* way of
> > >>> breaking the stalemate. So here's a proposal. This is a possible new
> > >>> outline for the document, along with assignments of separate sections
> > >>> to separate editors.
> > >>>
> > >>>
> > >>> 1. Introduction
> > >>>    - What is this?
> > >>>    - How does it relate to R2RML
> > >>>    - Target audience, assumed level of knowledge
> > >>>    - RDF terms and SQL/relational terms are used as defined in
> > >>>      documents XXX and YYY
> > >>>
> > >>> 2. Example (Informative)
> > >>>    - A simple two-table example
> > >>>    - Quick explanation of foreign key handling
> > >>>    - Quick explanation of tables w/o PKs
> > >>>
> > >>> 3. The Direct Mapping [in Plain English]
> > >>>    - “The Direct Graph of a database is the union of the Table Graphs
> > >>>      of all tables in the database.”
> > >>>    - “The Table Graph of a table is the union of the Row Graphs...”
> > >>>    - “The Row Graph of a row is ...”
> > >>>    - ...
> > >>
> > >> PROPOSAL: that the English definition of the direct mapping be defined as:
> > >> [[
> > >> The Direct Mapping is a formula for creating an RDF graph from the
> > >> rows in a table. A base IRI defines a web space for the labels in
> > >> this graph; all labels are generated by appending to the base.
> > >>
> > >> The functions scalar and reference extract the scalar and reference
> > >> attributes (those participating in a foreign key) respectively:
> > >>
> > >> dfn references: the attributes in a table's foreign keys.
> > >>
> > >> dfn scalars: the attributes in a table which are NOT in any foreign
> > >> key.
> > >>
> > >> dfn: non-unary references: the references for which the table's
> > >> foreign key is NOT composed of a single attribute.
> > >>
> > >> SQL table and attribute identifiers compose RDF IRIs in the direct
> > >> graph. These identifiers are separated by the punctuation characters
> > >> '#', ',', '/' and '='. All SQL identifiers are escaped following URL-
> > >> encoding
> > >> <http://www.w3.org/TR/html5/association-of-controls-and-forms.html#url-encoded-form-data>
> > >> except that only the above punctuation and the characters not
> > >> permitted in RDF IRIs are escaped.
> > >>
> > >> In the direct graph, there is an identifier for each row in a database
> > >> table. If the row is in a table with a primary key, this is formed
> > >> from the table name and the attribute names and values of each attribute
> > >> in the primary key. If there is no primary key for the table, the row
> > >> identifier is a fresh blank node:
> > >>
> > >> dfn row identifier:
> > >>
> > >>   if the table has a primary key with attributes, the relative IRI for
> > >>   the row identifier is the concationation of the table name, '/', and
> > >>   a ','-separated concatonation of each attribute name, '=', and the
> > >>   attribute value.
> > >>
> > >>   if the table has no primary key, the row identifier is a fresh blank
> > >>   node.
> > >>
> > >> A (potentially unary) list of attribute names in a table form a
> > >> property IRI:
> > >>
> > >> dfn property IRI: the concationation of the table name, '/', and a
> > >>   ','-separated concatonation of each attribute name, and a '#' at
> > >>   the end of the property IRI.
> > >>
> > >> The values in a row are mapped to RDF literals:
> > >>
> > >> dfn litaral map: a mapping from an SQL value with a datatype to an RDF
> > >>   literal with and XML Schema datatype where the RDF literal has a
> > >>   lexical value equivalent to the SQL lexical value and the datatype
> > >>   mapping is found in this table:
> > >>
> > >>   SQL         XSD datatype
> > >>   ___         ____________
> > >>   INT         http://www.w3.org/TR/xmlschema-2/#integer
> > >>   FLOAT       http://www.w3.org/TR/xmlschema-2/#float
> > >>   DATE        http://www.w3.org/TR/xmlschema-2/#date
> > >>   TIME        http://www.w3.org/TR/xmlschema-2/#time
> > >>   TIMESTAMP   http://www.w3.org/TR/xmlschema-2/#dateTime
> > >>   CHAR        plain literal
> > >>   VARCHAR     plain literal
> > >>   STRING      plain literal
> > >>
> > >> The Direct Maping is defined by a set of mapping functions from table
> > >> rows to RDF triples:
> > >>
> > >> dfn direct mapping: the set of triples produced by invoking the
> > >> <table mapping> on each table in a database.
> > >>
> > >> dfn table mapping: the set of RDF triples created by invoking the
> > >> <row mapping> on each row in a table.
> > >>
> > >> dfn row mapping: using a row identifier S for the row,
> > >>   the type triple:
> > >>     (S, rdf:type, <table type>)
> > >>   plus the scalar triples:
> > >>     for each attribute in the list of <scalars> where the attribute
> > >>     value is non-NULL:
> > >>       (S,
> > >>        the <property IRI> for the attribute,
> > >>        the <literal map> for the attribute value).
> > >>   plus the reference triples:
> > >>     for each list of attributes in the <non-unary references> where none
> > >>     of the attribute values are NULL:
> > >>       (S,
> > >>        the <property IRI> for the attributes,
> > >>        the <row identifier> for the referenced triple)
> > >> ]]
> > >>
> > >>> A. Appendix: Formalisms (Informative)
> > >>>    - should be crisp, short, precise, with only minimum explanation
> > >>>      and examples
> > >>>    A.1 Datalog Rules
> > >>>    A.2 Denotational Semantics
> > >>>    A.3 Set-Style Direct Mapping
> > >>>
> > >>> B. Acknowledgements (Informative)
> > >>>
> > >>> C. References
> > >>>
> > >>>
> > >>> I see Juan and Marcelo editing A.1.
> > >>>
> > >>> I see Alexandre editing A.2.
> > >>>
> > >>> I see Eric editing 2 (which he already wrote), 3 (which *mostly*
> > >>> exists), and A.3.
> > >>>
> > >>> I don't know about 1, B, and C.
> > >>>
> > >>> My reasoning is that there is no objective way of picking any of the
> > >>> formalisms over another formalism, so the normative expression should
> > >>> be the lowest common denominator: plain English. By making the
> > >>> formalisms all informative, we free them from the burden of having to
> > >>> explain the direct mapping itself in a generally accessible way. The
> > >>> focus can be totally on presenting the formalisms in all their
> > >>> terseness to an audience that is familiar with datalog/denotational
> > >>> semantics/whatever.
> > >>>
> > >>> I hope this proposal aids discussion.
> > >>>
> > >>> Best,
> > >>> Richard
> > >>
> > >> --
> > >> -ericP
> > >>
> > >
> >
>

--
-ericP
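
For readers who would like the row identifier, property IRI and literal
map definitions from the proposal in executable form, here is a minimal
Python sketch. It is not part of the proposal. The function names, the
"_:bN" blank node labels and the example table names (People,
Addresses) are hypothetical, and escape() only approximates the
escaping rule: urllib's quote() with safe="" does percent-encode '#',
',', '/' and '=', but it also encodes more characters than the proposal
asks for. str() is likewise only a stand-in for "a lexical value
equivalent to the SQL lexical value".

```python
from itertools import count
from urllib.parse import quote

# Datatype column copied verbatim from the table in the proposal.
# None stands for "plain literal".
LITERAL_MAP = {
    "INT": "http://www.w3.org/TR/xmlschema-2/#integer",
    "FLOAT": "http://www.w3.org/TR/xmlschema-2/#float",
    "DATE": "http://www.w3.org/TR/xmlschema-2/#date",
    "TIME": "http://www.w3.org/TR/xmlschema-2/#time",
    "TIMESTAMP": "http://www.w3.org/TR/xmlschema-2/#dateTime",
    "CHAR": None,
    "VARCHAR": None,
    "STRING": None,
}

# Counter used to mint fresh blank node labels (hypothetical "_:bN" form).
_blank_ids = count()


def escape(identifier):
    """Percent-encode an SQL identifier or value.

    Assumption: this over-escapes relative to the proposal, which only
    requires escaping the separators and characters not allowed in IRIs.
    """
    return quote(str(identifier), safe="")


def row_identifier(table, primary_key, row):
    """Relative IRI for a row, or a fresh blank node if there is no primary key."""
    if not primary_key:
        return f"_:b{next(_blank_ids)}"
    pairs = ",".join(f"{escape(a)}={escape(row[a])}" for a in primary_key)
    return f"{escape(table)}/{pairs}"


def property_iri(table, attributes):
    """Relative IRI for a (potentially unary) list of attribute names."""
    return f"{escape(table)}/{','.join(escape(a) for a in attributes)}#"


def literal_map(value, sql_type):
    """Return (lexical form, datatype); datatype is None for a plain literal."""
    return (str(value), LITERAL_MAP[sql_type])
```

With these definitions, row_identifier("People", ["ID"], {"ID": 7})
yields "People/ID=7" and property_iri("Addresses", ["city", "state"])
yields "Addresses/city,state#"; both are relative to the base IRI.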
Received on Tuesday, 2 August 2011 14:19:53 UTC