Re: Proposal for the Direct Mapping

Eric,

> PROPOSAL: that the English definition of the direct mapping be  
> defined as:
> [[
> The Direct Mapping is a formula for creating an RDF graph from the
> rows in a table. A base IRI defines a web space for the labels in

...

Thanks a lot for this proposal, Eric! I'm wondering if we're ready to  
resolve this today or if the WG feels that we need to discuss a bit  
more. In any case, I'm happy to change today's agenda [1] if the WG  
thinks it makes sense ...

Cheers,
	Michael

[1] http://lists.w3.org/Archives/Public/public-rdb2rdf-wg/2011Jul/0183.html
--
Dr. Michael Hausenblas, Research Fellow
LiDRC - Linked Data Research Centre
DERI - Digital Enterprise Research Institute
NUIG - National University of Ireland, Galway
Ireland, Europe
Tel. +353 91 495730
http://linkeddata.deri.ie/
http://sw-app.org/about.html

On 2 Aug 2011, at 00:22, Eric Prud'hommeaux wrote:

> * Richard Cyganiak <richard@cyganiak.de> [2011-07-26 19:41+0100]
>> Hi all,
>>
>> The Direct Mapping document is stuck because we have a stalemate  
>> between the editors. With Last Call approaching, we need *some* way  
>> of breaking the stalemate. So here's a proposal. This is a possible  
>> new outline for the document, along with assignments of separate  
>> sections to separate editors.
>>
>>
>>    1. Introduction
>>       - What is this?
>>       - How does it relate to R2RML
>>       - Target audience, assumed level of knowledge
>>       - RDF terms and SQL/relational terms are used as defined in
>>         documents XXX and YYY
>>
>>    2. Example (Informative)
>>       - A simple two-table example
>>       - Quick explanation of foreign key handling
>>       - Quick explanation of tables w/o PKs
>>
>>    3. The Direct Mapping [in Plain English]
>>       - “The Direct Graph of a database is the union of the Table
>>          Graphs of all tables in the database.”
>>       - “The Table Graph of a table is the union of the Row Graphs...”
>>       - “The Row Graph of a row is ...”
>>       - ...
>
> PROPOSAL: that the English definition of the direct mapping be  
> defined as:
> [[
> The Direct Mapping is a formula for creating an RDF graph from the
> rows in a table. A base IRI defines a web space for the labels in
> this graph; all labels are generated by appending to the base.
>
> The functions scalar and reference extract the scalar and reference
> attributes (those participating in a foreign key) respectively:
>
> dfn references: the attributes in a table's foreign keys.
>
> dfn scalars: the attributes in a table which are NOT in any foreign
>   key.
>
> dfn non-unary references: the references for which the table's
>   foreign key is NOT composed of a single attribute.
>
> SQL table and attribute identifiers compose RDF IRIs in the direct
> graph. These identifiers are separated by the punctuation characters
> '#', ',', '/' and '='. All SQL identifiers are escaped following
> URL-encoding
> <http://www.w3.org/TR/html5/association-of-controls-and-forms.html#url-encoded-form-data>
> except that only the above punctuation and the characters not
> permitted in RDF IRIs are escaped.
>
> In the direct graph, there is an identifier for each row in a database
> table. If the row is in a table with a primary key, this is formed
> from the table name and the attribute names and values of each
> attribute in the primary key. If there is no primary key for the
> table, the row
> identifier is a fresh blank node:
>
> dfn row identifier:
>
>   if the table has a primary key with attributes, the relative IRI for
>   the row identifier is the concatenation of the table name, '/', and
>   a ','-separated concatenation of each attribute name, '=', and the
>   attribute value.
>
>   if the table has no primary key, the row identifier is a fresh blank
>   node.
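
To check my reading of the row identifier, here is a minimal Python sketch. The names row_identifier, escape and BlankNode are mine, not from the proposal. Note that escape() is deliberately stricter than the text above: it percent-encodes everything except unreserved ASCII characters, so it also encodes non-ASCII characters that an IRI would permit.

```python
from urllib.parse import quote


class BlankNode:
    """Stand-in for a fresh blank node; every instance is distinct."""


def escape(identifier):
    # Assumption: percent-encode everything except unreserved ASCII
    # characters. This covers '#', ',', '/' and '=', but is stricter
    # than the proposal, which keeps characters permitted in IRIs.
    return quote(str(identifier), safe="")


def row_identifier(table, primary_key, row):
    """Return a relative IRI, or a fresh BlankNode if there is no key.

    table:       SQL table name
    primary_key: list of primary key attribute names (may be empty)
    row:         dict mapping attribute name to value
    """
    if not primary_key:
        return BlankNode()
    pairs = ",".join(
        f"{escape(attr)}={escape(row[attr])}" for attr in primary_key
    )
    return f"{escape(table)}/{pairs}"
```

Under these assumptions, a row of People with primary key ID=7 gets the relative IRI People/ID=7, and each row of a table without a primary key gets a new BlankNode.
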
>
> A (potentially unary) list of attribute names in a table forms a
> property IRI:
>
> dfn property IRI: the concatenation of the table name, '/', and a
>   ','-separated concatenation of each attribute name, and a '#' at
>   the end of the property IRI.
>
> The values in a row are mapped to RDF literals:
>
> dfn literal map: a mapping from an SQL value with a datatype to an RDF
>   literal with an XML Schema datatype where the RDF literal has a
>   lexical value equivalent to the SQL lexical value and the datatype
>   mapping is found in this table:
>
> SQL         XSD datatype
> _________   ____________
> INT         http://www.w3.org/TR/xmlschema-2/#integer
> FLOAT       http://www.w3.org/TR/xmlschema-2/#float
> DATE        http://www.w3.org/TR/xmlschema-2/#date
> TIME        http://www.w3.org/TR/xmlschema-2/#time
> TIMESTAMP   http://www.w3.org/TR/xmlschema-2/#dateTime
> CHAR        plain literal
> VARCHAR     plain literal
> STRING      plain literal
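
The literal map reads naturally as a lookup table, so here is a rough Python sketch of it. The names DATATYPE_MAP and literal_map are hypothetical. The table above links to sections of the XML Schema datatypes spec; the sketch uses the corresponding datatype IRIs in the http://www.w3.org/2001/XMLSchema# namespace instead, and represents a plain literal as a datatype of None.

```python
XSD = "http://www.w3.org/2001/XMLSchema#"

# SQL type name -> XSD datatype IRI (None means "plain literal").
DATATYPE_MAP = {
    "INT": XSD + "integer",
    "FLOAT": XSD + "float",
    "DATE": XSD + "date",
    "TIME": XSD + "time",
    "TIMESTAMP": XSD + "dateTime",
    "CHAR": None,
    "VARCHAR": None,
    "STRING": None,
}


def literal_map(lexical, sql_type):
    """Return (lexical form, datatype IRI or None) for an SQL value.

    Assumption: the SQL lexical form is passed through unchanged; no
    normalisation is attempted. SQL types outside the table raise
    KeyError.
    """
    return (lexical, DATATYPE_MAP[sql_type.upper()])
```

One thing the sketch leaves open is TIMESTAMP: an SQL lexical form with a space between date and time is not a valid xsd:dateTime lexical form, which requires a 'T' separator, so "equivalent" probably needs spelling out for that row.
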
>
> The Direct Mapping is defined by a set of mapping functions from table
> rows to RDF triples:
>
> dfn direct mapping: the set of triples produced by invoking the
>   <table mapping> on each table in a database.
>
> dfn table mapping: the set of RDF triples created by invoking the
>   <row mapping> on each row in a table.
>
> dfn row mapping: using a row identifier S for the row,
>  the type triple:
>    (S, rdf:type, <table type>)
>  plus the scalar triples:
>    for each attribute in the list of <scalars> where the attribute
>      value is non-NULL:
>      (S,
>       the <property IRI> for the attribute,
>       the <literal map> for the attribute value).
>  plus the reference triples:
>    for each list of attributes in the <non-unary references> where
>      none of the attribute values are NULL:
>      (S,
>       the <property IRI> for the attributes,
>       the <row identifier> for the referenced row)
> ]]
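
To see how the pieces of the row mapping fit together, here is a rough Python sketch. It is only my reading, with several assumptions that are not in the proposal: every table has a primary key (no blank nodes), <table type> is the base IRI plus the table name, a reference triple is emitted for every foreign key (unary or not), each foreign key points at the key attributes of the referenced table, and literals are plain strings. All function names and the dict layout for tables are made up for illustration.

```python
from urllib.parse import quote

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def esc(value):
    # Assumption: stricter than the proposal; encodes all but
    # unreserved ASCII characters.
    return quote(str(value), safe="")


def row_iri(base, table, key_attrs, values):
    pairs = ",".join(f"{esc(a)}={esc(v)}" for a, v in zip(key_attrs, values))
    return f"{base}{esc(table)}/{pairs}"


def property_iri(base, table, attrs):
    return f"{base}{esc(table)}/{','.join(esc(a) for a in attrs)}#"


def row_mapping(base, table, row):
    """Return the list of (subject, predicate, object) triples for a row.

    table: dict with 'name', 'primary_key' (list of attribute names) and
           'foreign_keys' (list of dicts with 'attrs', 'ref_table',
           'ref_attrs')
    row:   dict mapping attribute name to value, None standing for NULL
    """
    name = table["name"]
    key = table["primary_key"]
    s = row_iri(base, name, key, [row[a] for a in key])

    # Type triple (assumed form of <table type>).
    triples = [(s, RDF_TYPE, base + esc(name))]

    # Scalar triples: non-NULL attributes that are in no foreign key.
    fk_attrs = {a for fk in table["foreign_keys"] for a in fk["attrs"]}
    for attr, value in row.items():
        if attr not in fk_attrs and value is not None:
            triples.append((s, property_iri(base, name, [attr]), str(value)))

    # Reference triples: foreign keys with no NULL attribute values.
    for fk in table["foreign_keys"]:
        values = [row[a] for a in fk["attrs"]]
        if all(v is not None for v in values):
            target = row_iri(base, fk["ref_table"], fk["ref_attrs"], values)
            triples.append((s, property_iri(base, name, fk["attrs"]), target))

    return triples
```

For a People row (ID=7, fname=Bob, addr=18) where addr references Addresses(ID), and a base of http://foo.example/DB/, this yields four triples: the type triple, scalar triples for ID and fname, and a reference triple pointing at http://foo.example/DB/Addresses/ID=18. If addr is NULL, the reference triple is dropped.
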
>
>>    A. Appendix: Formalisms (Informative)
>>       - should be crisp, short, precise, with only minimum
>>         explanation and examples
>>       A.1 Datalog Rules
>>       A.2 Denotational Semantics
>>       A.3 Set-Style Direct Mapping
>>
>>    B. Acknowledgements (Informative)
>>
>>    C. References
>>
>>
>> I see Juan and Marcelo editing A.1.
>>
>> I see Alexandre editing A.2.
>>
>> I see Eric editing 2 (which he already wrote), 3 (which *mostly*  
>> exists), and A.3.
>>
>> I don't know about 1, B, and C.
>>
>> My reasoning is that there is no objective way of picking any of  
>> the formalisms over another formalism, so the normative expression  
>> should be the lowest common denominator: plain English. By making  
>> the formalisms all informative, we free them from the burden of  
>> having to explain the direct mapping itself in a generally  
>> accessible way. The focus can be totally on presenting the  
>> formalisms in all their terseness to an audience that is familiar  
>> with datalog/denotational semantics/whatever.
>>
>> I hope this proposal aids discussion.
>>
>> Best,
>> Richard
>
> -- 
> -ericP
>

Received on Tuesday, 2 August 2011 07:01:34 UTC