- From: Eric Prud'hommeaux <eric@w3.org>
- Date: Sun, 30 Oct 2011 19:11:40 -0400
- To: Richard Cyganiak <richard@cyganiak.de>
- Cc: public-rdb2rdf-comments@w3.org
* Richard Cyganiak <richard@cyganiak.de> [2011-10-27 11:27+0100]
> This is a Last Call comment on the Direct Mapping and R2RML specifications:
>
> http://www.w3.org/TR/2011/WD-rdb-direct-mapping-20110324/
> http://www.w3.org/TR/2011/WD-r2rml-20110920/
>
>
> Both specifications define a mapping from SQL datatyped values to RDF literals.
>
> http://www.w3.org/TR/2011/WD-rdb-direct-mapping-20110920/#defn-literal_map
> http://www.w3.org/TR/2011/WD-r2rml-20110920/#datatype-conversions
>
> The two mappings differ in various details. Given that the requirements for both mappings are the same, this places undue burden on implementers that plan to implement both the Direct Mapping and R2RML.
>
> Therefore, both specifications should use the same mapping.
>
> I note that the mapping in R2RML is based on the SQL-to-XML mapping in ISO/IEC 9075-14:2008, and covers more of SQL 2008 than the mapping in the DM.
>
> I therefore propose that the mapping from the R2RML specification is used in both documents, with the DM specification using a normative reference to the R2RML spec.

The current text in DM is:
[[
The values in a row are mapped to RDF literals. The Direct Graph is defined for a set of typed values which are defined in minimally-conformant SQL processors and expressible in minimally-conformant XML Schema Datatypes implementations. The literal map provides mapping algorithms for these datatypes:

┌────────────────────────────────────────────────────────────────────────────────────────────────────
│ Definition literal map: a mapping from an SQL value with a datatype to:
│
│   * for the SQL datatypes CHAR, VARCHAR and STRING, a Plain literal with the lexical value of the SQL value.
│   * for the SQL datatypes listed in this table, a Typed literal with the this datatype and lexical form:
│
│     SQL datatype                                 RDF datatype      Lexical form
│     BINARY, BINARY VARYING, BINARY LARGE OBJECT  xsd:base64Binary  XML Schema base64 encoding of value
│     NUMERIC, DECIMAL                             xsd:decimal       SQL result of: CAST(value AS CHARACTER VARYING(18))
│     SMALLINT, INTEGER, BIGINT                    xsd:integer       SQL result of: CAST(value AS CHARACTER VARYING(18))
│     FLOAT, REAL, DOUBLE PRECISION                xsd:double        SQL result of: CAST(value AS CHARACTER VARYING(23))
│     BOOLEAN                                      xsd:boolean       SQL result of: IF (value, 'true', 'false')
│     DATE                                         xsd:date          SQL result of: CAST(value AS CHARACTER VARYING(13))
│     TIME                                         xsd:time          SQL result of: CAST(value AS CHARACTER VARYING(23))
│     TIMESTAMP                                    xsd:dateTime      SQL result of: REPLACE(CAST(value AS CHARACTER VARYING(37)), " ", "T")
└────────────────────────────────────────────────────────────────────────────────────────────────────

Extensions to the Direct Mapping should note the spirit of this mapping, i.e. to use a valid representation of an XML Schema Datatype corresponding to the SQL datatype. For numerics, booleans and dates, the canonical XML Schema lexical representation is used. Extensions are likely to map data outside of the minimal SQL conformance into data types with higher precision than those specified by the literal map.
]] — http://www.w3.org/2001/sw/rdb2rdf/directMapping/LC/Overview.html#minimal-DG

The DM and R2RML differ principally in that the DM asserts a finite (datatype) domain (18 digit integers, IEEE754 doubles, etc.) while R2RML leaves the domain dependent on the database being queried. A tool which uses e.g. floats or ints to manipulate the graph defined by R2RML would have to qualify its conformance by the version of the database to which it was connected (e.g. "offers R2RML for MySQL 5.01, but not Oracle 11G").
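
To make the finite-domain point concrete, here is a minimal Python sketch of a literal map over a few of the datatypes above. It is only an illustration under stated assumptions: the helper name literal_map and the constant INTEGER_BOUND are hypothetical, and reading "18 digit integers" as |value| < 10**18 is my assumption, not spec text.

```python
import base64

XSD = "http://www.w3.org/2001/XMLSchema#"

# Assumption for illustration: "18 digit integers" is read as |value| < 10**18.
INTEGER_BOUND = 10**18


def literal_map(sql_type, value):
    """Hypothetical helper: map an SQL value to (lexical form, datatype IRI).

    A datatype of None stands for a plain literal.
    """
    sql_type = sql_type.upper()
    if sql_type in ("CHAR", "VARCHAR", "STRING"):
        return (str(value), None)
    if sql_type in ("SMALLINT", "INTEGER", "BIGINT"):
        # Python ints are unbounded, so the finite domain is enforced explicitly.
        if abs(value) >= INTEGER_BOUND:
            raise ValueError("integer outside the assumed 18-digit domain")
        return (str(value), XSD + "integer")
    if sql_type == "BOOLEAN":
        return ("true" if value else "false", XSD + "boolean")
    if sql_type in ("BINARY", "BINARY VARYING", "BINARY LARGE OBJECT"):
        return (base64.b64encode(value).decode("ascii"), XSD + "base64Binary")
    raise NotImplementedError("no mapping sketched for " + sql_type)


print(literal_map("INTEGER", 42))
print(literal_map("BOOLEAN", False))
```

With a fixed bound like this, a tool can state its conformance once, rather than per database version; without it, the accepted range would depend on whatever the connected database happens to support.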
General compatibility with R2RML over any database can only be preserved if you don't use native types at any step of, e.g., the query answering process. Applying the unbounded precision support to DM would mean that FeDeRate would no longer be an implementation (it uses Jena to parse and execute queries, which I believe uses Java native types) and SWObjects would have an even harder time as it is intended to connect multiple databases with potentially different maximum precisions.

In <http://www.w3.org/mid/20111011150033.GA10078@w3.org>, I proposed to simplify this to use "the canonical XML Schema lexical representation of this domain:…". I also propose that we ditch the lexical recipes as pretty much no one will use them anyways, given that native drivers give either native datatype or lexical values for queried attributes:
[[
┌────────────────────────────────────────────────────────────────────────────────────────────────────
│ Definition literal map: a mapping from an SQL value with a datatype to:
│
│   * for the SQL datatypes CHAR, VARCHAR and STRING, a Plain literal with the lexical value of the SQL value.
│   * for BINARY, BINARY VARYING, BINARY LARGE OBJECT, a xsd:base64Binary with the XML Schema base64 encoding of the SQL value.
│   * for BOOLEAN, a xsd:boolean with a lexical value of 'true' or 'false'.
│   * for the SQL datatypes listed in this table, a Typed literal with the this datatype and the canonical lexical form of:
│
│     SQL datatype                   Value range                                                  RDF datatype
│     NUMERIC, DECIMAL               -10^18 to 10^18                                              xsd:decimal
│     SMALLINT, INTEGER, BIGINT      -10^18 to 10^18                                              xsd:integer
│     FLOAT, REAL, DOUBLE PRECISION  IEEE754 double                                               xsd:double
│     DATE                           0001-01-01 to 9999-12-31                                     xsd:date
│     TIME                           00:00:00-14:00 to 23:59:59.9999+14:00                        xsd:time
│     TIMESTAMP                      0001-01-01T00:00:00-14:00 to 9999-11-31T23:59:59.9999+14:00  xsd:dateTime
└────────────────────────────────────────────────────────────────────────────────────────────────────
]]

(BTW, we don't necessarily need to limit the early dates to 0001-01-01. While SQL doesn't demand a representation for the year 0, XML Schema does.)

Thoughts?

> Best,
> Richard

-- 
-ericP
Received on Sunday, 30 October 2011 23:12:13 UTC