- From: Freddy Priyatna <freddy.priyatna@gmail.com>
- Date: Wed, 4 Aug 2010 13:45:48 +0200
- To: Richard Cyganiak <richard@cyganiak.de>
- Cc: RDB2RDF WG <public-rdb2rdf-wg@w3.org>, Souripriya Das <souripriya.das@oracle.com>
- Message-ID: <AANLkTimFEXtWhJ7zg1MpkPnHK_=KwKeTLi79apAnSFHT@mail.gmail.com>
Dear All, I would like to ask a question related to the point 2.5:Lookup table. Does this lookup table work with range? I am aware that this lookup table is supported in D2R, but it features only one-to-one mapping. So, I was not able to produce a delivery lookup table with the following entries: - entry less than 3 value="express delivery" - entry between 3 and 7 value="normal delivery" - entry more than 7 value="slow delivery" Maybe if this lookup-table is to be included in future proposal, this could be a nice feature. Thanks and Regards Freddy On Tue, Aug 3, 2010 at 11:48 PM, Richard Cyganiak <richard@cyganiak.de>wrote: > All, > > Here is a collection of comments on the mapping language proposal that > Souri presented in today's call. It's great that we now have a concrete > proposal on the table that we can iteratively refine. I refer to the XML > schema and example instances here: > > > http://www.w3.org/2001/sw/rdb2rdf/wiki/An_XML_Schema_for_SQL-based_RDB2RDF_Mapping > > http://www.w3.org/2001/sw/rdb2rdf/wiki/Revised_Example_of_SQL-Query_based_Approach > > I have also previously studied Souri's presentation from the F2F, which > presented a previous version of the same idea in an abstract syntax: > > > http://www.w3.org/2001/sw/rdb2rdf/wiki/File:RDB2RDF_Map_viaSQL_F2F_at_SemTech2010.pdf > > I grouped my comments into three categories: > > 1. Non-functional improvements, that is, issues where I believe that the > same effect could be achieved in a better way; > > 2. Proposals for adding expressivity, to make sure that the language covers > all of RDF and is suitable for some real-world mapping tasks that currently > are impossible or very hard to achieve; > > 3. Proposals related to authoring convenience, that is, cases where > maintainability and usability can be improved by adding some syntactic sugar > or other optional features. > > As a further comment, I don't believe that XML is the right choice for the > surface syntax of this language, and I think that the mapping itself should > be expressed as an RDF graph. But that is a separate discussion, and I will > for now assume that an RDF representation of the mapping itself could be > derived easily at a later time. > > Cheers, > Richard > > > > 1. Non-functional improvements > ============================== > > 1.1. Don't enclose generated URIs in pointy brackets > ---------------------------------------------------- > The proposal demands the generation of URIs enclosed in pointy brackets, > for example: > > SELECT '<xyz.com/emp/' || empno || '>' AS empURI > > This is unnecessary, the following works just as well: > > SELECT 'xyz.com/emp/' || empno AS empURI > > > 1.2. Name of InstURIPropertyMap > ------------------------------- > The element defines instance URIs, but has nothing to do with properties, > it should be renamed to reflect that. > > > 1.3. Issues with GraphURIPropertyMap > ------------------------------------ > It appears that GraphURIPropertyName is unnecessary; just providing a way > of generating a graph name URI is sufficient for placing triples into an > appropriate graph. What is the use of this property? > > To selectively place different properties of a ClassMap in different > graphs, the proposal allows referencing of the PropertyMaps from the > GraphURIPropertyMap by specifying the RDF property name used in the > PropertyMap. This only works if we assume that RDF properties are unique > within a ClassMap. That's an unnecessary restriction. A cleaner design would > be to invert the direction of the reference, pointing from the PropertyMap > to the GraphURIPropertyMap that specifies where the generated triple should > reside. > > > 1.4. Issues with KeyMaps > ------------------------ > It appears that the proposal requires specification of all key constraints > in the mapping language. Often this will duplicate information that can be > determined automatically from the database schema. This should be avoided if > at all possible. > > It appears that the proposal works by expressing constraints (PK, unique, > FK) on RDF properties. It would be better to express them as constraints on > the columns in the result sets generated by the views/queries. This is > because the constraint-carrying column may actually not be exposed directly > in the generated RDF. For example, the PK column of a table is often not > mapped to a property, but becomes part of the subject URI. Thus, in the emp > table, the PK constraint should not be expressed on the emp:empno property, > but on the empno (or empURI) column in the view. > > > 2. Adding expressivity > ====================== > > 2.1. Allow blank nodes as subjects > ---------------------------------- > The proposal only allows URIs in the subject positions of triples. RDF also > allows blank nodes. This could be addressed by allowing an InstBNodeMap with > one or more DB columns in place of the InstURIPropertyMap. > > > 2.2. Allow language tags for literals > ------------------------------------- > The proposal does not allow the specification of language tags on literals. > This could be addressed by adding an attribute to the PropertyMap. It is > worth exploring wether computed language tags, whose values come from a view > column rather than from a constant attribute value, should be supported as > well. > > > 2.3. Allow custom datatypes for literals > ---------------------------------------- > The proposal doesn't yet fully describe how datatypes would be handled, but > the intention seems to be that the datatype is determined by the expression > that defines the view column. There language should support overriding the > datatype with an arbitrary datatype URI. It would be the mapping author's > responsibility to ensure that the view column contains values that are legal > lexical forms for that datatype. The datatype could be specified in an > attribute of the PropertyMap. It is worth exploring wether computed > datatypes should be allowed. > > > 2.4. Computed property URIs > --------------------------- > It is worth exploring wether computed property URIs should be allowed. In > addition to the ProperyName attribute, which provides a constant property > URI, there could be a PropertyColumnName that refers to a view column which > contains computed property URIs. A use case is a table with an > entity-attribute-value schema, which is reasonably common in complex > database schemas. > > > 2.5. Lookup tables > ------------------ > The proposal doesn't support the use of lookup tables. Often, a database > column contains a short alphanumeric code whose meaning is described in some > documentation outside of the database. If this is to be mapped into “nice” > RDF, with proper meaningful URIs instead cryptic code, then a lookup table > has to be used. If that table is not part of the database, and we cannot > provide it as part of the database, then the lookup table has to be provided > as part of the mapping. A sketch: > > <LookupTable id="expense-types"> > <Entry code="04" value="http://example.org/expense-types/airfare" /> > <Entry code="05" value="http://example.org/expense-types/bus-train-ferry" > /> > <Entry code="24" value="http://example.org/expense-types/subsistence" /> > <Entry code="53" value="http://example.org/expense-types/hospitality" /> > </LookupTable> > > <ClassMap ...> > <SQLdefString>SELECT ... empno ...</SQLdefString> > </ClassMap> > > <PropertyMap DBColumnName="exptype" LookupTable="expense-types" > PropertyName="ex:expenseType" /> > > In theory, this can already been done by writing a lot of nested IFF > statements in the view definition, but it appears to be a reasonably common > case that warrants special syntax. > > > 3. Usability and authoring considerations > ========================================= > > 3.1. Support for URI templates > ------------------------------ > While it is possible to construct pretty much arbitrary URIs in SQL by > means of concatenation and string functions, SQL is not always the best > approach. There should be a way of constructing URIs from view columns > outside of the SQL query. Reasons include: i) The same URI-generating > expression is often needed multiple times throughout a mapping, which leads > to redundancy and imposes maintenance costs. ii) URI generation often > involves string escaping, case changes etc, which quickly leads to unwieldy > SQL expressions. iii) Standard SQL functions may not cover all needs for URL > character escaping. A proposal would be to use a syntax inspired by the URI > Templates draft [1] to declare URI patterns at the beginning of the file. > The patterns can then be referenced from throughout the file. This would be > optional; SQL expressions could still be used for URI construction. > > [1] http://www.ietf.org/id/draft-gregorio-uritemplate-04.txt > > Example: > > <URITemplate id="employee" value="http://example.com/emp/{empno}<http://example.com/emp/%7Bempno%7D>" > /> > > <ClassMap ...> > <SQLdefString>SELECT ... empno ...</SQLdefString> > <InstURIPropertyMap template="employee" /> > ... > </ClassMap> > > > 3.2. Multiple maps for a single class > ------------------------------------- > I believe that the proposal assumes that the value of @ClassName on the > ClassMap element is unique throughout In many schemas there might be > multiple tables whose records represent instances of the same domain > vocabulary class. For example, a mapping that generates contact information > from a university database might map records in the STAFF table and in the > STUDENT table to the same class foaf:Person. This *can* be achieved with a > single ClassMap, by forming a SQL UNION of the two tables after appropriate > transformations. But this can be awkward because the two tables could have > quite different columns, e.g., staff have OFFICE_NO and CONSULTATION_HOURS, > while students have MAJOR and FACEBOOK_ID. For authoring convenience it > should be allowed to specify two separate ClassMaps, rather than shoehorning > everything into a single view. (At that point, ClassMap should probably be > renamed to ViewMap or QueryMap or something similar, because it then > represents the mapping from a single SQL view/query, and no longer the > mapping to a single class.) > > I note that both 3.1 and 3.2 are helpful to support scenarios where an > ontologist prepares the mapping, and a DBA fills in the SQL queries. > > > 3.3. Referring to a table instead of a query/view > ------------------------------------------------- > To make the simple case simple, there should be a <tableName> element that > could be used in place of <SQLdefString>. The following markup > > <tableName>EMP</tableName> > > would be a shorthand for the following: > > <SQLdefString>SELECT * FROM EMP</SQLdefString> > > > 3.4. Allow constant values in place of column names > --------------------------------------------------- > There are reasonably frequent cases where some constant value is to be > added to all resources generated by a ClassMap. For example, in the case of > the emp table, a triple of the shape > > <http://xyz.com/emp/1234> emp:employeeOf <http://xyz.com/the_company> .. > > might be added to each employee resource. Note that only the subject of the > triple depends on the query result; predicate and object are the same for > all triples. This could be achieved by having > > SELECT ... '<http://xyz.com/the_company>' AS company > > in the query, along with a PropertyMap > > <PropertyMap PropertyName="emp:employeeOf" RDFTerm="IRI" > DBColumnName="company"/> > > but this seems awkward and inefficient; a better alternative relying on > syntactic sugar would be > > <PropertyMap PropertyName="emp:employeeOf" RDFTerm="IRI" > ConstantValue="http://xyz.com/the_company" /> > > and omit the company column from the query. >
Received on Wednesday, 4 August 2010 16:12:18 UTC