ISSUE-34, tableName

All,

Continuing the discussion from the call.

First, please review what the spec currently says:
http://www.w3.org/2001/sw/rdb2rdf/r2rml/#physical-tables

I'll explain the proposal in a bit more detail, for the sake of clarity. The proposal is to drop rr:tableOwner, and instead state that a "table or view name" MAY be qualified to include a schema name and a catalog name.

The intention is that all of the following are valid values for rr:tableName:

   emp
   EMP
   SCOTT.EMP
   CAT.SCOTT.EMP
   "Emp"
   "Scott Smith".EMP
   SCOTT."!@#$%^&"

If catalog or schema are absent, then the defaults of the SQL connection are used. (Note that's nothing new; rr:tableOwner was always optional.)

My arguments against other proposed designs are as follows:

Proposal: no change
Objection: rr:tableOwner is nonstandard terminology and the standard term is schema.

Proposal: just rename rr:tableOwner to rr:tableSchema
Objection: A fully qualified table name is [[Catalog.]Schema.]Table, and there is no case for separating [Catalog.]Schema from Table

Proposal: have three properties rr:tableName, rr:schemaName, rr:catalogName (or similar names)
Objection: This works and is consistent. But it puts an undue burden on users, who would have to use three properties for what can be achieved with one. Users can already write fully qualified table names in catalog.schema.table notation in rr:sqlQuery, so why should rr:tableName work any differently?

Proposal: add rr:fullyQualifiedTableName as an alternative to the current design
Objection: Having two different ways of doing exactly the same thing creates extra cognitive load for mapping authors, extra work for implementers, and extra work for editors, with no benefit.

Proposal: allow fully qualified names, but call the property rr:objectName or similar to make clear it's something different from an unqualified table name
Objection: The SQL spec states that a table name may include schema and catalog, so calling it rr:tableName is consistent.

Objection: tableName is not intuitive because it doesn't mention views
Counter: The SQL spec says that views are identified by table names. One can use a view almost everywhere in SQL where a table is allowed, and this is usually simply assumed rather than spelled out by writing "table or view" every time. The R2RML spec very explicitly states, in multiple places, that views are allowed too.

Objection: Parsing fully qualified table names places undue burden on implementers
Counter: It's not hard.
Regex for an undelimited identifier: [^."]+
Regex for a delimited identifier: ("[^"]*")+
Regex for an identifier: (delimited|undelimited)
Regex for a qualified table name: ((identifier\.)?identifier\.)?identifier
(This regex is just for matching and splitting the three parts; it doesn't validate the characters that are allowed in undelimited identifiers. That's orthogonal to ISSUE-34.)
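
To illustrate, here is a minimal Python sketch of the splitting described above. It is only an illustration, not proposed spec text or a reference implementation. The function name split_table_name and the choice to return None for absent parts are hypothetical. It uses [^."]+ for undelimited identifiers (excluding the double quote as well as the dot) so that a dot inside a delimited identifier is never taken as a separator. Like the regexes above, it only splits; it does not validate identifier characters.

```python
import re

# Delimited identifier: one or more "..." runs, so a doubled "" acts as an
# escaped quote inside the identifier.
DELIMITED = r'(?:"[^"]*")+'
# Undelimited identifier: any run of characters other than dot or double quote.
# (Assumption for this sketch: no validation of which characters SQL allows.)
UNDELIMITED = r'[^."]+'
IDENTIFIER = rf'(?:{DELIMITED}|{UNDELIMITED})'

# Qualified table name: [[catalog.]schema.]table
QUALIFIED_NAME = re.compile(
    rf'(?:(?:(?P<catalog>{IDENTIFIER})\.)?(?P<schema>{IDENTIFIER})\.)?'
    rf'(?P<table>{IDENTIFIER})'
)


def split_table_name(name):
    """Split a possibly qualified table name into (catalog, schema, table).

    Absent parts are returned as None. Delimited identifiers keep their
    surrounding double quotes. Raises ValueError if the string cannot be
    split into one, two, or three identifiers.
    """
    match = QUALIFIED_NAME.fullmatch(name)
    if match is None:
        raise ValueError(f"not a valid table name: {name!r}")
    return match.group("catalog"), match.group("schema"), match.group("table")
```

With this sketch, split_table_name('CAT.SCOTT.EMP') gives ('CAT', 'SCOTT', 'EMP'), and split_table_name('"Scott Smith".EMP') gives (None, '"Scott Smith"', 'EMP'). An implementation would then fall back to the connection defaults for any part that is None.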

So my proposal is still:

[[
PROPOSAL: To resolve ISSUE-34, drop rr:tableOwner, and instead state that rr:tableName MAY be qualified to include a schema name and a catalog name
]]

Received on Tuesday, 12 July 2011 18:27:32 UTC