Re: ISSUE-69 (datatype sizes): datatype sizes

Hi Eric,

(Cc'ing the comments list; please consider this an LC comment, and an announcement of my intent to formally object to http://www.w3.org/TR/rdb-direct-mapping/#defn-literal_map if it isn't changed to something reasonable.)

Summary: As a matter of principle, I believe that user concerns should always override abstract concerns such as “conformance boundaries” or “interop spaces”. A mapping that fulfils user requirements is useful regardless of whether it's a standard or not. A mapping that fails to fulfil user requirements – such as one that arbitrarily truncates literal values – *will* fail as a standard because conforming implementations will be – by definition – buggy.

On 2 Oct 2011, at 17:20, Eric Prud'hommeaux wrote:
> In case you don't understand my intent, I am not trying to force implementers to discard information

But that is what the spec text you wrote requires.

> For example, if X is an implementation of the Direct Mapping, can I expect a query to return < 0001-01-01T00:00?

SQL 2008 specifies YEAR as a non-negative integer.

> Having no conformance boundary means that users can't expect anything more than 1 byte integers,

That's nonsense. The wording in R2RML precisely defines the translation of integers with more than 1 byte.

> implementors can't stamp their products

That's nonsense. The wording in R2RML precisely states what is required in order to call an implementation conforming.

> and we have no way to write test cases.

That's nonsense. The wording in R2RML contains testable assertions that absolutely can be written up as test cases.

> The behavior you're trying to encourage is that people don't stop at the minimal implementation, that they not exclude data from the graph simply because it is outside of the interop space. I agree with this goal, but want to state it in a way that doesn't erode the utility of having a standard.

I'm sorry but you have this *very* backwards.

Let's forget about standards for a minute and just talk about mappings from relational databases to RDF.

If there's a 38-digit DECIMAL column in a database (nothing uncommon), and the target RDF implementation supports 38-digit xsd:decimals (nothing uncommon), then the obvious Right Thing to do is to preserve all 38 digits. There can be absolutely no doubt about that. Everything else would be considered a bug or limitation of the implementation.

Your point of view is, paraphrased:

1) that preserving all 38 digits might be difficult for some vendors, and therefore should not be required of all vendors in order to be able to claim conformance;

2) that therefore, vendors who want to claim conformance must discard anything beyond the minimum required by the various relevant specs (in this case, xsd:decimal's 18 digits).

I disagree with the first point. The standard should ask vendors to do the Right Thing, even if it's difficult. “Conformance” is generally something that implementations should *aspire* to; whether they reach it depends not just on their general willingness to support standards, but also on business decisions such as user demand for advanced features and other questions of resource availability. It is common and not a problem to have well-documented limitations and caveats in one's implementation of a standard.

However, the second point is completely and utterly unacceptable. The standard cannot enshrine behaviour that would generally be seen as a bug. Any potential failure of interoperability is irrelevant compared to this failure of delivering a reasonable and correct mapping in the first place!

The truncations you describe are also pointless from a technical point of view. Every database under the sun supports at least 255 characters in a VARCHAR, while 80-something digits in a DECIMAL are quite sufficient to express the number of atoms in the universe. Why truncate to 18???
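
To make the difference concrete, here is a minimal sketch in Python. It is only an illustration, not spec text: the function names `to_xsd_decimal_lexical` and `truncate_to_digits` are hypothetical, and the second one models a mapper that cuts values down to 18 significant digits, which is how I read the current wording.

```python
from decimal import Decimal, Context, ROUND_DOWN


def to_xsd_decimal_lexical(value: Decimal) -> str:
    """Render a Decimal as a plain digit string, keeping every digit.

    The "f" format avoids exponent notation, which the xsd:decimal
    lexical space does not allow.
    """
    return format(value, "f")


def truncate_to_digits(value: Decimal, digits: int = 18) -> Decimal:
    """Hypothetical truncating mapper: keep only `digits` significant digits."""
    return Context(prec=digits, rounding=ROUND_DOWN).plus(value)


# A 38-digit value, as might come out of a DECIMAL(38, 9) column.
sql_value = Decimal("12345678901234567890123456789.012345678")

lossless = to_xsd_decimal_lexical(sql_value)
truncated = to_xsd_decimal_lexical(truncate_to_digits(sql_value))

print(lossless)   # all 38 digits survive
print(truncated)  # everything after the 18th digit is gone
```

The first mapping round-trips: parsing the literal gives back exactly the value that was in the database. The second one silently produces a different number.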

> Are implementations *required* to interrogate the database name and version in order to trigger vendor-specific behavior?

Yes, if they want to be fully conforming. Every database-to-RDF mapper that works on more than one database engine is doing that already because every vendor implements SQL slightly differently.
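
As a rough sketch of what that typically looks like (in Python, with hypothetical names throughout): the mapper reads the product name reported by the database driver and looks up the vendor-specific facts it needs. The `Dialect` type, the `DIALECTS` table and `dialect_for` are made up for illustration, and the precision figures are what I believe the vendors document; they should be checked against the vendor documentation before anyone relies on them.

```python
from typing import NamedTuple, Optional


class Dialect(NamedTuple):
    """Vendor-specific facts a mapper might need (hypothetical structure)."""
    # None means "no known limit; pass values through unchanged".
    max_decimal_precision: Optional[int]


# Illustrative values only; verify against each vendor's documentation.
DIALECTS = {
    "mysql": Dialect(max_decimal_precision=65),
    "oracle": Dialect(max_decimal_precision=38),
    "microsoft sql server": Dialect(max_decimal_precision=38),
}

# Unknown engines get no special treatment rather than an arbitrary cut-off.
GENERIC = Dialect(max_decimal_precision=None)


def dialect_for(product_name: str) -> Dialect:
    """Pick vendor-specific behaviour from the product name the driver reports."""
    return DIALECTS.get(product_name.strip().lower(), GENERIC)


print(dialect_for("MySQL"))
print(dialect_for("SomeOtherDB"))
```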

> As a user, how much can I expect of the implementer?

I don't understand that question. A user can expect an implementation to do whatever the implementer says it does. You get what you paid for.

> As an implementer, when can I say I implement the DM.

You have implemented the DM when your implementation does everything that's required by the spec?

Best,
Richard

Received on Tuesday, 4 October 2011 13:05:25 UTC