- From: Pierre-Antoine Champin <pierre-antoine@w3.org>
- Date: Sat, 21 Jan 2023 00:27:42 +0100
- To: Souripriya Das <souripriya.das@oracle.com>
- Cc: RDF-star WG <public-rdf-star-wg@w3.org>
- Message-ID: <ebdbdff5-42b5-f91c-fc8c-4ad24c570e1a@w3.org>
Dear Souri,

I wanted to react to your presentation during the RDF-star call yesterday, especially about the "future proof modelling" argument. Consider the following two examples.

Example 1: consider a relational data model with a table Person and a table Company. The table Person contains a column "worksFor" that is a foreign key to Company. At some point, we need to represent the fact that a given person works for two different companies at the same time. Currently, this requires changing the model (replacing the column Person.worksFor with a new table WorksFor that has two foreign keys, one to Person and one to Company). Following your logic to the extreme, this would be an argument for extending the relational model to allow multiple values in a column, so that this use case could be accommodated without changing the original model. This would make the relational model much more complex, and would probably not be worth it.

Example 2: consider an RDF graph where a property :postalAddress has domain :Person and range xsd:string. This is all very well, until someone wants to describe the addresses themselves (separate their different "fields", link them to an entity of type city rather than a city name, add geo-coordinates to an address...). This would require a change in the model, where :postalAddress now points to an IRI or blank node, which would carry the original string in its rdf:value property, but could carry additional properties as well. Following your logic to the extreme, someone could argue in favor of allowing string literals in the subject position, so that they could add properties to the "address string" without changing the original model. This would be a very bad idea, because it would conflate strings with the addresses that they represent.
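
To make the remodelling in Example 1 concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table and column names (Person, Company, worksFor, WorksFor) come from the example; the row values ('Alice', 'Acme', 'Globex') and the other columns are made up for illustration, and the old Person.worksFor column is simply left in place rather than dropped.

```python
import sqlite3

con = sqlite3.connect(":memory:")

# Original model: each Person row can point to at most one Company.
con.executescript("""
    CREATE TABLE Company (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE Person (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        worksFor INTEGER REFERENCES Company(id)
    );
    INSERT INTO Company (id, name) VALUES (1, 'Acme'), (2, 'Globex');
    INSERT INTO Person (id, name, worksFor) VALUES (1, 'Alice', 1);
""")

# Remodelling: move the relationship into its own table with two
# foreign keys, and copy the existing links over.
con.executescript("""
    CREATE TABLE WorksFor (
        person INTEGER NOT NULL REFERENCES Person(id),
        company INTEGER NOT NULL REFERENCES Company(id),
        PRIMARY KEY (person, company)
    );
    INSERT INTO WorksFor (person, company)
        SELECT id, worksFor FROM Person WHERE worksFor IS NOT NULL;
""")

# A second employer (illustrative data) is now just one more row.
con.execute("INSERT INTO WorksFor (person, company) VALUES (1, 2)")

employers = [
    name
    for (name,) in con.execute(
        """
        SELECT Company.name
        FROM WorksFor JOIN Company ON Company.id = WorksFor.company
        WHERE WorksFor.person = 1
        ORDER BY Company.name
        """
    )
]
print(employers)
```

Every query that used to read Person.worksFor has to be rewritten as a join on WorksFor, which is exactly the cost of remodelling that the "future proof" argument tries to avoid.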

(NB: my point here is not to say that "literals as subjects" is a bad idea per se, but that it would be a bad solution to this particular problem.)

My point here is that remodelling cannot always be avoided, or rather that avoiding it would either overly complicate the model (Example 1) or lead to even worse modelling (Example 2).

So yes, we should strive to make the user's life easier. But we must keep in mind that this is a trade-off. The cure is sometimes worse than the disease.

RDFn makes the inner model more complex (à la Example 1 above):

- It adds a "4th column" to every triple. IIUC, you seem to assume that all implementations already deal with some form of triple identifier, so that all we need is to expose it to the user. But I am not sure that all implementations have such an internal identifier (I am actually pretty sure that some don't).
- Somehow, it turns graphs, which are currently sets of triples, into multisets of triples. And multisets are tricky. What happens, for example, when you merge two graphs containing an identical triple? Is it the same triple? Two triples with different "default" identifiers? What happens when you use SPARQL UPDATE to remove a triple? Do you remove only one of them or all of them?

Can of worms ahead...

pa
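
PS: the merge and delete questions above can be made concrete with a small sketch. It uses Python's set and collections.Counter as stand-ins for "set of triples" and "multiset of triples". This is only an illustration under that assumption: it does not reflect how RDFn or any triple store actually behaves, and the triples themselves are made up.

```python
from collections import Counter

# Triples as plain (subject, predicate, object) tuples; illustrative data.
shared = (":alice", ":worksFor", ":acme")
g1 = {shared, (":alice", ":name", "Alice")}
g2 = {shared}

# Set semantics (graphs as they are today): the identical triple collapses.
set_merge = g1 | g2

# Multiset semantics: the identical triple is now present twice.
bag_merge = Counter(g1) + Counter(g2)

# Removing the triple is unambiguous for a set...
set_after_delete = set_merge - {shared}

# ...but a multiset has to choose between two behaviours.
bag_delete_one = bag_merge - Counter([shared])  # one copy survives
bag_delete_all = Counter(
    {t: n for t, n in bag_merge.items() if t != shared}
)  # no copy survives

print(len(set_merge), bag_merge[shared])
print(bag_delete_one[shared], bag_delete_all[shared])
```

Both delete behaviours are defensible, which is the problem: a specification would have to pick one, and every implementation would have to agree on it.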
Received on Friday, 20 January 2023 23:27:45 UTC