Re: A case for a datatype for identifying RDF graphs

Hello PAC,


On 19/05/2021 at 18:53, Pierre-Antoine Champin wrote:
> Hi Antoine,
> 
> I have been meaning to reply to your mail for some time. Sorry for the delay.

Likewise.

> 
> [...]
> 
>>
>>
>> Long version:
>> ============
>>
>> Datatypes in RDF are used in order to refer to specific values that 
>> can be processed adequately by computer systems, as opposed to 
>> entities from the real world that can merely be referred to in such 
>> systems without access to the real thing. For instance, if one wants 
>> to refer to the integer "one", it is much better to use 
>> "1"^^xsd:integer, that any system understands as the number "one" -- 
>> less than "2"^^xsd:integer and more than "0"^^xsd:integer -- rather 
>> than using an identifier like 
>> http://km.aifb.kit.edu/projects/numbers/n1 that may refer to a real 
>> world entity of which the system has no understanding.
>>
>> If one wants to refer to a *specific* instant in time, it is 
>> preferable to identify it using xsd:dateTime rather than minting a URI 
>> for the instant and relating it to its year, month, day, etc.
>>
>> If one wants to refer to a *specific* geometry on the Earth, it is 
>> better to use a geo:wktLiteral rather than introducing a URI and a 
>> vocabulary that would define the geometry only partially and in a complex way.
> 
> Theoretically, literals are cleaner, granted.

Theory should tell you what happens in practice. If practice does not 
match theory, then you have a bad theory. "In theory, if you use this 
amount of fuel with a rocket of this shape, launch it at this angle at 
this time of the year from this place, you should be able to go to the 
moon. But in practice, everyone knows that it is impossible to go to the 
moon."

> 
> In practice, I see this more as a trade-off, and the best answer depends 
> on your use-cases. For dates, it is sometimes useful to be able to 
> access their "parts" by simply following an arc.

I don't agree that it is simpler to locate a node that stands for a 
date, find the relevant arcs and get a value for a component of a date. 
It's much easier to have a node that's self-described as an xsd:date or 
xsd:dateTime and use standard date-time functions to extract components.

The "follow the arc" solution requires graph matching, which is costly.
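
To make the comparison concrete, here is a minimal sketch in Python, 
using only the standard library. The "ex:" vocabulary and the 
follow_arc helper are made up for illustration, and 
datetime.fromisoformat only covers simple xsd:dateTime lexical forms 
like the one shown, not the whole lexical space.

```python
from datetime import datetime

# Option 1: the date is a typed literal; its lexical form carries everything.
lexical_form = "2021-05-19T18:53:00"  # as in "2021-05-19T18:53:00"^^xsd:dateTime
value = datetime.fromisoformat(lexical_form)
year_from_literal = value.year

# Option 2: the date is a node described by arcs (hypothetical ex: vocabulary).
triples = {
    ("ex:date1", "ex:year", 2021),
    ("ex:date1", "ex:month", 5),
    ("ex:date1", "ex:day", 19),
    ("ex:other", "ex:year", 1999),
}

def follow_arc(graph, subject, predicate):
    """Return the objects of all triples matching (subject, predicate, ?o)."""
    return [o for (s, p, o) in graph if s == subject and p == predicate]

# Every component lookup is a pattern match against the graph.
year_from_arcs = follow_arc(triples, "ex:date1", "ex:year")[0]
```

In the first option the component comes from the value itself; in the 
second, each component needs its own match against the graph.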

Nodes and arcs to describe a date are useful if you are uncertain about 
some components of a date. For instance, birthdates of historical 
figures of the Middle Ages are often unknown but identifiable within a 
certain range.

> (this is where N3 shines, by the way, but that's another topic)
> 
>>
>> If one wants to refer to a specific RDF graph, that could be processed 
>> based on the datatype that identifies it as such, it would be more 
>> efficient and robust to use a literal rather than a URI that 
>> supposedly identifies it.
>>
>> I see many discussions that attempt to address the problem of 
>> identifying specific RDF graphs (or specific triples) using extensions 
>> of RDF, such as named graphs, or RDF-star, when all we need is a 
>> datatype that is fully inside the scope of RDF syntax and semantics.
> 
> Again, theoretically you are right; however... the specifics of that 
> datatype would still have to be built into RDF implementations...

Again, there is no theory vs practice. You take the specification of a 
graph-literal datatype, implement it within your RDF API, and then you 
have a way to refer to RDF graphs, *in practice*. Such a practical 
implementation conforms to the RDF specification as is.
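
As a rough sketch of what the lexical-to-value mapping of such a 
datatype could look like (not an actual specification): the datatype 
IRI below is hypothetical, and the parser is an assumption that only 
accepts N-Triples-like lines made of three IRIs, with no blank nodes 
and no literals.

```python
import re

# Hypothetical datatype IRI, used only for illustration.
GRAPH_DATATYPE = "http://example.org/datatype#graphLiteral"

# Matches one "<s> <p> <o> ." line; only IRI terms are supported in this sketch.
TRIPLE_LINE = re.compile(
    r"^\s*<([^<>\s]+)>\s+<([^<>\s]+)>\s+<([^<>\s]+)>\s*\.\s*$"
)

def graph_value(lexical_form):
    """Map a lexical form to its value: a frozenset of (s, p, o) triples."""
    triples = set()
    for line in lexical_form.splitlines():
        if not line.strip():
            continue  # ignore empty lines
        match = TRIPLE_LINE.match(line)
        if match is None:
            raise ValueError(f"ill-formed graph literal line: {line!r}")
        triples.add(match.groups())
    return frozenset(triples)

t1 = "<http://example.org/s> <http://example.org/p> <http://example.org/o> ."
t2 = "<http://example.org/o> <http://example.org/p> <http://example.org/s> ."

# Two different lexical forms that denote the same graph value.
a = graph_value(t1 + "\n" + t2)
b = graph_value(t2 + "\n" + t1)
same_graph = a == b
```

Because the value is a set of triples, two literals whose lexical forms 
list the triples in a different order compare as equal, which is the 
usual behaviour of a datatype with several lexical forms per value.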


> RDF datasets, on the other hand, are already largely implemented. (and 
> so is RDF-star, to some extent :->)

RDF datasets should not be seen as a conflicting alternative to the 
graph-literal datatype. They are compatible and complementary.


--AZ

>>[...]

-- 
Antoine Zimmermann
École des Mines de Saint-Étienne
158 cours Fauriel
CS 62362
42023 Saint-Étienne Cedex 2
France
Tél:+33(0)4 77 42 66 03
http://www.emse.fr/~zimmermann/

Received on Monday, 31 May 2021 07:55:49 UTC