Re: A case for a datatype for identifying RDF graphs from Pierre-Antoine Champin on 2021-06-01 (semantic-web@w3.org from June 2021)

From: Pierre-Antoine Champin <pierre-antoine.champin@ercim.eu>
Date: Tue, 1 Jun 2021 17:46:48 +0200
To: Antoine Zimmermann <antoine.zimmermann@emse.fr>, Semantic Web <semantic-web@w3.org>
Message-ID: <9bf95ec0-b543-5e4c-e320-00538df9a5e6@ercim.eu>
On 31/05/2021 09:55, Antoine Zimmermann wrote:

>
>> [...]
>>
>>>
>>>
>>> Long version:
>>> ============
>>>
>>> Datatypes in RDF are used in order to refer to specific values that 
>>> can be processed adequately by computer systems, as opposed to 
>>> entities from the real world that can merely be referred to in such 
>>> systems without access to the real thing. For instance, if one wants 
>>> to refer to the integer "one", it is much better to use 
>>> "1"^^xsd:integer, that any system understands as the number "one" -- 
>>> less than "2"^^xsd:integer and more than "0"^^xsd:integer -- rather 
>>> than using an identifier like 
>>> http://km.aifb.kit.edu/projects/numbers/n1 that may refer to a real 
>>> world entity of which the system has no understanding.
>>>
>>> If one wants to refer to a *specific* instant in time, it is 
>>> preferable to identify it using xsd:dateTime rather than minting a 
>>> URI for the instant and relating it to its year, month, day, etc.
>>>
>>> If one wants to refer to a *specific* geometry on the Earth, it is 
>>> better to use a geo:wktLiteral rather than introducing a URI and a 
>>> vocabulary that would define the geometry only partially and with
>>> more complexity.
>>
>> Theoretically, literals are cleaner, granted.
>
> Theory should tell you what happens in practice.
I would rather say "what CAN happen" in practice. See more below.
> If practice does not match theory, then you have a bad theory.
Agreed, of course.
> "In theory, if you use this amount of fuel with a rocket of this 
> shape, launch it with this angle at this time of the year from this 
> place, you should be able to go to the moon. But in practice, everyone 
> knows that it is impossible to go to the moon."
>
>>
>> In practice, I see this more as a trade-off, and the best answer 
>> depends on your use-cases. For dates, it is sometimes useful to be 
>> able to access their "parts" by simply following an arc.
>
> I don't agree that it is simpler to locate a node that stands for a 
> date, find the relevant arcs and get a value for a component of a 
> date. It's much easier to have a node that's self-described as an 
> xsd:date or xsd:dateTime and use standard datetime-functions to 
> extract components.
>
> The "follow the arc" solution requires graph matching, which is costly.
>
> Nodes and arcs to describe a date are useful if you are uncertain 
> about some components of a date. For instance, birthdates of 
> historical figures of the middle ages are often unknown but 
> identifiable within a certain range.

Note that I didn't write "simpler" or "cheaper", I wrote "sometimes 
useful" ;-) So we seem to agree after all.

Note also that as soon as *some* of your dates are incomplete, dealing 
with two different kinds of representations (literals for complete 
dates, blank nodes with properties for incomplete dates) may become 
complicated, and may justify using the second kind even for complete 
dates.
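
To make that complication concrete, here is a minimal sketch. It assumes 
a hypothetical in-memory model (a tuple for a typed literal, a dict for 
a node described by its parts); neither shape comes from an actual RDF 
library, and the `year_of` helper is made up for illustration.

```python
from datetime import date

XSD_DATE = "http://www.w3.org/2001/XMLSchema#date"

# Hypothetical shapes, for illustration only:
#   complete date   -> (lexical form, datatype IRI)
#   incomplete date -> dict holding whichever parts are known


def year_of(term):
    """Return the year of a date given in either representation."""
    if isinstance(term, tuple):
        # Typed literal: parse the lexical form with a standard function.
        lexical, datatype = term
        if datatype != XSD_DATE:
            raise ValueError(f"unsupported datatype: {datatype}")
        return date.fromisoformat(lexical).year
    # Node with properties: "follow the arc"; the part may be missing.
    return term.get("year")


complete = ("2021-06-01", XSD_DATE)
incomplete = {"year": 1214}  # month and day unknown

years = [year_of(complete), year_of(incomplete)]
```

Every consumer of the data needs a branch like the one in `year_of`, 
which is the cost I have in mind when both representations coexist.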

But we are getting carried away from the original topic.

>
>> (this is where N3 shines, by the way, but that's another topic)
>>
>>>
>>> If one wants to refer to a specific RDF graph, that could be 
>>> processed based on the datatype that identifies it as such, it would 
>>> be more efficient and robust to use a literal rather than a URI that 
>>> supposedly identifies it.
>>>
>>> I see many discussions that attempt to address the problem of 
>>> identifying specific RDF graphs (or specific triples) using 
>>> extensions of RDF, such as named graphs, or RDF-star, when all we 
>>> need is a datatype that is fully inside the scope of RDF syntax and 
>>> semantics.
>>
>> Again, theoretically you are right; however... the specifics of that 
>> datatype would still have to be built into RDF implementations...
>
> Again, there is no theory vs practice. You take the specification of a 
> graph-literal datatype, implement it within your RDF API, and then you 
> have a way to refer to RDF graphs, *in practice*.

In practice, no such implementation exists (AFAIK). So this is mere 
theory for the moment.

By saying that, I do not mean to dismiss the proposal as being 
unrealistic or uninteresting -- again, I *like* the proposal, and I do 
think it is worth implementing it!

But in the meantime, I expect people to (ab)use other tools that are 
immediately available (such as named graphs), even if not standard (such 
as RDF*), to solve the kind of problems that graph literals are meant to 
solve.
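
For what it is worth, the core idea of such a datatype can be sketched 
in a few lines. This is only an illustration under loud assumptions: the 
datatype IRI is hypothetical, the lexical form is a simplified 
N-Triples subset (IRIs only, no literals or blank nodes), and 
`parse_graph_literal` is a made-up helper, not part of any existing RDF 
API.

```python
# Hypothetical datatype IRI, for illustration only.
GRAPH_DATATYPE = "http://example.org/datatypes#graphLiteral"


def parse_graph_literal(lexical):
    """Map a lexical form to its value: a set of triples.

    Simplification: one triple per line, terms are IRIs separated by
    whitespace; literals and blank nodes are not handled.
    """
    triples = set()
    for line in lexical.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if not line.endswith("."):
            raise ValueError(f"missing final '.': {line!r}")
        terms = line[:-1].split()
        if len(terms) != 3:
            raise ValueError(f"expected 3 terms: {line!r}")
        triples.add(tuple(terms))
    return frozenset(triples)


# Two different lexical forms (order and spacing differ)...
a = (
    "<http://example.org/s> <http://example.org/p> <http://example.org/o> .\n"
    "<http://example.org/s> <http://example.org/q> <http://example.org/o> .\n"
)
b = (
    "<http://example.org/s>   <http://example.org/q> <http://example.org/o> .\n"
    "<http://example.org/s> <http://example.org/p>   <http://example.org/o> .\n"
)

# ...that map to the same value, i.e. identify the same graph.
same_graph = parse_graph_literal(a) == parse_graph_literal(b)
```

The point of the sketch is the lexical-to-value mapping: equality is 
decided on the set of triples, not on the string. A real implementation 
would also have to deal with literals and blank nodes, which is where 
most of the work lies.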

> Such a practical implementation conforms to the RDF specification 
> as is.
>
>
>> RDF datasets, on the other hand, are already largely implemented. 
>> (and so is RDF-star, to some extent :->)
>
> RDF datasets should not be seen as a conflicting alternative to the 
> graph-literal datatype. They are compatible and complementary.

I agree. I hope my response above clarifies what I meant here.

   pa

>
>
> --AZ
>
>>> [...]
>
Received on Tuesday, 1 June 2021 15:47:14 UTC