
Re: Blank nodes must DIE! [ was Re: Blank nodes semantics - existential variables?]

From: Antoine Zimmermann <antoine.zimmermann@emse.fr>
Date: Tue, 21 Jul 2020 14:35:02 +0200
To: semantic-web@w3.org
Cc: Maxime Lefrançois <maxime.lefrancois@emse.fr>
Message-ID: <729fd5cd-6a0b-b907-942e-e501f92956e0@emse.fr>
Regarding physical quantities such as "5 inches", my colleague
Maxime Lefrançois and I coauthored a specification for a datatype
for physical quantities [1]. It is quite simple: we reuse the Unified
Code for Units of Measure (UCUM), a standard that is used in many
scientific applications, and combine a UCUM unit code with a number:

<QUANTITY> ::= <NUMBER> <SPACES> <UCUMCODE>
<NUMBER> ::= xsd:decimal(('e'|'E')xsd:integer)?
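
As a rough illustration of this grammar, the Python sketch below splits a
lexical form into its number and its unit code. It is not the parser from
the specification [1] or from the Jena fork: the regular expression, the
name parse_quantity, the reading of <SPACES> as one or more space
characters, and the treatment of <UCUMCODE> as any run of non-space
characters (with no UCUM validation) are all assumptions made for the
example.

```python
import re
from decimal import Decimal

# Assumed reading of the grammar:
#   <NUMBER>   = xsd:decimal, optionally followed by 'e'/'E' and an xsd:integer
#   <SPACES>   = one or more space characters (assumption)
#   <UCUMCODE> = any run of non-space characters; NOT validated against UCUM
QUANTITY_RE = re.compile(
    r"(?P<mantissa>[+-]?(?:[0-9]+(?:\.[0-9]*)?|\.[0-9]+))"
    r"(?:[eE](?P<exponent>[+-]?[0-9]+))?"
    r" +"
    r"(?P<unit>\S+)"
)

def parse_quantity(lexical):
    """Split a lexical form such as '2e11 mm' into (Decimal value, unit code)."""
    match = QUANTITY_RE.fullmatch(lexical)
    if match is None:
        raise ValueError(f"not a valid quantity literal: {lexical!r}")
    number = match.group("mantissa")
    if match.group("exponent") is not None:
        number += "e" + match.group("exponent")
    # The Decimal constructor converts the string exactly, without rounding.
    return Decimal(number), match.group("unit")

print(parse_quantity("2e11 mm"))
print(parse_quantity("2.7e3 m3/[acr_us]"))
```

Decimal is used instead of float so that the numeric part keeps the exact
value written in the literal.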

Since UCUM has well-defined semantics, so does our datatype. Better
still, since UCUM is implemented in many programming languages, my
colleague Maxime could easily integrate it into Jena and its SPARQL
engine [2].

So, with our Jena fork, one can write:

SELECT ?planet WHERE {
   ?planet a ex:Planet;
     ex:diameter ?s .
   FILTER(?s > "2e11 mm"^^cdt:ucum)
}

This works as long as the diameter of the planet is encoded as a
cdt:ucum literal, no matter which unit of length is used. One can even
use "link for Gunter's chain" (unit "[lk_us]") or "cubic meters per
acre" (unit "m3/[acr_us]") [3]. Both are units of length: a volume
divided by an area leaves a length.
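
To make the unit-independent comparison concrete, here is a minimal Python
sketch of the idea, not of the Jena implementation: each length is scaled
to metres before it is compared. The table TO_METRES covers only a handful
of UCUM codes picked for the example, and the helper names and the sample
diameters are illustrative assumptions.

```python
from fractions import Fraction

# Hand-picked subset of UCUM length codes with exact factors to metres.
# A real implementation would rely on a full UCUM library instead.
TO_METRES = {
    "mm": Fraction(1, 1000),
    "m": Fraction(1),
    "km": Fraction(1000),
    "[in_i]": Fraction(254, 10000),  # international inch = 2.54 cm
}

def to_metres(value, unit):
    """Scale a numeric string such as '2e11' in the given unit to metres."""
    return Fraction(value) * TO_METRES[unit]

def greater_than(a, b):
    """Compare two (value, unit) lengths regardless of the units used."""
    return to_metres(*a) > to_metres(*b)

threshold = ("2e11", "mm")  # the value used in the FILTER above
print(greater_than(("1.4e6", "km"), threshold))  # hypothetical diameter: True
print(greater_than(("1.2e7", "m"), threshold))   # hypothetical diameter: False
```

Fractions keep the conversion exact, so to_metres("1", "[in_i]") and
to_metres("25.4", "mm") compare as equal.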

With some of our industrial partners, we are using this for energy data, 
and they seem to be very pleased with this approach, compared to an 
ontology-based approach.


[1] https://w3id.org/lindt/custom_datatypes#ucum
[2] You can try it at https://ci.mines-stetienne.fr/lindt/playground.html
[3] Try this query in the playground:

"""
PREFIX iter: <http://w3id.org/sparql-generate/iter/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX cdt: <http://w3id.org/lindt/custom_datatypes#>
PREFIX ex: <http://example.org/>

SELECT ?position ?normalized

WHERE{

   VALUES ?position { "2.7e3 m3/[acr_us]"^^cdt:ucum }
   # convert to meters
   BIND("0 m"^^cdt:ucum + ?position AS ?normalized )

}
"""
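
For readers without access to the playground, the arithmetic behind this
conversion can be sketched in Python. The unit definitions below (US
survey foot = 1200/3937 m, rod = 16.5 survey feet, acre = 160 square rods)
are the usual definitions of the US survey units and should be checked
against the UCUM tables; the code is an illustrative sketch, not the
output of the Jena fork.

```python
from fractions import Fraction

# Assumed definitions of the US survey units, expressed in metres.
FT_US = Fraction(1200, 3937)     # US survey foot, in m
RD_US = Fraction(33, 2) * FT_US  # rod = 16.5 survey feet, in m
ACR_US = 160 * RD_US ** 2        # acre = 160 square rods, in m^2

def m3_per_acre_to_metres(value):
    """Convert a value in m3/[acr_us] to metres (m^3 / m^2 leaves m)."""
    return Fraction(value) / ACR_US

normalized = m3_per_acre_to_metres("2.7e3")
print(float(ACR_US))      # about 4046.87 square metres per acre
print(float(normalized))  # roughly 0.667 m
```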

--AZ

On 17/07/2020 at 01:57, Cox, Simon (L&W, Clayton) wrote:
> Yeah, the atomicity of the chunk is the point. This even applies to 
> quantities. 25.4mm is *identical* to 1” – they are the same thing. Any 
> engine that operates with quantities needs to understand that. ‘25.4’ 
> and ‘mm’ cannot be separated. Coordinates are slightly more complex but 
> it comes down to the same thing. A single element within a set of 
> coordinates that describes a position in space is not independent of the 
> other numbers in the tuple, or of the coordinate reference system within 
> which they are expressed. One value should *never* be used independent 
> of the others. Exactly the same position on the earth will be denoted by 
> three different numbers if embedded in a different coordinate reference 
> system. You can only ‘reason’ over them as a group, not individually.
> 
> *From:*Dan Brickley <danbri@danbri.org>
> *Sent:* Thursday, 16 July, 2020 23:58
> *To:* Jeen Broekstra <jeen@fastmail.com>
> *Cc:* Semantic Web <semantic-web@w3.org>
> *Subject:* Re: Blank nodes must DIE! [ was Re: Blank nodes semantics - 
> existential variables?]
> 
> …
> 
> I believe the big appeal of putting it all into the zone we call 
> "literals" is that you get a kind of atomicity; that chunk of data is 
> either there, or not there; it is asserted, or not asserted. With a 
> triples-based (description of a) data structure you have to be 
> constantly on your guard that every subset of the full graph pattern is 
> at least sensible and harmless, even when subsetting these chunks is 
> often confusing or misleading for data consumers. I can't help wondering 
> whether notions of graph shapes from SHACL, ShEx (and SPARQL) could be 
> exploited to create an RDF-based data format which had atomicity at the 
> level of entire shapes.
> 
> Dan
> 
>     Jeen
> 

-- 
Antoine Zimmermann
Institut Henri Fayol
École des Mines de Saint-Étienne
158 cours Fauriel
CS 62362
42023 Saint-Étienne Cedex 2
France
Tél:+33(0)4 77 42 66 03
Fax:+33(0)4 77 42 66 66
http://www.emse.fr/~zimmermann/
Member of team Connected Intelligence, Laboratoire Hubert Curien
Received on Tuesday, 21 July 2020 12:35:23 UTC
