- From: Dan Brickley <danbri@danbri.org>
- Date: Thu, 23 Jul 2020 20:02:59 +0100
- To: Patrick J Hayes <phayes@ihmc.us>
- Cc: Antoine Zimmermann <antoine.zimmermann@emse.fr>, Semantic Web <semantic-web@w3.org>, Maxime Lefrançois <maxime.lefrancois@emse.fr>
- Message-ID: <CAFfrAFqgq7JxxwzEhYoMV70haRznXkjLBiOwhQUjwGJ0S0vsug@mail.gmail.com>
On Thu, 23 Jul 2020 at 19:50, Patrick J Hayes <phayes@ihmc.us> wrote:

> Excellent. I have thought for some time that this way of using datatyping
> would be the right way to go. Congratulations on having actually done it :-)

This is really interesting. Every couple of years I stumble across UCUM
(http://unitsofmeasure.org/trac ->
http://unitsofmeasure.org/trac/wiki/TermsOfUse) before being scared away by
the prickly terms of use document. It is not a document that seems to
welcome re-use.

Dan

> Pat
>
> > On Jul 21, 2020, at 7:35 AM, Antoine Zimmermann
> > <antoine.zimmermann@emse.fr> wrote:
> >
> > Regarding physical quantities, such as "5 inches", etc., my colleague
> > Maxime Lefrançois and myself coauthored a specification for a datatype
> > for physical quantities [1]. It is quite simple: we reuse the Unified
> > Code for Units of Measurement (UCUM), a standard that is used in many
> > scientific applications, and combine it with a number:
> >
> >   <QUANTITY> ::= <NUMBER> <SPACES> <UCUMCODE>
> >   <NUMBER> ::= xsd:decimal(('e'|'E')xsd:integer)?
> >
> > Since UCUM has a well defined semantics, so does our datatype. Better,
> > since UCUM is implemented in many programming languages, my colleague
> > Maxime could easily integrate it into Jena and its SPARQL engine [2].
> >
> > So, with our Jena fork, one can write:
> >
> >   SELECT ?planet WHERE {
> >     ?planet a ex:Planet;
> >       ex:diameter ?s .
> >     FILTER(?s > "2e11 mm"^^cdt:ucum)
> >   }
> >
> > This works if the size of the planet is encoded as a cdt:ucum, no matter
> > what unit one is using. One can even use "link for Gunter's chain" (unit
> > "[lk_us]"), or "cubic meters per acre" (unit "m3/[acr_us]") [3], which
> > are both units of length.
> >
> > With some of our industrial partners, we are using this for energy data,
> > and they seem to be very pleased with this approach, compared to an
> > ontology-based approach.
> >
> > [1] https://w3id.org/lindt/custom_datatypes#ucum
> > [2] You can try it at
> >     https://ci.mines-stetienne.fr/lindt/playground.html
> > [3] Try this query in the playground:
> >
> > """
> > PREFIX iter: <http://w3id.org/sparql-generate/iter/>
> > PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
> > PREFIX cdt: <http://w3id.org/lindt/custom_datatypes#>
> > PREFIX ex: <http://example.org/>
> >
> > SELECT ?position ?normalized
> >
> > WHERE{
> >
> >   VALUES ?position { "2.7e3 m3/[acr_us]"^^cdt:ucum }
> >   # convert to meters
> >   BIND("0 m"^^cdt:ucum + ?position AS ?normalized )
> >
> > }
> > """
> >
> > --AZ
> >
> > On 17/07/2020 at 01:57, Cox, Simon (L&W, Clayton) wrote:
> >> Yeah, the atomicity of the chunk is the point. This even applies to
> >> quantities. 25.4mm is *identical* to 1” – they are the same thing. Any
> >> engine that operates with quantities needs to understand that. ‘25.4’
> >> and ‘mm’ cannot be separated. Coordinates are slightly more complex but
> >> it comes down to the same thing. A single element within a set of
> >> coordinates that describes a position in space is not independent of
> >> the other numbers in the tuple, or of the coordinate reference system
> >> within which they are expressed. One value should *never* be used
> >> independent of the others. Exactly the same position on the earth will
> >> be denoted by three different numbers if embedded in a different
> >> coordinate reference system. You can only ‘reason’ over them as a
> >> group, not individually.
> >>
> >> *From:* Dan Brickley <danbri@danbri.org>
> >> *Sent:* Thursday, 16 July, 2020 23:58
> >> *To:* Jeen Broekstra <jeen@fastmail.com>
> >> *Cc:* Semantic Web <semantic-web@w3.org>
> >> *Subject:* Re: Blank nodes must DIE! [ was Re: Blank nodes semantics -
> >> existential variables?]
> >> …
> >> I believe the big appeal of putting it all into the zone we call
> >> "literals" is that you get a kind of atomicity; that chunk of data is
> >> either there, or not there; it is asserted, or not asserted.
> >> With a triples-based (description of a) data structure you have to be
> >> constantly on your guard that every subset of the full graph pattern is
> >> at least sensible and harmless, even when subsetting these chunks is
> >> often confusing or misleading for data consumers. I can't help
> >> wondering whether notions of graph shapes from shacl, shex (and sparql)
> >> could be exploited to create an RDF-based data format which had
> >> atomicity at the level of entire shapes.
> >> Dan
> >> Jeen
> >
> > --
> > Antoine Zimmermann
> > Institut Henri Fayol
> > École des Mines de Saint-Étienne
> > 158 cours Fauriel
> > CS 62362
> > 42023 Saint-Étienne Cedex 2
> > France
> > Tél:+33(0)4 77 42 66 03
> > Fax:+33(0)4 77 42 66 66
> > http://www.emse.fr/~zimmermann/
> > Member of team Connected Intelligence, Laboratoire Hubert Curien
Received on Thursday, 23 July 2020 19:03:25 UTC
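
For readers who want to see the idea in the thread spelled out, here is a minimal Python sketch of a quantity literal in the quoted lexical form (<NUMBER> <SPACES> <UCUMCODE>) being normalized so that comparisons do not depend on the unit used. It is not the cdt:ucum implementation and does not use a UCUM library. The names QUANTITY_RE, METRES_PER_UNIT and to_metres are hypothetical, the unit table is hand-written and covers only a few length units, <SPACES> is assumed to mean one or more plain spaces, and the sample values are made up.

```python
import re
from fractions import Fraction

# Lexical form from the quoted grammar: <NUMBER> <SPACES> <UCUMCODE>, where
# <NUMBER> is an xsd:decimal with an optional integer exponent.
# Assumption: <SPACES> is one or more plain space characters.
QUANTITY_RE = re.compile(
    r"(?P<number>[+-]?(?:\d+(?:\.\d*)?|\.\d+)(?:[eE][+-]?\d+)?)"
    r" +"
    r"(?P<unit>\S+)"
)

# Hypothetical hand-written table: metres per unit, for a few length units.
# A real implementation would delegate this to a full UCUM library.
METRES_PER_UNIT = {
    "m": Fraction(1),
    "mm": Fraction(1, 1000),
    "km": Fraction(1000),
    "[in_i]": Fraction(254, 10000),  # international inch, 25.4 mm
}


def to_metres(lexical: str) -> Fraction:
    """Parse a quantity literal and return its exact value in metres."""
    match = QUANTITY_RE.fullmatch(lexical)
    if match is None:
        raise ValueError(f"not a valid quantity literal: {lexical!r}")
    unit = match.group("unit")
    if unit not in METRES_PER_UNIT:
        raise ValueError(f"unit not in this sketch's table: {unit!r}")
    # Fraction accepts decimal strings with an exponent, so no rounding occurs.
    return Fraction(match.group("number")) * METRES_PER_UNIT[unit]


if __name__ == "__main__":
    # The two literals denote the same length, as in the 25.4 mm / 1 inch remark.
    print(to_metres("25.4 mm") == to_metres("1 [in_i]"))
    # Same shape as the FILTER in the quoted query; "3e5 km" is a made-up value.
    print(to_metres("3e5 km") > to_metres("2e11 mm"))
```

Exact rationals are used instead of floats so that equal quantities written in different units compare as equal rather than almost equal.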