Re: RDF lists/arrays and n-ary relations [was Re: OWL and RDF lists] from Peter F. Patel-Schneider on 2022-10-01 (semantic-web@w3.org from October 2022)

From: Peter F. Patel-Schneider <pfpschneider@gmail.com>
Date: Sat, 1 Oct 2022 09:52:06 -0400
To: Martynas Jusevičius <martynas@atomgraph.com>, semantic-web@w3.org
Message-ID: <e6f6ae00-6f42-2616-7efb-3770eed29dcb@gmail.com>
On 10/1/22 08:57, Martynas Jusevičius wrote:

> On Sat, Oct 1, 2022 at 2:28 PM Nicolas Chauvat
> <nicolas.chauvat@logilab.fr> wrote:
>
>> Would you treat my other example, the complex number, in the same way and use
>> two predicates realPart and imaginaryPart ?
>>
>> What would you think of something like this ?
>>
>>    :me :positionInPlan9 (2, -3)^:complexPosition
>>    :marty :driveTo ("1985-10-26", "09:00", "EST")^:timeDateTZ
>>
> :me :positionInPlan9 [ a :complexPosition; :first 2, :second -3 ].
> :marty :driveTo [ :date "1985-10-26", :time "09:00", :tz "EST" ] .
>
> Same thing, and no need for new standards.



Along these lines, I wrote a long message describing several ways of 
representing information in RDF graphs.  This response nicely illustrates one 
of the points I am trying to make so I'm using it as, perhaps, a TL;DR for my 
message.



RDF graphs have two ways of representing information about the world.

They have literals, whose meaning comes from a formal or informal description 
of their datatype IRI.  Some datatypes have their descriptions determined from 
information in the RDF recommendations.  The descriptions of other datatypes 
are determined by means outside the RDF recommendations.

They have triples, whose meaning comes from the RDF recommendations.  The 
intended meaning of triples is derived from the intended meaning of their 
subjects, predicates, and objects. For non-literal subjects, predicates, and 
objects this intended meaning is left unspecified.

For important IRI nodes it is good practice to provide documents, accessible 
using the IRI, that give the intended meaning of the node, including both RDF 
triples that provide part of the intended meaning and text that describes the 
intended meaning.  (These documents can often be considered to 
define the IRI but they really don't, as abiding by the information in the 
document is not mandatory.)  This can be done for all IRI nodes, including 
datatype IRIs.

In this way systems that use RDF can take an RDF graph and, if they choose, 
merge other triples into the graph to include part of the intended meaning of 
IRI nodes in the graph.  Humans can use these documents to help them 
understand the intended meaning of IRI nodes.


This leads to two different ways to represent entities in the world and make 
them relatively easy for others to use.

One can use a datatype for the entities, say chess:position, ideally creating 
a document containing information about chess positions represented as 
chess:position literals and making it available at chess:position; and create 
RDF graphs containing literals like "e5"^^chess:position.  The document will 
be rather sparse as far as RDF content goes, probably only containing one triple
  chess:position rdf:type rdfs:Datatype
but including text that describes this datatype, for example stating the 
permissible literal values.
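
A minimal sketch of the datatype route in Turtle might look like the 
following.  The namespace IRIs, the resource game:move1, and the property 
game:movedTo are invented for illustration; only chess:position comes from 
the discussion here.

  @prefix chess: <http://example.org/chess#> .
  @prefix game:  <http://example.org/game#> .
  @prefix rdf:   <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
  @prefix rdfs:  <http://www.w3.org/2000/01/rdf-schema#> .

  # The RDF content of the document served at chess:position.
  # The permitted lexical forms ("a1" through "h8", say) would be
  # described in accompanying text, not in triples.
  chess:position rdf:type rdfs:Datatype .

  # A graph that uses the datatype (hypothetical property and subject).
  game:move1 game:movedTo "e5"^^chess:position .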

One can use a class for the entities, say chess:Position, ideally creating a 
document that provides information about chess positions represented as 
instances of chess:Position and making this document available at 
chess:Position; and create RDF graphs stating membership in 
chess:Position.  The document will likely have more than one triple, probably 
containing at least

  chess:Position rdf:type rdfs:Class .
  chess:rank rdfs:domain chess:Position .
  chess:rank rdfs:range xsd:integer .
  chess:file rdfs:domain chess:Position .
  chess:file rdfs:range xsd:string .

The advantage of the datatype route is that it provides a notion of equality 
that cannot be represented in RDF for the class route.  The advantage of the 
latter is that it provides more flexibility, being able to represent chess 
positions whose precise location is not known.  The latter route also makes it 
possible to add more information about chess positions, integrated with the 
chess:Position class.  A disadvantage of the datatype route is that a 
particular RDF system might not implement the chess:position datatype.
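
To make the tradeoff concrete, here is a small Turtle sketch.  The namespace 
IRIs, the game: resources, and the property game:movedTo are made up for the 
example; only chess:position, chess:Position, chess:rank, and chess:file come 
from the text above.

  @prefix chess: <http://example.org/chess#> .
  @prefix game:  <http://example.org/game#> .
  @prefix rdf:   <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .

  # Datatype route: both objects are the same literal, so the two
  # moves share a single node in the graph.
  game:move1 game:movedTo "e5"^^chess:position .
  game:move2 game:movedTo "e5"^^chess:position .

  # Class route: two distinct blank nodes with the same rank and file.
  # Nothing in RDF or RDFS says that they are the same position.
  game:move3 game:movedTo [ rdf:type chess:Position ;
                            chess:rank 5 ; chess:file "e" ] .
  game:move4 game:movedTo [ rdf:type chess:Position ;
                            chess:rank 5 ; chess:file "e" ] .

  # Class route: a position whose rank is not known can still be
  # described, which a single "e5"-style literal cannot do.
  game:move5 game:movedTo [ rdf:type chess:Position ;
                            chess:file "e" ] .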

It is actually possible to have both and connect them via the triple

  chess:Position rdfs:subClassOf chess:position .

but I don't think that this produces much in the way of benefits in any 
existing RDF system.


What does not work as well in general is using containers or collections to 
represent entities that are not containers or collections.  Even if subclasses 
and subproperties are used to provide special containers or collections the 
facilities in RDF are inadequate to describe how the representation works, 
resulting in intended meaning documents that have little or no RDF content and 
provide little benefit in RDF systems.

What works least well is using containers or collections without subclasses 
and subproperties.  Just thinking about (or, more likely, only discussing in a 
small group) representing something using collections does not actually 
represent that thing using collections, at least as far as the RDF community 
as a whole is concerned.  To have uptake of a representation scheme requires 
using agreed-on methods, and for RDF this means using new vocabulary and 
creating defining documents.

peter


As yet another aside, thinking back on the development of OWL, it might have 
been better to use neither containers nor collections to represent OWL syntax 
in triples, instead defining classes and properties for the various bits of 
OWL syntax.  I'm not sure whether this was even considered back then.
Received on Saturday, 1 October 2022 13:52:22 UTC