RE: Proposed RDF FHIR syntax feedback from Anthony Mallia on 2015-03-08 (public-semweb-lifesci@w3.org from March 2015)

From: Anthony Mallia <amallia@edmondsci.com>
Date: Sun, 8 Mar 2015 14:00:48 +0000
To: David Booth <david@dbooth.org>, "public-semweb-lifesci@w3.org" <public-semweb-lifesci@w3.org>
CC: HL7 ITS <its@lists.hl7.org>, w3c semweb HCLS <public-semweb-lifesci@w3.org>
Message-ID: <D5F9B7889182464788941B4EEDE3E81FFD4AA22E@Awacs.esci.com>
David,

I believe it is important to distinguish between the type level (TBox) and the instance level (ABox). A simple justification is that they come from different authorities (and endpoints): HL7 on the one hand and an EHR system on the other.

However, I would strongly recommend that we DO NOT REDEFINE Ontology from its definition in the W3C specs, as this would cause major confusion.

Here is the extract from OWL2:
"OWL 2 ontologies provide classes, properties, individuals, and data values and are stored as Semantic Web documents. OWL 2 ontologies can be used along with information written in RDF, and OWL 2 ontologies themselves are primarily exchanged as RDF documents."

So I am recommending two subtypes of Ontology:
INSTANCE ONTOLOGY (INSTANCE for short) contains Individuals, their Property assertions and their data values but may refer to contents of MODEL(s)
MODEL ONTOLOGY (MODEL for short) contains Classes, ObjectProperties, DataProperties and Datatypes

INSTANCE and MODEL are disjoint but there can be Ontologies (neither of these subtypes) which combine them through merge or import and would be used for reasoning.
It should not be necessary to separate these two by MIME type - they will be handled quite differently e.g. import statements will know exactly what they are trying to do.
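
To illustrate the split, here is a minimal Python sketch. It is only an assumption about how the two subtypes could be told apart: triples are plain tuples of prefixed names, the fhir: and ex: terms are hypothetical, and a triple is counted as MODEL when its subject is declared as a class, property or datatype.

```python
# Types whose declaration marks a term as belonging to the MODEL (assumed list).
MODEL_TYPES = {"owl:Class", "owl:ObjectProperty", "owl:DatatypeProperty", "rdfs:Datatype"}


def split_model_instance(triples):
    """Partition triples into (MODEL, INSTANCE) by how each subject is declared."""
    model_terms = {s for s, p, o in triples if p == "rdf:type" and o in MODEL_TYPES}
    model = {t for t in triples if t[0] in model_terms}
    instance = set(triples) - model
    return model, instance


# A combined ontology (neither subtype), e.g. the result of a merge or import.
# All fhir: and ex: names below are hypothetical.
combined = {
    ("fhir:Observation", "rdf:type", "owl:Class"),
    ("fhir:Observation", "rdfs:label", '"Observation"'),
    ("fhir:subject", "rdf:type", "owl:ObjectProperty"),
    ("ex:obs1", "rdf:type", "fhir:Observation"),
    ("ex:obs1", "fhir:subject", "ex:patient7"),
}

model, instance = split_model_instance(combined)
```

The two resulting sets are disjoint, the INSTANCE triples still refer to MODEL terms such as fhir:Observation, and their union gives back the combined ontology that a reasoner would use.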

Thanks for bringing up the topic. I don't believe that this renaming contradicts your intentions, and I think it is important to get the semantics of these concepts nailed down early.

Regards,


Tony Mallia
EDMOND SCIENTIFIC COMPANY (ESC)



-----Original Message-----
From: David Booth [mailto:david@dbooth.org] 
Sent: Saturday, March 07, 2015 8:29 PM
To: public-semweb-lifesci@w3.org
Cc: HL7 ITS; w3c semweb HCLS
Subject: Re: Proposed RDF FHIR syntax feedback

There are a few things going on here that I think are causing some confusion.  One is discussion of RDF serializations (syntax).  Another is discussion of ontologies (i.e., data models or TBox) versus instance data (i.e., ABox, or data that is expressed in terms of those data models or ontologies).  A third is discussion of dereferenceable FHIR URIs.  I'll try to help untangle them, but first I'd like to suggest some simple terminology to help reduce confusion in these discussions.

ONTOLOGY:  I suggest we use the word "ontology" when we are talking about the definitions of classes and properties, relationships between them or restrictions on their use, such as cardinality.

INSTANCE DATA: Similarly, I suggest we use the term "instance data" when we are talking about data that is represented *using* those classes and properties.  An example would be specific patient data (such as an observation) that is transmitted in a FHIR payload.

I think this "ontology" versus "instance data" dichotomy will help clarify our discussions.  HOWEVER, there are several circumstances that cause this distinction to be blurred:

  - RDF itself makes no distinction between ontologies and instance data (TBox and ABox) -- it's all just sets of assertions to RDF.  "Triples all the way down."  :)

  - RDF file formats are *not* a reliable indicator of whether a file contains an ontology, instance data or a combination of both.  A .rdf file (RDF/XML) can hold OWL ontology definitions, as can a .ttl (Turtle) file or any other standard RDF serialization.  To add even more confusion, if you're using a tool like Protege, the tool might store everything in .owl files, regardless of whether the data is acting as ontologies or as instance data.  The .owl extension does *not* necessarily mean the file contains an ontology (as defined above).

  - Terms from OWL and RDFS vocabularies can be freely intermingled in an RDF document -- and they typically are, especially when that document acts as an ontology.

  - FHIR profile definitions can be transmitted in a FHIR payload just as patient data can be transmitted.  In that sense a FHIR profile can act like instance data, but in its use -- defining extensions and constraining the content of other FHIR resources -- it acts more like an ontology.
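
As a rough illustration of the points above, the following Python sketch decides what a file holds by looking at its triples rather than its extension. Everything in it is an assumption made for the example: the file names, the fhir: and ex: terms, and the crude rule that a triple is ontology-like when its subject is declared as a class or property.

```python
# Declarations that mark a subject as an ontology term (assumed, not exhaustive).
ONTOLOGY_TYPES = {"owl:Class", "owl:ObjectProperty", "owl:DatatypeProperty"}


def describe_content(triples):
    """Report whether a set of triples acts as ontology, instance data or both."""
    declared = {s for s, p, o in triples if p == "rdf:type" and o in ONTOLOGY_TYPES}
    has_ontology = any(s in declared for s, _, _ in triples)
    has_instance = any(s not in declared for s, _, _ in triples)
    if has_ontology and has_instance:
        return "both"
    if has_ontology:
        return "ontology"
    if has_instance:
        return "instance data"
    return "empty"


ontology_triples = {
    ("fhir:Observation", "rdf:type", "owl:Class"),
    ("fhir:status", "rdf:type", "owl:DatatypeProperty"),
}
instance_triples = {
    ("ex:obs1", "rdf:type", "fhir:Observation"),
    ("ex:obs1", "fhir:status", '"final"'),
}

# Hypothetical files, already parsed into triples; the extension plays no role.
files = {
    "fhir.ttl": ontology_triples,
    "patients.owl": instance_triples,
    "mixed.rdf": ontology_triples | instance_triples,
}

report = {name: describe_content(triples) for name, triples in files.items()}
```

Here the hypothetical patients.owl is reported as instance data despite its extension, and mixed.rdf as both.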

For FHIR, we need to define both a FHIR *ontology* -- a set of classes and properties -- and bi-directional mappings that will convert FHIR *instance* *data* from FHIR XML or FHIR JSON to FHIR RDF and vice versa.

Because RDF is independent of serialization, file formats and serializations are largely irrelevant to our FHIR RDF/ontology effort: we'll be producing a FHIR ontology, using standard RDF, RDFS and OWL vocabularies, and it can be serialized to any standard RDF format.  For this reason, I don't think we should spend much time worrying about what RDF serialization to use for the FHIR ontology.  It's pretty much irrelevant.

However, for FHIR RDF *mappings*, for convenience we may choose to define those mappings in terms of specific FHIR XML, FHIR JSON and/or FHIR RDF serializations.  For example, the Shape Expressions (ShEx) approach that Eric Prud'hommeaux demonstrated transforms FHIR *XML* to *Turtle*.  And in the JSON-LD approach that I'm investigating, the mapping from JSON-LD to RDF will simply be the standard RDF interpretation of the JSON-LD: no additional mapping definition will be required.
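
To give a feel for what "the standard RDF interpretation of the JSON-LD" means, here is a toy Python sketch. It is not a conformant JSON-LD processor: it assumes a flat document with string values only, and the context, IRIs and property names are all hypothetical rather than taken from any actual FHIR JSON-LD proposal.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def expand(term, context):
    """Resolve a term through the context, then expand a prefix:suffix compact IRI."""
    value = context.get(term, term)
    prefix, sep, suffix = value.partition(":")
    if sep and prefix in context and not suffix.startswith("//"):
        return context[prefix] + suffix
    return value


def to_triples(doc):
    """Read a flat JSON-LD-style document as a list of (subject, predicate, object)."""
    context = doc.get("@context", {})
    subject = doc["@id"]
    triples = []
    for key, value in doc.items():
        if key in ("@context", "@id"):
            continue
        if key == "@type":
            triples.append((subject, RDF_TYPE, expand(value, context)))
        else:
            triples.append((subject, expand(key, context), value))
    return triples


# Hypothetical payload: the context alone carries the mapping to IRIs.
doc = {
    "@context": {
        "fhir": "http://example.org/fhir/",
        "status": "fhir:Observation.status",
    },
    "@id": "http://example.org/obs/1",
    "@type": "fhir:Observation",
    "status": "final",
}

triples = to_triples(doc)
```

The point of the sketch is that once the context is fixed, the triples follow mechanically from the JSON, so no separate mapping definition has to be written.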

In summary: (a) RDF serializations can hold a mixture of RDF, RDFS and/or OWL -- and they often do; and (b) the serialization format is independent of whether the document contains an ontology or instance data or both.

David Booth
Received on Sunday, 8 March 2015 13:59:57 UTC