Re: Proposed RDF FHIR syntax feedback from David Booth on 2015-03-08 (public-semweb-lifesci@w3.org from March 2015)

From: David Booth <david@dbooth.org>
Date: Sat, 07 Mar 2015 20:24:34 -0500
CC: HL7 ITS <its@lists.hl7.org>, w3c semweb HCLS <public-semweb-lifesci@w3.org>
Message-ID: <54FBA4D2.5020609@dbooth.org>

Three separate topics are being discussed at once here, and I think that
is causing some confusion.  One is RDF serializations (syntax).  Another
is ontologies (i.e., data models or TBox) versus instance data (i.e.,
ABox, or data that is expressed in terms of those data models or
ontologies).  A third is dereferenceable FHIR URIs.  I'll try to help
untangle them, but first I'd like to suggest some simple terminology to
help reduce confusion in these discussions.

ONTOLOGY:  I suggest we use the word "ontology" when we are talking
about the definitions of classes and properties, the relationships
between them, or restrictions on their use, such as cardinality.

INSTANCE DATA: Similarly, I suggest we use the term "instance data" when 
we are talking about data that is represented *using* those classes and 
properties.   An example would be specific patient data (such as an 
observation) that is transmitted in a FHIR payload.

I think this "ontology" versus "instance data" dichotomy will help 
clarify our discussions.  HOWEVER, there are several circumstances that 
cause this distinction to be blurred:

  - RDF itself makes no distinction between ontologies and instance data
(TBox and ABox) -- to RDF, it is all just a set of assertions (triples).
"Triples all the way down."   :)

  - RDF file formats are *not* a reliable indicator of whether a file
contains an ontology, instance data or a combination of both.  A .rdf
file (RDF/XML) can hold OWL ontology definitions, instance data or both,
as can a .ttl (Turtle) file or any other standard RDF serialization.  To
add even more confusion, if you're using a tool like Protégé, the tool
might store everything in .owl files, regardless of whether the data is
acting as ontologies or as instance data.  The .owl extension does *not*
necessarily mean the file contains an ontology (as defined above).

  - Terms from OWL and RDFS vocabularies can be freely intermingled in 
an RDF document -- and they typically are, especially when that document 
acts as an ontology.

  - FHIR profile definitions can be transmitted in a FHIR payload just 
as patient data can be transmitted.  In that sense a FHIR profile can 
act like instance data, but in its use -- defining extensions and 
constraining the content of other FHIR resources -- it acts more like an 
ontology.
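
To make the "triples all the way down" point concrete, here is a minimal
sketch in Python.  It assumes a made-up namespace (http://example.org/fhir#)
and made-up terms; none of these are official FHIR RDF URIs.  The graph is
one flat set of triples, and the ontology / instance data split only
appears when we apply our own heuristic on top of it.

```python
# Standard W3C vocabulary namespaces.
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
OWL = "http://www.w3.org/2002/07/owl#"
# Hypothetical namespace, for illustration only (not an official FHIR URI).
EX = "http://example.org/fhir#"

# One graph: a plain set of (subject, predicate, object) triples.
graph = {
    # Statements that act as "ontology": class and property definitions.
    (EX + "Observation", RDF + "type", OWL + "Class"),
    (EX + "subject", RDF + "type", OWL + "ObjectProperty"),
    (EX + "subject", RDFS + "domain", EX + "Observation"),
    # Statements that act as "instance data": data using those terms.
    (EX + "obs1", RDF + "type", EX + "Observation"),
    (EX + "obs1", EX + "subject", EX + "patient42"),
}

SCHEMA_TYPES = {
    OWL + "Class", OWL + "ObjectProperty", OWL + "DatatypeProperty",
    RDFS + "Class", RDF + "Property",
}
SCHEMA_PREDICATES = {
    RDFS + "domain", RDFS + "range",
    RDFS + "subClassOf", RDFS + "subPropertyOf",
}

def looks_like_ontology(triple):
    """Heuristic of our own: RDF attaches no such label to a triple."""
    s, p, o = triple
    return p in SCHEMA_PREDICATES or (p == RDF + "type" and o in SCHEMA_TYPES)

ontology_part = {t for t in graph if looks_like_ontology(t)}
instance_part = graph - ontology_part
```

Note that OWL and RDFS terms sit side by side in the same set, and
nothing in the data structure itself marks which triples are which.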

For FHIR, we need to define both a FHIR *ontology* -- a set of classes 
and properties -- and bi-directional mappings that will convert FHIR 
*instance* *data* from FHIR XML or FHIR JSON to FHIR RDF and vice versa.
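
As a rough illustration of what "bi-directional" means here, the sketch
below round-trips a toy, flat resource between a JSON-like dict and a set
of triples.  This is *not* a proposed FHIR RDF mapping: the namespace, the
subject URI pattern and the one-property-per-key rule are all assumptions
made up for the example, and it only handles string-valued fields.

```python
# Hypothetical namespace and URI pattern, for illustration only.
EX = "http://example.org/fhir#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

def json_to_triples(resource):
    """Map a flat dict with 'resourceType' and 'id' to a set of triples."""
    subject = EX + resource["resourceType"] + "/" + resource["id"]
    triples = {(subject, RDF_TYPE, EX + resource["resourceType"])}
    for key, value in resource.items():
        if key not in ("resourceType", "id"):
            # Every other key becomes a property with a string value.
            triples.add((subject, EX + key, value))
    return triples

def triples_to_json(triples):
    """Inverse of json_to_triples for a single toy resource."""
    resource = {}
    for subject, predicate, obj in triples:
        if predicate == RDF_TYPE:
            resource["resourceType"] = obj[len(EX):]
            resource["id"] = subject.rsplit("/", 1)[1]
        else:
            resource[predicate[len(EX):]] = obj
    return resource

resource = {"resourceType": "Observation", "id": "obs1", "status": "final"}
round_tripped = triples_to_json(json_to_triples(resource))
```

The property we would want from any real mapping is the same one this toy
has: converting to RDF and back yields the original instance data.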

Because RDF is independent of serialization, file formats and 
serializations are largely irrelevant to our FHIR RDF/ontology effort: 
we'll be producing a FHIR ontology, using standard RDF, RDFS and OWL 
vocabularies, and it can be serialized to any standard RDF format.  For
this reason, I don't think we should spend much time worrying about
which RDF serialization to use for the FHIR ontology.
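
A small sketch of why the serialization does not matter: the graph is the
set of triples, and any lossless encoding gives back the same set.  The
first encoding below is an IRI-only subset of N-Triples; the second is a
hypothetical grouped JSON layout invented for this example (it is not a
standard RDF serialization).  The example URIs are also made up.  In
practice an RDF library would do this for Turtle, RDF/XML, JSON-LD, etc.

```python
import json

def to_ntriples(triples):
    # IRI-only subset of N-Triples: one "<s> <p> <o> ." statement per line.
    return "".join(f"<{s}> <{p}> <{o}> .\n" for s, p, o in sorted(triples))

def from_ntriples(text):
    triples = set()
    for line in text.splitlines():
        line = line.strip()
        if line:
            # Drop the trailing " ." and split into the three IRI terms.
            s, p, o = line.rstrip(" .").split(" ")
            triples.add((s[1:-1], p[1:-1], o[1:-1]))  # remove < and >
    return triples

def to_grouped_json(triples):
    # Hypothetical layout: {subject: {predicate: [objects]}}.
    grouped = {}
    for s, p, o in triples:
        grouped.setdefault(s, {}).setdefault(p, []).append(o)
    return json.dumps(grouped)

def from_grouped_json(text):
    return {(s, p, o)
            for s, preds in json.loads(text).items()
            for p, objs in preds.items()
            for o in objs}

# Made-up example URIs.
graph = {
    ("http://example.org/fhir#obs1",
     "http://www.w3.org/1999/02/22-rdf-syntax-ns#type",
     "http://example.org/fhir#Observation"),
    ("http://example.org/fhir#obs1",
     "http://example.org/fhir#subject",
     "http://example.org/fhir#patient42"),
}

via_ntriples = from_ntriples(to_ntriples(graph))
via_json = from_grouped_json(to_grouped_json(graph))
same_graph = via_ntriples == via_json == graph
```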

However, for FHIR RDF *mappings*, for convenience we may choose to 
define those mappings in terms of specific FHIR XML, FHIR JSON and/or 
FHIR RDF serializations.  For example, the Shape Expressions (ShEx) 
approach that Eric Prud'hommeaux demonstrated transforms FHIR *XML* to 
*Turtle*.  And in the JSON-LD approach that I'm investigating, the 
mapping from JSON-LD to RDF will simply be the standard RDF 
interpretation of the JSON-LD: no additional mapping definition will be 
required.

In summary: (a) RDF serializations can hold a mixture of RDF, RDFS
and/or OWL terms -- and they often do; and (b) the serialization format
is independent of whether the document contains an ontology or instance
data or both.

David Booth
Received on Sunday, 8 March 2015 01:25:09 UTC