Re: Proposed RDF FHIR syntax feedback from Pat Hayes on 2015-03-08 (public-semweb-lifesci@w3.org from March 2015)

From: Pat Hayes <phayes@ihmc.us>
Date: Sun, 8 Mar 2015 13:31:20 -0500
To: Anthony Mallia <amallia@edmondsci.com>
Cc: David Booth <david@dbooth.org>, "public-semweb-lifesci@w3.org" <public-semweb-lifesci@w3.org>, HL7 ITS <its@lists.hl7.org>
Message-Id: <D759AEB2-4DF7-4C7F-8743-EE57C9A9FD11@ihmc.us>
Comments in-line:

On Mar 8, 2015, at 9:00 AM, Anthony Mallia <amallia@edmondsci.com> wrote:

> David,
> 
> I believe it is important to distinguish between the type level (TBox) and the instance level (ABox). A simple justification is that they come from different authorities (and endpoints): HL7 or an EHR system.

If there are other reasons to distinguish them, please list as many as you can. If this is the only reason, I would strongly suggest that it is not sufficient to justify introducing this rigid distinction into the foundation. It would be better to provide a mechanism that allows the kind of originating authority to be specified explicitly. The question to ask is: what utility in actual processing will arise from having this distinction rigidly enforced? The problems it (artificially) introduces are twofold. First, it makes most OWL 2 ontologies unclassifiable, since many of them contain both class and instance data; in fact, OWL 2 punning makes this very distinction rather hard to detect, since a class in OWL 2 may itself be an instance. Second, it forces users to make a needless classification decision, which may give rise to errors and difficulties in processing.
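
As a rough illustration of the punning point, here is a minimal sketch that models RDF triples as plain Python tuples and tries to sort a graph into MODEL or INSTANCE. Everything in it is an assumption made for illustration: the ex: names, the classify and punned_terms helpers, and the naive rule for spotting individuals are hypothetical and not part of any OWL or FHIR tooling.

```python
# Triples are modelled as (subject, predicate, object) tuples of prefixed names.
RDF_TYPE = "rdf:type"
OWL_CLASS = "owl:Class"
BUILTIN_PREFIXES = ("owl:", "rdfs:", "rdf:")


def declared_classes(triples):
    """Terms explicitly typed as owl:Class."""
    return {s for s, p, o in triples if p == RDF_TYPE and o == OWL_CLASS}


def used_as_individuals(triples):
    """Terms typed with a non-built-in class, i.e. used as instances."""
    return {s for s, p, o in triples
            if p == RDF_TYPE and not o.startswith(BUILTIN_PREFIXES)}


def classify(triples):
    """Naively label a graph as MODEL, INSTANCE, MIXED or EMPTY."""
    classes = declared_classes(triples)
    individuals = used_as_individuals(triples)
    if classes and individuals:
        return "MIXED"
    if classes:
        return "MODEL"
    if individuals:
        return "INSTANCE"
    return "EMPTY"


def punned_terms(triples):
    """Terms that are both a class and an instance of some other class."""
    return declared_classes(triples) & used_as_individuals(triples)


# Hypothetical graph: ex:Eagle is a class and also an instance of ex:Species.
graph = [
    ("ex:Eagle", RDF_TYPE, OWL_CLASS),
    ("ex:Eagle", RDF_TYPE, "ex:Species"),
    ("ex:Species", RDF_TYPE, OWL_CLASS),
    ("ex:harry", RDF_TYPE, "ex:Eagle"),
]

print(classify(graph))
print(punned_terms(graph))
```

Under these assumptions the graph fits neither category, and the punned term cannot be assigned to one side without an arbitrary decision.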

> However I would strongly recommend that we DO NOT REDEFINE Ontology from its definition in the W3C specs - this will cause major confusion.
> 
> Here is the extract from OWL2:
> "OWL 2 ontologies provide classes, properties, individuals, and data values and are stored as Semantic Web documents. OWL 2 ontologies can be used along with information written in RDF, and OWL 2 ontologies themselves are primarily exchanged as RDF documents."

That defines an OWL2 ontology. If you are planning to use other representation languages, I would suggest adopting a wider definition of the bare concept of 'ontology'. By the way, this topic - how to define 'ontology' - was discussed in depth for a year in the Ontolog forum. I recommend reading http://ontolog.cim3.net/cgi-bin/wiki.pl?OntologySummit2007_Communique and the surrounding discussions before coming to a decision.

> So I am recommending two subtypes of Ontology:
> INSTANCE ONTOLOGY (INSTANCE for short) contains Individuals, their Property assertions and their data values but may refer to contents of MODEL(s)

I think you mean it contains individual *names*, right? 

When you say 'may refer to', what distinction are you making between 'refer to' and 'contain'? Do you mean it will not contain the *definitions* of the classes, etc.? But there is no concept of 'definition' in the RDF/OWL world. 

> MODEL ONTOLOGY (MODEL for short) contains Classes, ObjectProperties, DataProperties and Datatypes

And what will you do with something which contains large amounts of instance data, described using a mixture of vocabulary from a number of other ontologies and a small number of class and property definitions local to it? Because this is, if anything, the normal situation in Web-based ontology work.

> 
> INSTANCE and MODEL are disjoint

Which, if enforced, is going to create errors and blocks to processing for no functional reason. Why do this? It is a bad design decision to introduce distinctions that have no utility other than to be enforced and generate error messages. If this is a genuine type distinction, then you should be able to say what reasons there are for a processor to know what type an ontology is. How will an INSTANCE be processed differently from a MODEL? 

> but there can be Ontologies (neither of these subtypes) which combine them through merge or import and would be used for reasoning.
> It should not be necessary to separate these two by MIME type - they will be handled quite differently e.g. import statements will know exactly what they are trying to do.

Importing is completely transparent to this distinction. Both of them (and any hybrids) will be imported in the same way using the same mechanisms. This is part of the RDF/OWL design. 
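
A minimal sketch of why this is so, treating each document as a set of triples and an import as a union of those sets. The document names, the documents lookup table and the import_closure helper are hypothetical, and blank nodes and real IRI dereferencing are ignored for simplicity.

```python
def import_closure(root, documents):
    """Collect the triples of `root` and of everything it imports.

    `documents` maps a document name to its set of
    (subject, predicate, object) triples.
    """
    seen = set()
    pending = [root]
    result = set()
    while pending:
        name = pending.pop()
        if name in seen:
            continue  # guard against import cycles
        seen.add(name)
        triples = documents.get(name, set())
        result |= triples  # same union step for any kind of document
        pending.extend(o for s, p, o in triples if p == "owl:imports")
    return result


# Hypothetical documents: one mostly classes, one mostly instance data.
documents = {
    "ex:model": {
        ("ex:model", "rdf:type", "owl:Ontology"),
        ("ex:Patient", "rdf:type", "owl:Class"),
    },
    "ex:patients": {
        ("ex:patients", "owl:imports", "ex:model"),
        ("ex:p1", "rdf:type", "ex:Patient"),
    },
}

closure = import_closure("ex:patients", documents)
print(len(closure))
```

Nothing in the union step inspects whether a triple describes a class or an individual, so a hybrid document would be handled by exactly the same code path.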

Pat Hayes

> 
> Thanks for bringing up the topic - I don't believe that this renaming contradicts your intentions and I think it is important to get these concept semantics nailed down early.
> 
> Regards,
> 
> 
> Tony Mallia
> EDMOND SCIENTIFIC COMPANY (ESC)
> 
> 
> 
> -----Original Message-----
> From: David Booth [mailto:david@dbooth.org] 
> Sent: Saturday, March 07, 2015 8:29 PM
> To: public-semweb-lifesci@w3.org
> Cc: HL7 ITS; w3c semweb HCLS
> Subject: Re: Proposed RDF FHIR syntax feedback
> 
> There are a few things going on here that I think are causing some confusion.  One is discussion of RDF serializations (syntax).  Another is discussion of ontologies (i.e., data models or TBox) versus instance data (i.e., ABox, or data that is expressed in terms of those data models or ontologies).  A third is discussion of dereferenceable FHIR URIs.  I'll try to help untangle them, but first I'd like to suggest some simple terminology to help reduce confusion in these discussions.
> 
> ONTOLOGY:  I suggest we use the word "ontology" when we are talking about the definitions of classes and properties, relationships between them or restrictions on their use, such as cardinality.
> 
> INSTANCE DATA: Similarly, I suggest we use the term "instance data" when we are talking about data that is represented *using* those classes and properties.  An example would be specific patient data (such as an observation) that is transmitted in a FHIR payload.
> 
> I think this "ontology" versus "instance data" dichotomy will help clarify our discussions.  HOWEVER, there are several circumstances that cause this distinction to be blurred:
> 
>  - RDF itself makes no distinction between ontologies and instance data (TBox and ABox) -- it's all just sets of assertions to RDF.  "Triples all the way down."  :)
> 
>  - RDF file formats are *not* a reliable indicator of whether a file contains an ontology, instance data or a combination of both.  A .rdf file (RDF/XML) can hold OWL ontology definitions, as can a .ttl (Turtle) file or any other standard RDF serialization.  To add even more confusion, if you're using a tool like Protege, the tool might store everything in .owl files, regardless of whether the data is acting as ontologies or as instance data.  The .owl extension does *not* necessarily mean the file contains an ontology (as defined above).
> 
>  - Terms from OWL and RDFS vocabularies can be freely intermingled in an RDF document -- and they typically are, especially when that document acts as an ontology.
> 
>  - FHIR profile definitions can be transmitted in a FHIR payload just as patient data can be transmitted.  In that sense a FHIR profile can act like instance data, but in its use -- defining extensions and constraining the content of other FHIR resources -- it acts more like an ontology.
> 
> For FHIR, we need to define both a FHIR *ontology* -- a set of classes and properties -- and bi-directional mappings that will convert FHIR *instance* *data* from FHIR XML or FHIR JSON to FHIR RDF and vice versa.
> 
> Because RDF is independent of serialization, file formats and serializations are largely irrelevant to our FHIR RDF/ontology effort: we'll be producing a FHIR ontology, using standard RDF, RDFS and OWL vocabularies, and it can be serialized to any standard RDF format.  For this reason, I don't think we should spend much time worrying about what RDF serialization to use for the FHIR ontology.  It's pretty much irrelevant.
> 
> However, for FHIR RDF *mappings*, for convenience we may choose to define those mappings in terms of specific FHIR XML, FHIR JSON and/or FHIR RDF serializations.  For example, the Shape Expressions (ShEx) approach that Eric Prud'hommeaux demonstrated transforms FHIR *XML* to *Turtle*.  And in the JSON-LD approach that I'm investigating, the mapping from JSON-LD to RDF will simply be the standard RDF interpretation of the JSON-LD: no additional mapping definition will be required.
> 
> In summary: (a) RDF serializations can hold a mixture of RDF, RDFS and/or OWL -- and they often do; and (b) the serialization format is independent of whether the document contains an ontology or instance data or both.
> 
> David Booth
> 

------------------------------------------------------------
IHMC                                     (850)434 8903 home
40 South Alcaniz St.            (850)202 4416   office
Pensacola                            (850)202 4440   fax
FL 32502                              (850)291 0667   mobile (preferred)
phayes@ihmc.us       http://www.ihmc.us/users/phayes
Received on Sunday, 8 March 2015 18:31:54 UTC