- From: Timothy W. Cook <tim@mlhim.org>
- Date: Mon, 10 Mar 2014 18:57:26 -0300
- To: Michael Brunnbauer <brunni@netestate.de>
- Cc: semantic-web <semantic-web@w3.org>
- Message-ID: <CA+=OU3U8Eh9NdK2Bkk1z1F2W9i5uJQ7_Pqfj=cLF7Ndv_nTXpQ@mail.gmail.com>
We do not disagree that RDF has a much cleaner approach to creating
meaningful connections between information components. As I stated
earlier, I originally thought that I would use RDF and/or OWL to build
MLHIM. The reality is that it just isn't robust enough in some aspects,
nor is the eco-system around it mature enough, yet. I have stated in
papers and social media posts that I assume MLHIM 3.x will be RDF. I
still believe that, but we do not know whether that will be in 5 years,
10, or even longer.

On Mon, Mar 10, 2014 at 6:15 PM, Michael Brunnbauer <brunni@netestate.de> wrote:

>
> Hello Timothy,
>
> I am not a friend of the data model / ontology distinction but I will use
> it here: A data model generally has less semantics, reusability and
> explicit knowledge than an ontology.
>
> You can map XML Schema to OWL automatically but what you have then is
> still more data model than ontology.
>
> With your approach, the step from data model to ontology is discrete while
> with RDF, it would be continuous.
>
> Regards,
>
> Michael Brunnbauer
>
> On Sun, Mar 09, 2014 at 05:59:49PM -0300, Timothy W. Cook wrote:
> > On Sun, Mar 9, 2014 at 11:48 AM, Michael Brunnbauer <brunni@netestate.de> wrote:
> >
> > > Hello Timothy,
> > >
> > > MLHIM seems to be annotated data models - with optional RDF annotations.
> >
> > Somewhat, but the models are restrictions of a common reference model.
> > Each model represents a concept that is as broad or narrow as the
> > modeller chooses. The annotations must be optional. It is up to the
> > domain experts/knowledge modellers to determine the resultant quality.
> >
> > > The claims regarding interoperability and semantics are a bit
> > > exaggerated, IMO.
> >
> > I suppose your opinion will change when you decide to put some study into
> > the matter.
> >
> > > If we had something like annotated portable RDB schemas, would they
> > > carry less meaning and would applications built with them be less
> > > interoperable than with MLHIM?
> >
> > If you were able to share those concept models between applications and
> > they were restrictions of a common reference model, then yes, they would
> > be the same.
> >
> > > In order to make applications completely interoperable and remove all
> > > implicit semantics from their code, you have to abolish them - replacing
> > > them with some standard component. This is probably as futile as the
> > > ontology/data model to rule them all.
> >
> > Further study will show that there are paths to operate along in the
> > interim. But yes, the eventual goal would be for a common healthcare
> > reference model.
> >
> > > I agree that the proposition of XML Schema is alluring: The information
> > > about the data model used and how to validate the data is always present
> > > and the tools for validation are already there.
> > >
> > > You did not use RDF because it has no standard way to do this - which is
> > > unfortunate.
> >
> > It is unfortunate. After working with the openEHR Foundation on
> > multi-level modelling for a decade using a domain specific language, it
> > was an easy realization that a relatively small group of people could not
> > create the high quality tools needed for a DSL in any reasonable amount
> > of time.
> > I began looking for alternatives. OWL and RDF would be my first choices
> > for implementation. They just weren't, and still aren't, mature enough to
> > do everything needed. Remember, as I stated before, the MLHIM reference
> > model is a conceptual information model. I chose XML because I did not
> > see anything else with that capability and widespread adoption. I knew
> > very little about XML Schema prior to this. So I did not choose it
> > because it was my hammer already. I spent a lot of time on a long
> > learning curve and had to wait for tools to catch up to XML Schema 1.1.
> >
> > > You could have created a way and tools to do this in RDF. Did you fear
> > > the necessary effort or the risk to adoption?
> >
> > (see above)
> > Given time, talent and money, openEHR could do it with the Archetype
> > Definition Language. But it would never be as ubiquitous as XML.
> >
> > > It seems that XML Schema allows vocabulary reuse down to the
> > > property/attribute level - but the temptation to create own terms
> > > instead of reusing others seems to be greater than with RDF. Having
> > > some of the semantics in the XML Schema layer and more of it in the RDF
> > > layer on top of it definitely is a drawback.
> >
> > There may be other/additional approaches that may help improve MLHIM. I
> > am certainly open to and welcome dialog about it. The specifications
> > (such as they are at this point) are openly available under a Creative
> > Commons license. Feel free to join the discussion on social media
> > (Google Plus preferred).
> >
> > > How many implementors will just ignore the optional RDF layer?
> >
> > You must realize that software developers do not have control of the
> > models in this approach. Domain experts who understand a little bit of
> > how to use the CCD-Gen are the ones responsible for building the models.
> > In the process of teaching them this activity, they are also taught the
> > importance of the quality of their models and that it ultimately decides
> > the quality of their data.
> >
> > The MLHIM eco-system allows for closed loop concept models (CCDs) to be
> > developed as well as openly licensed CCDs. There may eventually be
> > 10,000 blood pressure CCDs in the open. But like most things, we predict
> > that most people will reuse a model that is good and openly available,
> > instead of building their own.
> >
> > I can't decide for the experts nor do I want to control what is or is
> > not a good model for any particular implementation. All I can do is
> > offer them a real solution that is bottom up and under their control
> > instead of slow moving international standards bodies that can't keep up
> > with the changing science.
> >
> > Thanks for your feedback. Explaining MLHIM in words is always a learning
> > experience for me.
> >
> > Regards,
> > Tim
> >
> > > Regards,
> > >
> > > Michael Brunnbauer
> > >
> > > On Sat, Mar 08, 2014 at 06:36:54PM -0300, Timothy W. Cook wrote:
> > > > A very interesting and, I think, foundational discussion. David,
> > > > thanks for bringing it up.
> > > > Below is a discussion of why I believe that RDF should be considered
> > > > a layer over data models or maybe as 'semantic glue'.
> > > >
> > > > David, we are working on the same type of problem but from slightly
> > > > different perspectives. The presentation that you linked to
> > > > re:KnowMED, is very important and I recall seeing it before. I'll
> > > > take this opportunity to comment on it since it is in the context of
> > > > this discussion. It indicates that you propose RDF as a language to
> > > > be used in the exchange of healthcare data. Then on slide #5 you say
> > > > it isn't enough to 'get us there'. So I am not sure how much of this
> > > > is marketing swagger and how much is hard fact.
> > > >
> > > > On slide #8 item #2 we are 100% in agreement. But then on slide #9
> > > > you are mixing apples and oranges. XML and RDF have two different
> > > > purposes that work well together.
> > > >
> > > > On further slides, your Blue, Green and Red customers exactly
> > > > indicate what I mean by RDF being an essential layer on top of
> > > > multiple models.
> > > >
> > > > What happens further in the presentation is where we disagree. You
> > > > assert that RDF should be the language used to actually 'exchange'
> > > > data. This is where RDF and the tools around it (AFAIK) are not
> > > > mature enough to perform. Several times you have mentioned
> > > > 'semantics and not syntax'. This is a huge mistake. You must have
> > > > both in order to ensure data quality and meaning. Secondly, we know
> > > > from history that top-down consensus in healthcare concept modelling
> > > > is an impossibility.[1]
> > > >
> > > > In your post describing the BP screenshot you said:
> > > > "Thus, although ex1:bp_023 and ex2:bp_409 capture the same blood
> > > > pressure information, they represent that information differently.
> > > > Nonetheless, both representations can peacefully coexist in the same
> > > > merged RDF data without conflict, which might happen, for example, if
> > > > one is derived from the other through inference."
> > > > I take this to mean that you are representing the exact same BP
> > > > measurement data in two different ways? Your use case, 'by
> > > > inference', is a little fuzzy for me. If it is derivation by
> > > > inference, it will just be an in-memory representation and not
> > > > persisted, correct? Regardless, the existence of the same data
> > > > instance twice in the same application is in complete contradiction
> > > > to good data quality management. As you go on to explain, now you
> > > > must add application intelligence to analyze whether or not two data
> > > > instances are the same, to avoid counting them as two separate
> > > > instances. This approach is very dangerous, in addition to adding
> > > > complexity and cost to the applications. However, having the ability
> > > > to determine if two different data instances exactly match the same
> > > > concept is essential. Minor differences such as the position of the
> > > > patient (sitting or prone), the type of instrument used to perform
> > > > the measurement, or the location on the body (left upper arm or right
> > > > thigh, etc.) where the measurement was taken are all important. They
> > > > may or may not rule in or out specific measurements, based on the
> > > > intended use of the query results. This is where RDF is essential: do
> > > > these two instances point to exactly the same code in a controlled
> > > > vocabulary, etc.? These questions are essential to having the
> > > > ability to perform machine based reasoning over the data repository,
> > > > whether at the point of care or for research purposes.
> > > >
> > > > Referring back for a moment to 'the same data instance' situation: it
> > > > is essential to have additional information (meta-data) to determine
> > > > if two instances are exactly the same. This can legitimately occur
> > > > during aggregation for research or systemic quality analysis. Unique
> > > > patient identifiers along with datetime stamps are ideal. However,
> > > > the patient identifier issue is an ongoing problem that is actually
> > > > implementation context and application specific. It is outside of
> > > > the context of data quality and management.
> > > >
> > > > Slide #22 clearly indicates that there is an expectation that RDF is
> > > > used as a common format. However, as I said earlier, the current
> > > > implementation of RDF is not robust enough to perform this function,
> > > > UNLESS there is a global expert consensus on all healthcare concepts
> > > > so that models may be created and distributed from a central
> > > > authority. This is simply unrealistic, as history has shown and as is
> > > > formalized in the Cavalini-Cook theory [1].
> > > >
> > > > The reason that I state that RDF is not capable, at this point of
> > > > maturity, is that it doesn't support the ability to represent
> > > > syntactic structures in a multi-level model environment. IOW: There
> > > > is no ability (AFAIK) to express a common reference model and then
> > > > derive concept models that issue further constraints. A multi-level
> > > > model approach is essential in order to abstract the syntax and
> > > > semantics of each concept out of the application source code and
> > > > repository schemas so that they can be shared between disparate
> > > > applications. This is what provides for full syntactic and semantic
> > > > interoperability.
> > > >
> > > > A multi-level model approach may or may not be useful in many
> > > > domains. Specifically, human engineered domains that we fully
> > > > understand can be modeled as one level representations. However,
> > > > biological domains that involve evolutionary complexity are quite
> > > > different, primarily because we do not fully understand them, so our
> > > > science and understanding are constantly changing. Additionally, it
> > > > appears that the data has a much longer lifetime of significance than
> > > > in other domains. Therefore the data should be initially captured and
> > > > represented in a manner that makes it as future proof and reusable as
> > > > possible. In healthcare, the most semantically rich point of any
> > > > information is at the point of care. Every point of
> > > > transition/translation after that will most assuredly lose context.
> > > > As a brief example, reference ranges for conditions change over time.
> > > > It is essential that data captured today be expressed in the context
> > > > of today's knowledge, even 20 or more years from now. The concept
> > > > model around high blood pressure is different than it was 10 years
> > > > ago.
> > > >
> > > > Where RDF shines is that in a syntactic model of a concept designed
> > > > to capture reference ranges and other metadata, it can be used to
> > > > provide external semantic context to that model, whether that context
> > > > exists in a controlled vocabulary or even free text documents such as
> > > > clinical guidelines.
> > > >
> > > > In the Multi-Level Healthcare Information Modelling (MLHIM) approach
> > > > we developed a conceptual reference model to provide a basis for
> > > > software implementations. While the MLHIM model doesn't preclude
> > > > other serializations, we found that XML Schema 1.1 does provide the
> > > > prerequisites for implementing both a reference model and concept
> > > > models. This means that we can have full validation of instance data
> > > > back to the W3C specifications. We mark up the concept models (via
> > > > XML Schema 1.1 annotations) with RDF, providing the computable
> > > > semantic links for each model as defined by the modeller. These
> > > > models can now be created by domain experts (with additional
> > > > knowledge modelling training) so that software developers do not have
> > > > to interpret the meanings.
> > > >
> > > > The concept models are now fully detached from any specific
> > > > implementation and can be shared to use for validating instance data
> > > > in the context in which it was recorded. I believe that this is the
> > > > closest we have to semantic interoperability, to date. I am of course
> > > > open for discussion and debate on the issue. I used the acronym
> > > > 'AFAIK' a few times above. I used this because my last serious
> > > > attempt to use RDF for this purpose was in 2010/2011. I know that
> > > > there is a continuous maturing process going on. I believe that there
> > > > may come a day when RDF and OWL can be used exclusively for syntactic
> > > > and semantic representation and reasoning. But AFAIK, not today.
> > > >
> > > > We have a significant number of peer-reviewed publications about
> > > > MLHIM and academic as well as other implementations. I am happy to
> > > > share those with the group or you may peruse the links in my
> > > > signature line as well as www.mlhim.org and the specs are openly
> > > > downloadable from here [2] as a package and as source from here [3].
> > > >
> > > > We also have almost 2000 datatypes converted from other modeling
> > > > approaches (such as the NIH CDE browser and HL7 FHIR) into reusable
> > > > complexTypes to be used in concept models. You can review those as
> > > > well as download some example concept models from here [4]. Free
> > > > registration is required to download the models.
> > > >
> > > > Kind Regards,
> > > > Tim
> > > >
> > > > [1] https://github.com/mlhim/specs/blob/2_4_3/graphics/cavalini_cook_theory.png
> > > > [2] https://launchpad.net/mlhim-specs/2.0/2.4.3/+download/mlhim-specs-2013-10-15-2.4.3-Release.zip
> > > > [3] https://github.com/mlhim/
> > > > [4] http://www.ccdgen.com
> > > >
> > > > On Fri, Mar 7, 2014 at 5:00 PM, David Booth <david@dbooth.org> wrote:
> > > >
> > > > > Hi Alan,
> > > > >
> > > > > On 03/07/2014 12:44 PM, Alan Ruttenberg wrote:
> > > > >
> > > > >> Can you explain what you mean by "RDF's ability to allow multiple
> > > > >> data models to peacefully coexist, interconnected, in the same
> > > > >> data" ?
> > > > >>
> > > > >
> > > > > Yes. Here is an imprecise illustration, on slides 10-17:
> > > > > http://dbooth.org/2013/semtech/slides/03-DavidBooth-rdf-as-universal.pdf
> > > > > (I took some artistic liberties blurring class/instance
> > > > > distinctions in that diagram.)
> > > > >
> > > > > And here is a more precise example that cleanly distinguishes
> > > > > classes from instances:
> > > > > http://tinyurl.com/pzsgf7f
> > > > > (I've also attached the same illustration, for offline readers.)
> > > > >
> > > > > In this latter example (of a hypothetical systolic blood pressure
> > > > > measurement), the same information is represented according to two
> > > > > different models/schemas/vocabularies/ontologies, v1 (green) and
> > > > > v2 (red). (I am using the terms model, schema, vocabulary and
> > > > > ontology loosely and somewhat interchangeably here.)
> > > > >
> > > > > In the v1 model, the systolic blood pressure is indicated in RDF
> > > > > like this:
> > > > >
> > > > >   ex:patient319 foaf:name "John Doe" ;
> > > > >       v1:bps ex1:bp_023 .
> > > > >
> > > > >   ex1:bp_023 a v1:SystolicBPSitting_mmHg ;
> > > > >       v1:value 120 .
> > > > >
> > > > > Whereas in the v2 model, the same information is represented
> > > > > differently, in RDF like this:
> > > > >
> > > > >   ex:patient319 foaf:name "John Doe" ;
> > > > >       v2:bps ex2:bp_409 .
> > > > >
> > > > >   ex2:bp_409 a v2:SystolicBP ;
> > > > >       v2:pressure 120 ;
> > > > >       v2:units v2:mmHg ;
> > > > >       v2:bodyPosition v2:sitting .
> > > > >
> > > > > Thus, although ex1:bp_023 and ex2:bp_409 capture the same blood
> > > > > pressure information, they represent that information differently.
> > > > > Nonetheless, both representations can peacefully coexist in the
> > > > > same merged RDF data without conflict, which might happen, for
> > > > > example, if one is derived from the other through inference.
> > > > >
> > > > > Furthermore, the relationship between these classes,
> > > > > v1:SystolicBPSitting_mmHg and v2:SystolicBP, and hence the
> > > > > relationship between the corresponding v1 and v2 instance data, can
> > > > > also be explicitly captured in RDF, as the
> > > > > v1v2:SystolicBP_Transform (yellow) relationship:
> > > > >
> > > > >   v1:SystolicBPSitting_mmHg v1v2:SystolicBP_Transform v2:SystolicBP .
> > > > >
> > > > > Inference rules for v1v2:SystolicBP_Transform could therefore
> > > > > convert a v1:SystolicBPSitting_mmHg measurement to a v2:SystolicBP
> > > > > measurement or vice versa.
> > > > >
> > > > > This example only illustrated the case where the transformation
> > > > > from one model to the other is lossless and thus reversible.
> > > > > Usually that isn't the case. Relating models and transforming
> > > > > between them is *not* easy, but at least RDF makes it possible to
> > > > > explicitly indicate these relationships.
> > > > >
> > > > > Obviously some intelligence must be exercised to avoid, for
> > > > > example, accidentally thinking that ex1:bp_023 and ex2:bp_409
> > > > > represent two distinct blood pressure measurements, and thereby
> > > > > double counting them, but that's easy enough to do.
> > > > >
> > > > > Also, there isn't always a desire to relate or transform between
> > > > > models. Sometimes some data is related and other data is not, and
> > > > > it is all still merged into the same RDF graph. In fact, the point
> > > > > may be to connect that part of the data that *is* related and let
> > > > > the rest coexist without being connected (or at least not
> > > > > *directly* connected).
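
For readers who want something concrete, the transform and double-counting points quoted above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: triples are modelled as plain tuples rather than with any RDF library, `v1_to_v2_facts` is a hypothetical stand-in for whatever rule v1v2:SystolicBP_Transform would carry (the thread does not define one), and all helper names are made up for this example.

```python
# Triples are plain (subject, predicate, object) tuples; no RDF library is assumed.
RDF_TYPE = "rdf:type"

V1_DATA = {
    ("ex:patient319", "foaf:name", "John Doe"),
    ("ex:patient319", "v1:bps", "ex1:bp_023"),
    ("ex1:bp_023", RDF_TYPE, "v1:SystolicBPSitting_mmHg"),
    ("ex1:bp_023", "v1:value", 120),
}

V2_DATA = {
    ("ex:patient319", "foaf:name", "John Doe"),
    ("ex:patient319", "v2:bps", "ex2:bp_409"),
    ("ex2:bp_409", RDF_TYPE, "v2:SystolicBP"),
    ("ex2:bp_409", "v2:pressure", 120),
    ("ex2:bp_409", "v2:units", "v2:mmHg"),
    ("ex2:bp_409", "v2:bodyPosition", "v2:sitting"),
}


def objects(graph, subject, predicate):
    """Return all objects of triples matching (subject, predicate, ?)."""
    return [o for s, p, o in graph if s == subject and p == predicate]


def v1_to_v2_facts(graph):
    """Hypothetical transform rule: express each v1 reading in the v2 shape."""
    facts = set()
    for patient, predicate, node in graph:
        if predicate != "v1:bps":
            continue
        if "v1:SystolicBPSitting_mmHg" not in objects(graph, node, RDF_TYPE):
            continue
        for value in objects(graph, node, "v1:value"):
            # The v1 class name bakes in units and body position; v2 spells them out.
            facts.add((patient, value, "v2:mmHg", "v2:sitting"))
    return facts


def v2_facts(graph):
    """Collect v2 readings as (patient, pressure, units, position) tuples."""
    facts = set()
    for patient, predicate, node in graph:
        if predicate != "v2:bps":
            continue
        if "v2:SystolicBP" not in objects(graph, node, RDF_TYPE):
            continue
        for pressure in objects(graph, node, "v2:pressure"):
            for units in objects(graph, node, "v2:units"):
                for position in objects(graph, node, "v2:bodyPosition"):
                    facts.add((patient, pressure, units, position))
    return facts


# Merging is a set union: the shared foaf:name triple appears only once.
merged = V1_DATA | V2_DATA

# Counting measurement nodes without relating the models double counts.
naive_count = sum(1 for _, p, _ in merged if p in ("v1:bps", "v2:bps"))

# Normalising both models to one shape collapses the duplicate.
distinct_readings = v1_to_v2_facts(merged) | v2_facts(merged)

print(len(merged), naive_count, len(distinct_readings))  # prints: 9 2 1
```

The sketch compares only patient, value, units and body position. Without a timestamp or similar metadata, two genuinely separate readings of 120 would collapse as well, which is the data quality concern raised elsewhere in this thread.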
> > > > >
> > > > > The point is that these data models can peacefully coexist in RDF
> > > > > data without conflict: applications using the v1 model against the
> > > > > merged data might only see v1 instance data, whereas applications
> > > > > using the v2 model might only see the v2 data. That's
> > > > > qualitatively different than in the world of XML, for example,
> > > > > where one schema generally wants to be "on top", and when you merge
> > > > > XML of different schemas, you need to create a new "top" schema.
> > > > > That is the difference that I have so often tried to explain to
> > > > > people outside the RDF community, and what I am trying to capture
> > > > > succinctly in a term or phrase. It isn't an easy idea to convey to
> > > > > those who are accustomed to a schema-centric approach. I think a
> > > > > catchy but descriptive term or phrase could help.
> > > > >
> > > > > Thanks,
> > > > > David
> > > > >
> > > > >> -Alan
> > > > >>
> > > > >> On Fri, Mar 7, 2014 at 11:20 AM, David Booth <david@dbooth.org
> > > > >> <mailto:david@dbooth.org>> wrote:
> > > > >>
> > > > >>     I -- and I'm sure many others -- have struggled for years
> > > > >>     trying to succinctly describe RDF's ability to allow multiple
> > > > >>     data models to peacefully coexist, interconnected, in the same
> > > > >>     data. For data integration, this is a key strength of RDF that
> > > > >>     distinguishes it from other information representation
> > > > >>     languages such as XML. I have tried various terms over the
> > > > >>     years -- most recently "schema promiscuous" -- but have not yet
> > > > >>     found one that I think really nails it, so I would love to get
> > > > >>     other people's thoughts.
> > > > >>
> > > > >>     This google doc lists several candidate terms, some pros and
> > > > >>     cons, and allows you to indicate which ones you like best:
> > > > >>     http://goo.gl/zrXQgj
> > > > >>
> > > > >>     Please have a look and indicate your favorite(s). You may also
> > > > >>     add more ideas and comments to it. The document can be edited
> > > > >>     by anyone with the URL.
> > > > >>
> > > > >>     Thanks!
> > > > >>     David Booth
> > > > >>
> > > >
> > > > --
> > > > MLHIM VIP Signup: http://goo.gl/22B0U
> > > > ============================================
> > > > Timothy Cook, MSc +55 21 994711995
> > > > MLHIM http://www.mlhim.org
> > > > Like Us on FB: https://www.facebook.com/mlhim2
> > > > Circle us on G+: http://goo.gl/44EV5
> > > > Google Scholar: http://goo.gl/MMZ1o
> > > > LinkedIn Profile:http://www.linkedin.com/in/timothywaynecook
> > >
> > > --
> > > ++ Michael Brunnbauer
> > > ++ netEstate GmbH
> > > ++ Geisenhausener Straße 11a
> > > ++ 81379 München
> > > ++ Tel +49 89 32 19 77 80
> > > ++ Fax +49 89 32 19 77 89
> > > ++ E-Mail brunni@netestate.de
> > > ++ http://www.netestate.de/
> > > ++
> > > ++ Sitz: München, HRB Nr.142452 (Handelsregister B München)
> > > ++ USt-IdNr. DE221033342
> > > ++ Geschäftsführer: Michael Brunnbauer, Franz Brunnbauer
> > > ++ Prokurist: Dipl. Kfm. (Univ.) Markus Hendel
> >
> > --
> > MLHIM VIP Signup: http://goo.gl/22B0U
> > ============================================
> > Timothy Cook, MSc +55 21 994711995
> > MLHIM http://www.mlhim.org
> > Like Us on FB: https://www.facebook.com/mlhim2
> > Circle us on G+: http://goo.gl/44EV5
> > Google Scholar: http://goo.gl/MMZ1o
> > LinkedIn Profile:http://www.linkedin.com/in/timothywaynecook
>
> --
> ++ Michael Brunnbauer
> ++ netEstate GmbH
> ++ Geisenhausener Straße 11a
> ++ 81379 München
> ++ Tel +49 89 32 19 77 80
> ++ Fax +49 89 32 19 77 89
> ++ E-Mail brunni@netestate.de
> ++ http://www.netestate.de/
> ++
> ++ Sitz: München, HRB Nr.142452 (Handelsregister B München)
> ++ USt-IdNr. DE221033342
> ++ Geschäftsführer: Michael Brunnbauer, Franz Brunnbauer
> ++ Prokurist: Dipl. Kfm. (Univ.) Markus Hendel

--
MLHIM VIP Signup: http://goo.gl/22B0U
============================================
Timothy Cook, MSc +55 21 994711995
MLHIM http://www.mlhim.org
Like Us on FB: https://www.facebook.com/mlhim2
Circle us on G+: http://goo.gl/44EV5
Google Scholar: http://goo.gl/MMZ1o
LinkedIn Profile:http://www.linkedin.com/in/timothywaynecook
Received on Monday, 10 March 2014 21:57:56 UTC