Re: DICOM RDF representation



On Mar 22, 2024, at 12:49 PM, David Booth <david@dbooth.org> wrote:

Hi Erich,

As food for thought, it occurred to me that there is another way you could think about the DICOM problem of needing to assert null values (missing data), which we discussed on yesterday's FHIR RDF call.[1]
Imagine that you could write something in RDF that corresponds to this, in json-ish syntax:

 { x: 25,
   y: 30,
   p: null }

That's saying that, in this particular chunk of data, there is no value asserted for p.  But it isn't saying that a value for p doesn't exist somewhere else in the data or in the universe.  In the RDF world, that sounds very much like it is not making an assertion about p's value.

However, I think it *is* making an assertion about the *schema* of the data, saying that p is part of the *schema*.  So in essence, I think it is using instance data syntax to assert something about the *schema*. So I wonder if it might work to somehow attach that schema information, when converting to RDF, instead of trying to assert a null p.  For starters, here is one straw man possibility, in Turtle-ish syntax:

 :something
   :x 25 ;
   :y 30 ;
   dicom:null :p .

This would have the benefit of still permitting a value for :p to be asserted, without conflict, while retaining the schema information that :something normally has a :p property.

In converting back from RDF to JSON or something else, the conversion could see if there is an asserted value for :p, such as 32.  If there is, it would assert that value:

   p: 32

If not, it would know to assert an empty value for p, such as one of these, depending on how you are representing null in json:

   p: ""
   p: null
   p: []
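
As a rough illustration of that round trip, here is a minimal Python sketch. It models triples as plain (subject, predicate, object) tuples rather than using a real RDF library, and the function names to_triples and to_record are hypothetical; only dicom:null, :something, :x, :y and :p come from the straw man above.

```python
# Triples are plain (subject, predicate, object) tuples. All helper names are
# hypothetical; dicom:null is the straw-man property discussed above.
DICOM_NULL = "dicom:null"

def to_triples(subject, record):
    """Convert a JSON-like dict to triples, recording null keys via dicom:null."""
    triples = []
    for key, value in record.items():
        prop = ":" + key
        if value is None:
            # No value is asserted for prop, only that it belongs to the schema.
            triples.append((subject, DICOM_NULL, prop))
        else:
            triples.append((subject, prop, value))
    return triples

def to_record(subject, triples):
    """Convert triples back to a dict; an asserted value wins over dicom:null."""
    record = {}
    # First pass: collect every value that is actually asserted.
    for s, p, o in triples:
        if s == subject and p != DICOM_NULL:
            record[p[1:]] = o
    # Second pass: emit null only for schema properties that have no value.
    for s, p, o in triples:
        if s == subject and p == DICOM_NULL:
            record.setdefault(o[1:], None)
    return record

triples = to_triples(":something", {"x": 25, "y": 30, "p": None})
round_tripped = to_record(":something", triples)
with_value = to_record(":something", triples + [(":something", ":p", 32)])
```

Here round_tripped gets p back as None, while with_value carries p as 32 even though the dicom:null triple is still present, which is the "without conflict" behaviour described above.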

BTW, I mentioned on the call that another possibility is to use a distinguished value in RDF, to represent null, such as urn:null .  But I don't think this would play very well with inference, because if you asserted something like:

  :something :p urn:null .

and :p is normally supposed to hold an integer (for example), then urn:null would have to be in the value space of integer.  And if :p were later asserted to be 32, then an inference engine might conclude that urn:null = 32 (if :p can only have one value) unless it were augmented to handle urn:null specially.  I imagine folks like Pat Hayes thought about this long ago and concluded that a distinguished null value like this would be a Bad Idea, because it goes against the grain of description logic.
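
To make the inference worry concrete, here is a toy Python sketch. It is a stand-in for a reasoner, not an implementation of OWL semantics: the helper infer_same_as is hypothetical, and it simply assumes that :p has been declared single-valued.

```python
def infer_same_as(triples, functional_props):
    """Return the pairs of values a naive reasoner would be forced to equate.

    If a property may hold only one value per subject, two different values
    asserted for the same subject must denote the same thing.
    """
    seen = {}        # (subject, property) -> first value encountered
    equated = set()  # pairs of values concluded to be equal
    for s, p, o in triples:
        if p not in functional_props:
            continue
        key = (s, p)
        if key in seen and seen[key] != o:
            equated.add((seen[key], o))
        else:
            seen.setdefault(key, o)
    return equated

triples = [
    (":something", ":p", "urn:null"),  # the distinguished null value
    (":something", ":p", 32),          # a real value asserted later
]
conclusions = infer_same_as(triples, functional_props={":p"})
```

With these two triples, conclusions contains the single pair ("urn:null", 32), which is exactly the unwanted equation described above.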

Yes, such folks did exactly that and came to that very conclusion. It applies to any logic, not just description logics; and we had the debate with database engineers long before RDF was invented.

You note one problem with null, but the chief problem is that if 'null' is treated as a name, then it has to denote the same thing wherever it occurs, so any two entries with 'null' in them have the same value. Which I gather (I actually have no clear idea what "null" is supposed to mean, after years of trying to get people to tell me) is not what is intended. Your dicom:null property solution overcomes this objection very neatly (as long as nobody tries to create an ontology of that dicom:null property) and it has the merit, from the RDF perspective, of handing the question of what 'null' means back to dicom itself.

Maybe 'null' is not a name but rather an existential variable, with the convention that it is a different variable every time it occurs, i.e. each actual token of 'null' is a distinct variable. So then 'null' means something like "something, but we don't know what (yet)". In that case, in RDF the obvious solution is to use a unique bnode. Or a skolem constant if you dislike bnodes.
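
A minimal Python sketch of that reading, in which every occurrence of null becomes its own fresh blank node. The helper names and the _:nullN label scheme are hypothetical, and triples are again plain tuples rather than objects from an RDF library.

```python
import itertools

# Hypothetical label scheme: each call yields a never-before-used bnode label.
_counter = itertools.count(1)

def fresh_bnode():
    """Return a new blank node label, distinct from every earlier one."""
    return f"_:null{next(_counter)}"

def to_triples(subject, record):
    """Convert a JSON-like dict to triples, giving each null its own bnode."""
    triples = []
    for key, value in record.items():
        obj = fresh_bnode() if value is None else value
        triples.append((subject, ":" + key, obj))
    return triples

a = to_triples(":a", {"p": None})
b = to_triples(":b", {"p": None, "q": None, "x": 25})

# Every null token ended up as a different node, so no two entries are
# forced to share a value.
null_nodes = [o for _, _, o in a + b if isinstance(o, str) and o.startswith("_:")]
```

For the skolem-constant variant, fresh_bnode could instead mint a unique IRI; the rest of the sketch would stay the same.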

But the simplest solution is the one you mention first. If 'null' means that no data is available, why are you bothering to say it? Just leave those entries out of the RDF altogether. If this seems wrong, my response would be, what utility is served by including them? What do they make possible, that would not work if they were simply not there?

Best wishes

Pat Hayes


1. https://www.w3.org/2024/03/21-hcls-minutes.html

Thanks,
David Booth


On 1/10/24 13:21, Erich Bremer wrote:
Hi Detlef,
On the delay in your response, you are completely forgiven. :-)
Thanks for the links!  I've read your paper and I have a few questions if I may ask:
1) Is your ontology generated automatically from the normative DICOM XML?
2) If so, is this process open-source?
3) You mentioned in the paper that you convert everything to strings (not taking advantage of the value representations).  I take it that an all-string approach is a big performance hit. Have you (since then) converted the strings to something more analytically efficient?
4) Can you share any samples of your dicom2RDF conversions?
I've done my own conversion of DICOM to RDF (dcm2rdf) using (like you) the dcm4che library and scaled this to handle large sets of dcm files.  The code will generate either a long form conversion keeping VR typing:
<urn:md5:44c5f855d4ee27141c926b2084b461a4> dcm:00080060  [ dcm:Value "XA"; dcm:vr "CS" ] .
or a compact form dropping the VR typing and converting the actual values to an optimal form.
<urn:md5:44c5f855d4ee27141c926b2084b461a4> dcm:00080060 "XA" .
I tend to use the latter form as it cuts down on the number of triples and makes for better query performance in the Virtuoso triple store.
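
For illustration only, the two output shapes can be sketched in Python as below. This is not the dcm2rdf code: the helper names are hypothetical, and emitting integers as bare numeric literals in the compact form is an assumption about what "optimal form" might mean.

```python
def literal(value):
    """Render a Python value as a Turtle literal (ints and strings only)."""
    if isinstance(value, int) and not isinstance(value, bool):
        return str(value)  # assumption: integers become bare numeric literals
    escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'

def long_form(subject, tag, value, vr):
    """Keep the VR beside the value on a blank node (three triples)."""
    return f'<{subject}> dcm:{tag}  [ dcm:Value "{value}"; dcm:vr "{vr}" ] .'

def compact_form(subject, tag, value):
    """Drop the VR and attach the value directly (one triple)."""
    return f"<{subject}> dcm:{tag} {literal(value)} ."

SUBJECT = "urn:md5:44c5f855d4ee27141c926b2084b461a4"
long_line = long_form(SUBJECT, "00080060", "XA", "CS")
compact_line = compact_form(SUBJECT, "00080060", "XA")
# Illustrative numeric element (tag 0028,0010 with a made-up value of 512).
numeric_line = compact_form(SUBJECT, "00280010", 512)
```

The long form expands to three triples per element (one to the blank node, plus dcm:Value and dcm:vr), while the compact form needs only one, which is where the reduction in triple count comes from.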
All of this is done without a corresponding defined ontology and I would like to rectify this.  My preference is to see an official DICOM conversion but I don't know if I am alone in this endeavour.  OWL is good, but it would also be helpful to have a SHACL equivalent for RDF data validation.
   - Erich
==========================================================
Erich Bremer, M.Sc.
Director, Applied Informatics
Department of Biomedical Informatics
Stony Brook Medicine
Tel. : 1-631-444-3560
Fax  : 1-631-444-8873
Cell : 1-631-681-6228
erich.bremer@stonybrook.edu
Office Location/Mailing Address
HSC, L3: Room 119
Stony Brook, NY 11794-8330
On Mon, Jan 8, 2024 at 10:20 AM Detlef Grittner <detlef.grittner@sohard.de> wrote:
   Hi Erich,
   first let me apologize for the late answer due to all the holidays
   at the end and beginning of the year.
   Actually there has been a publication of a project where that DICOM
   RDF has been used: https://pubmed.ncbi.nlm.nih.gov/25160167/
   This DICOM RDF is described in an OWL ontology, but the published
   version on BioPortal
   (https://bioportal.bioontology.org/ontologies/SEDI) is completely
   outdated. I will find out whether I can provide the current version of
   the ontology on that portal and will let you know.
   At the moment there is no lobbying, but I think it would be an
   interesting idea to take that DICOM ontology as a basis for such an
   effort.
   Detlef
   Detlef Grittner
   MSc ISM, M.A.
   Software-Entwicklung
   SOHARD Software GmbH
   Würzburger Str. 197
   90766 Fürth
   Phone: +49 (0) 911 97341-54
   Fax:   +49 (0) 911 97341-10
   E-Mail: detlef.grittner@sohard.de
   Geschäftsführer: Peter Feltens, Sebastian Schnitzenbaumer
   Sitz der Gesellschaft: Fürth
   Registergericht: Amtsgericht Fürth; HRB 11478
   On 14.12.23 16:53, Erich Bremer wrote:
   Hi Detlef,

   Is there anywhere I can read about your DICOM RDF work?  I think
   it would be helpful if there was an officially sanctioned RDF
   representation of DICOM.  Is anyone lobbying them with the idea?
      - Erich



   On Mon, Dec 4, 2023 at 1:25 PM Detlef Grittner
   <detlef.grittner@sohard.de> wrote:

       Hi all,

       we've been working together with Scott on projects with DICOM
       to RDF conversion. But it is not sanctioned in the sense that
       any organization like W3C or NEMA has published it as a
       recommendation or standard.

       Anyhow, if you're interested we could explore whether our idea
       of DICOM RDF fits your purpose.

       Kind Regards,


       Detlef Grittner

       On 30.11.23 18:29, Eric Prud'hommeaux wrote:
       Hi Scott, Erich Bremer (Cc'd) is working on a use case that intersects FHIR/RDF and some detail-y bits of DICOM. I'd assumed there was a sanctioned RDF for (all of) DICOM, but Erich said that either (A) there wasn't a sanctioned RDF representation for DICOM or (B) it didn't include the parts of DICOM that cover his use case. Any leads?

Received on Sunday, 24 March 2024 06:57:29 UTC