dbooth: Erich added more options to consider, based on DICOM XML conversion
erich: Extra triples matter with large DICOM data. … Everything in DICOM is a list, even if the multiplicity is only one. That adds extra triples.
dbooth: Even though the Turtle lists look more concise to a human, they don't actually reduce the number of triples compared with having an explicit index like in option 7.
erich: But the Turtle lists allow you to use convenient list tooling.
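Illustration of the triple-count point (not presented at the meeting): Turtle's `( ... )` collection syntax is shorthand for an rdf:first/rdf:rest ladder, so even a one-element list costs three triples. The sketch below is a minimal expansion in plain Python; the subject `_:img`, the predicate `ex:rows`, and the `_:l0` bnode labels are made-up names.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

def expand_collection(subject, predicate, items):
    """Expand `subject predicate ( items... )` into plain triples."""
    if not items:
        # An empty collection is just rdf:nil.
        return [(subject, predicate, RDF + "nil")]
    nodes = [f"_:l{i}" for i in range(len(items))]  # one list cell per item
    triples = [(subject, predicate, nodes[0])]
    for i, item in enumerate(items):
        rest = nodes[i + 1] if i + 1 < len(items) else RDF + "nil"
        triples.append((nodes[i], RDF + "first", item))
        triples.append((nodes[i], RDF + "rest", rest))
    return triples

# A list of n items always expands to 2n + 1 triples.
single = expand_collection("_:img", "ex:rows", ['"512"'])
triple = expand_collection("_:img", "ex:spacing", ['"0.5"', '"0.5"', '"1.0"'])
print(len(single), len(triple))
```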
detlef: Out next week. Back in two weeks.
ADJOURNED
dbooth: Options 8, 9a and 9b: if dcm:UNK is in the slot for someone's birthdate, and you use dcm:UNK for both Sally and Bob, then you are asserting that they have the same birthdate.
eric: The use of dcm:UNK constrains you to predicates that have a range that includes both the values you want and another value … that represents the missing value. … When you have values that are outside the intuitive range of the predicate, then you have changed the range of the predicate.
eric: When the range is either a birthdate or 3 null flavors, then your SPARQL query needs to account for the null flavors.
dbooth: Okay, so for a birthdateOrNull predicate, when the value is null, it is not actually asserting anything about that person's birthdate. Interesting.
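Illustration of the Sally-and-Bob problem (not presented at the meeting): a minimal Python sketch of a "who shares a birthdate value" join over triples held as tuples. With one shared dcm:UNK IRI the join pairs Sally with Bob; with a separate bnode per null it does not. The `ex:` names and the `ex:Null` type are hypothetical.

```python
def same_birthdate_pairs(triples, predicate="ex:birthdateOrNull"):
    """Return pairs of distinct subjects whose `predicate` points at the same node."""
    by_object = {}
    for s, p, o in triples:
        if p == predicate:
            by_object.setdefault(o, set()).add(s)
    pairs = set()
    for subjects in by_object.values():
        ordered = sorted(subjects)
        for i, a in enumerate(ordered):
            for b in ordered[i + 1:]:
                pairs.add((a, b))
    return pairs

# One shared sentinel IRI: the join treats the two nulls as one value.
shared_iri = [
    ("ex:Sally", "ex:birthdateOrNull", "dcm:UNK"),
    ("ex:Bob", "ex:birthdateOrNull", "dcm:UNK"),
]

# One bnode per null, each typed as a null: nothing is shared.
separate_bnodes = [
    ("ex:Sally", "ex:birthdateOrNull", "_:n1"),
    ("_:n1", "rdf:type", "ex:Null"),
    ("ex:Bob", "ex:birthdateOrNull", "_:n2"),
    ("_:n2", "rdf:type", "ex:Null"),
]

print(same_birthdate_pairs(shared_iri))
print(same_birthdate_pairs(separate_bnodes))
```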
dbooth: This looks to me like a viable approach. … Other thoughts?
erich: I'm fine either way, with bnodes or not.
detlef: I prefer simpler. … And prefer dcm:UNK, because it's easier to check than "UNK"^^dcm:nullFlavor
eric: If you use the bnode standoff for nulls then you can leverage inference more easily.
eric: What if you have an OWL datatype property and use a bnode as a value? Is that legal?
dbooth: IDK. Need to ask Jim Balhoff
detlef: I think it's not allowed in OWL DL
eric: One axis: IRI vs special literal vs bnode standoff … Another axis: Whether we have an rdf:value standoff.
dbooth: If you have an rdf:value standoff then you can skip having an explicit null, because you can just omit the rdf:value triple when it is null.
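Illustration of the rdf:value standoff idea (not presented at the meeting): a minimal Python sketch in which every list slot gets its own standoff node, and a null slot simply has no rdf:value triple. The `ex:index` predicate and the `_:v0` bnode labels are hypothetical; the minutes do not say how positions would be recorded.

```python
def encode_with_standoff(subject, predicate, values):
    """One standoff node per slot; a None slot gets no rdf:value triple."""
    triples = []
    for i, value in enumerate(values):
        node = f"_:v{i}"
        triples.append((subject, predicate, node))
        triples.append((node, "ex:index", i))  # hypothetical position marker
        if value is not None:
            triples.append((node, "rdf:value", value))
    return triples

def decode_standoff(triples, subject, predicate):
    """Rebuild the ordered values; slots without rdf:value come back as None."""
    nodes = [o for s, p, o in triples if s == subject and p == predicate]
    index = {s: o for s, p, o in triples if p == "ex:index"}
    value = {s: o for s, p, o in triples if p == "rdf:value"}
    return [value.get(node) for node in sorted(nodes, key=index.get)]

encoded = encode_with_standoff("ex:img", "ex:spacing", [1.5, None, 2.0])
print(decode_standoff(encoded, "ex:img", "ex:spacing"))
```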
dbooth: Need to find out about the OWL impact of either using a bnode in a list that otherwise holds primitives/literals, or using something like dicom:null in a list that otherwise holds literals. You wouldn't be able to say that dicom:null is owl:differentFrom any of the actual dates, because today you might only know that Sally has a birthdateOrNull of dicom:null, but tomorrow you might find an assertion saying that Sally has a birthdateOrNull of 1990-12-31.
CORRECTION ADDED LATER by dbooth: No, that's wrong. It isn't about owl:differentFrom. It's about the multiplicity of the birthdateOrNull predicate. It needs to allow more than one value, so that it can both have a dicom:null value and a 1990-12-31 value, even if it isn't allowed to have two different actual date values.
ACTION: DBooth to ask Jim Balhoff about OWL impact of the different options.
erich: Right now we're constructing using dcm:34567 or dcm:theKeyword . Either would be fine, but one is more human readable.
eric: Religious war: Some people think you should use non-readable URIs, so that labels can be updated.
erich: In DICOM the keyword names are unchangeable, as are the numbers. … therefore they can be used in the IRIs. … and numeric IRI would be owl:sameAs the keyword IRI.
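Illustration of the two IRI styles (not presented at the meeting): a minimal Python sketch that builds a numeric IRI and a keyword IRI for one DICOM tag and links them with owl:sameAs, as erich suggests. The namespace `http://example.org/dcm/` and the 8-hex-digit local name are assumptions; the group did not settle on either here.

```python
DCM = "http://example.org/dcm/"  # hypothetical namespace, not an official DICOM one
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"

def tag_iris(group, element, keyword):
    """Return (numeric IRI, keyword IRI, owl:sameAs triple) for one tag."""
    numeric = f"{DCM}{group:04X}{element:04X}"
    named = f"{DCM}{keyword}"
    return numeric, named, (numeric, OWL_SAME_AS, named)

# (0010,0030) is the DICOM tag for PatientBirthDate.
numeric, named, same_as = tag_iris(0x0010, 0x0030, "PatientBirthDate")
print(numeric)
print(named)
```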
Minutes manually created (not a transcript), formatted by scribe.perl version 221 (Fri Jul 21 14:01:30 2023 UTC).
Issue discussed under the DICOM topic: https://github.com/w3c/hcls-fhir-rdf/issues/141
David Booth, Erich Bremer, EricP, Gaurav Vaidya, Jim Balhoff
Regrets
-
Chair
-
Scribe
dbooth
Meeting minutes
DICOM
eric: Anytime you have a null, in programming you have to do a dance to get around it: if Sally and Bob both have null in the birthdate predicate, you have to use a null guard in your code to exclude the case when the value is a fhir:null sentinel. … But if you use a bnode, SPARQL won't treat them as the same. So that's a benefit of using a bnode for SPARQL in that case.
jim: bnode seems like a good thing, but I don't think you can have a union of an object type and a datatype. … We couldn't use a bnode for the null if it's a datatype property.
eric: Because in DL datatype and object properties are disjoint. … We tend to use lists when we want to round trip, and want to use sentinel types, so the transfer representation becomes something we can't reason over. … Because we have RDF lists, we're already outside of what works well in OWL.
jim: Could always use a bnode standoff: [ rdf:value 45 ], [ rdf:value fhir:null ].
eric: Maybe we need a transform for OWL, one optimized for query and one for inference.
eric: Already optimized more for SPARQL.
dbooth: Already going down that path.
dbooth: What do you think would be easiest to process, Erich?
eric: If we had a dataset with 2 graphs g1 and g2, and they shared a bnode, _:b1 , they could be written out, g1 as RDF/XML and g2 as Turtle. When you parse them again, you'd tell the parser to unify the _:b1 to reconstruct the original dataset.
eric: There isn't a way to indicate which bnodes should be treated as the same (in the same dataset) and which should be renamed.
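Illustration of the bnode-scoping point (not presented at the meeting): blank node labels are local to the document they are parsed from, so a plain merge renames them apart. The minimal Python sketch below contrasts that with a merge that is told to unify equal labels. The `unify_bnodes` switch is hypothetical and does not correspond to an option in any particular parser.

```python
def is_bnode(term):
    return isinstance(term, str) and term.startswith("_:")

def merge(graphs, unify_bnodes):
    """Merge triples parsed from separate files.

    unify_bnodes=False: labels are file-scoped, so each file's bnodes are
    renamed apart. unify_bnodes=True: equal labels are treated as one node.
    """
    merged = []
    for n, triples in enumerate(graphs):
        for triple in triples:
            if unify_bnodes:
                merged.append(tuple(triple))
            else:
                merged.append(tuple(f"{t}_g{n}" if is_bnode(t) else t for t in triple))
    return merged

def bnodes(triples):
    return {t for triple in triples for t in triple if is_bnode(t)}

g1 = [("ex:scan", "ex:region", "_:b1")]  # e.g. written out as RDF/XML
g2 = [("_:b1", "ex:label", "tumor")]     # e.g. written out as Turtle

print(len(bnodes(merge([g1, g2], unify_bnodes=False))))
print(len(bnodes(merge([g1, g2], unify_bnodes=True))))
```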
erich: CDT could be used to represent all the DICOM list data. Or do it only when literal lists -- a hybrid approach. … If this is adopted, does it replace RDF lists?
eric: If it is, it would take years for it to have traction.
erich: Anything done now cannot do CDT, because it's just a proposal. So we still need a null solution. … Rather work w what we have today. CDT is potential future direction. … But need something that's supported now, as a standard. … DICOM literal lists (which may have nulls) are the only problem area.
dbooth: This is reminding me that the ladder of RDF lists is really the wrong way to do lists. Numeric indexes are the right way to do them. … And they should be called arrays, not lists. … It's a longstanding big hole in RDF.
erich: I have a DCM to RDF converter that I've used for a while. But never handled nulls. … Plan to release it as open source, but will leave a placeholder for the null issue.
eric: We care about an RDF rather than having a bnode hidden inside a literal.
ADJOURNED
David Booth, Detlef Grittner, Erich Bremer, EricP, Jim Balhoff
Regrets
-
Chair
David Booth
Scribe
dbooth
Meeting minutes
DICOM
eric: If you use a sentinel or a bnode, either approach would not work well for OWL. … If you use a sentinel w a string type instead of xsd:datetime (for example) then OWL would be okay, but you would lose the datetime datatype.
jim: Now when we transform RDF lists we just change the property names from the rdf: namespace to something else.
dbooth: RDF list ladders are the wrong way to do arrays. They should have indexes.
eric: SPARQL 1.2 might add mechanisms to get indexes from RDF lists.
jim: Are they doing anything to make property paths more powerful?
eric: IDK
erich: I don't want to break reasoning.
dbooth: FHIR RDF R5 already breaks OWL reasoning, but we have a workaround to convert data to be owl friendly.
detlef: Most of the time I need all of the values of the array, with the ordering.
eric: Some toolkits give a way to return an RDF list as an array.
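Illustration of returning an RDF list as an array (not presented at the meeting): a minimal Python sketch that walks an rdf:first/rdf:rest ladder once and hands back an ordinary list, after which the nth element is a plain index lookup. It is not the API of any particular toolkit; triples are plain tuples with prefixed names.

```python
RDF_FIRST, RDF_REST, RDF_NIL = "rdf:first", "rdf:rest", "rdf:nil"

def list_to_array(triples, head):
    """Walk the first/rest ladder from `head` and collect items in order."""
    first = {s: o for s, p, o in triples if p == RDF_FIRST}
    rest = {s: o for s, p, o in triples if p == RDF_REST}
    items, node, seen = [], head, set()
    while node != RDF_NIL:
        # Guard against cycles and cells missing first or rest.
        if node in seen or node not in first or node not in rest:
            raise ValueError(f"malformed RDF list at {node}")
        seen.add(node)
        items.append(first[node])
        node = rest[node]
    return items

ladder = [
    ("_:c0", RDF_FIRST, 10), ("_:c0", RDF_REST, "_:c1"),
    ("_:c1", RDF_FIRST, 20), ("_:c1", RDF_REST, "_:c2"),
    ("_:c2", RDF_FIRST, 30), ("_:c2", RDF_REST, RDF_NIL),
]
values = list_to_array(ladder, "_:c0")
print(values[2])  # third element by index, with no further ladder walking
```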
erich: I access individual list elements in SPARQL. … Implemented parts of geo SPARQL. … Using polygons for filtering in SPARQL. … Data is too much for a single triplestore. Using Apache Arrow with Jena. … Federated SPARQL query, bring together data from different layers. … Code is public: BeakGraph
erich: Named graphs are mapped to Arrow chunks. … Hilbert pyramid … Client side I pull reduced resolution polygons. … DICOM 2.2 brings pathology into it, but no mechanisms for spatial indexing … Can't do performant viewing without indexing.
eric: You're using geosparql?
erich: Yes. I'm generating geosparql to represent the polygon in text form. Then BeakGraph is what the system actually uses.
eric: But you're not using SPARQL to pick a diagonal through a cube?
erich: IDK
eric: When you use extension fns in SPARQL, you can take advantage of your engine doing something optimal. I don't think you'd be using std SPARQL to look at diagonal slices without using extension functions.
erich: You can do simple cube queries, but after that you'll need extensions.
eric: It wouldn't have to be stored as an RDF list.
dbooth: Sounds like we have converged for handling nulls, on using a bnode with a type sentinel. Which type for null flavor?
erich: I recommend we use dicom null flavors.
detlef: Agreed.
detlef: json null is used when a number is missing. … In other contexts, an empty string is used.
dbooth: Do we want to use a separate empty-string-null vs a numeric-null?
eric: If we don't distinguish between them, it might make round tripping harder. … If the spec is complete enough to know the types of all properties, and no property that allows both numeric and string, then we could use that for round tripping.
erich: The dicom:vr is always there to indicate the type … Option 5 may work better for the general case, in dealing w that. … Certain properties are required, so empty string is used if you don't have a value.
eric: Suspect we need to do a survey of the VRs … Could decide what we want as a null standoff. … Then postpone the question of whether to use a single top null type vs two. … Then do a survey on the VR types. If you could do it over, would you use empty string for null? … And can we use microparsing to reconstruct the empty string or numeric null?
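Illustration of round tripping with a single null (not presented at the meeting): if every property's VR is known, one null marker in RDF could be written back as JSON null for numeric VRs and as an empty string otherwise, following detlef's description. The minimal Python sketch below assumes exactly that rule. The `NUMERIC_VRS` set is an illustrative subset, not the result of the VR survey, and `NULL` is a made-up in-memory marker.

```python
import json

# Illustrative subset of numeric VRs (an assumption, not a survey result).
NUMERIC_VRS = {"DS", "IS", "FL", "FD", "SL", "SS", "UL", "US"}

NULL = object()  # single in-memory null marker, whatever RDF form is chosen

def to_json_value(vr, value):
    """Write the single null back as JSON null or "" depending on the VR."""
    if value is NULL:
        return None if vr in NUMERIC_VRS else ""
    return value

def from_json_value(vr, value):
    """Map JSON null (numeric VRs) or "" (other VRs) to the single null."""
    if value is None and vr in NUMERIC_VRS:
        return NULL
    if value == "" and vr not in NUMERIC_VRS:
        return NULL
    return value

print(json.dumps([to_json_value("DS", NULL), to_json_value("LO", NULL)]))
```

This only works if no property allows both numeric and string values, which is the condition eric raises.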
ACTION: Erich to look at VR types to see how they map to xsd datatypes
gaurav: No movement yet on HL7 PR. … We cannot update Naming System properly, because that's not yet in THO. But we can update just the code system, so to push through changes … We could start putting prefixes onto code systems (instead of naming systems). … We could then make sure our code works w code systems.
dbooth: Sounds good to me.
eric: Wouldn't we be painting ourselves into a corner?
gaurav: I don't think so. We can do one but have to do the naming system later. … My understanding is that a code system has an explicit list of codes. … But Naming System is broader. E.g., US passport holders would be a Naming System because it's continuously changing.
gaurav: Also looking for other candidates for what to target next. Radlex publishes an OWL ont w resource URLs. … To what extent should we push them? It's their official OWL file.
dbooth: I would view the OWL files as the de facto spec for the IRI stem. If they later give better documentation, then we can point to the new doc instead.
DICOM
erich: The DICOM JSON has a datatype that won't necessarily work: SV is "Signed binary integer 64 bits long" but JSON wouldn't be able to handle it.
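Illustration of the 64-bit problem (not presented at the meeting): JSON text can carry arbitrarily large integers, but JavaScript and many other parsers read numbers as IEEE 754 doubles, which are exact only up to 2^53 - 1. The minimal Python sketch below shows the largest SV value changing when it passes through a double. The string fallback in `encode_sv` is one possible workaround and is an assumption, not a decision of the group.

```python
import json

SV_MAX = 2**63 - 1               # largest signed 64-bit integer
JS_MAX_SAFE_INTEGER = 2**53 - 1  # largest integer a double represents exactly and safely

def survives_double(n):
    """True if n is unchanged after passing through an IEEE 754 double."""
    return int(float(n)) == n

def encode_sv(n):
    """Keep safe values as JSON numbers; fall back to a string otherwise."""
    return n if abs(n) <= JS_MAX_SAFE_INTEGER else str(n)

print(survives_double(JS_MAX_SAFE_INTEGER), survives_double(SV_MAX))
print(json.dumps(encode_sv(SV_MAX)))
```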
erich: In DICOM we only need a dicom:null, not null flavors.
detlef: Should we use xsd datatypes or DICOM types? … Talked w colleague who came up with option 9. He likes dicom datatypes very much. … You could have custom types that are subtypes of xsd:string … They behave like string. … If they are actually numbers, you have to cast them to add them. … But the backend uses SQL92 … Could he create a schema to make to XML
erich: I like dicom datatypes, because it keeps aligned w dicom. … But xsd are better for SPARQL … Also times need different cases converted to xsd. … XSD types make it easier to use in SPARQL, though the custom types may be better for data transfer to others.
detlef: Some want to interpret the data without the schema. Maybe have VR optionally included?
erich: Like about 5: Similar to DICOM JSON. But don't like that it creates a lot of extra triples I don't use. 9c makes efficient SPARQL. I'd favor 9c and 10.
detlef: Problematic: There are a few instances in which the byte order is determined by the VR, and would not be round trippable back from XSD to DICOM without knowing the original VR.
erich: What about canonicalizing the endianness in converting to RDF?
dbooth: I like that idea
erich: That would enable XSD types throughout … Then the VR type wouldn't be needed in the data. But still keep it in the ontology.
detlef: What about private data elements? … No ontology for them
dbooth: How about keeping the VR type for private data elements?
detlef: They have a complicated encoding.
dbooth: Agree with Option 10?
AGREED: Choosing option 10 for official DICOM tags
AGREED: And standardize endianness during conversion to RDF
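Illustration of standardizing endianness during conversion (not presented at the meeting): a minimal Python sketch that decodes a buffer of unsigned 16-bit words using the source byte order and emits plain decimal strings, so the RDF literal no longer depends on how the bytes were stored. The function name and the choice of 16-bit words are hypothetical; the minutes do not say which elements are affected.

```python
import struct

def canonical_uint16_values(raw, source_is_big_endian):
    """Decode unsigned 16-bit words and return byte-order-free decimal strings."""
    if len(raw) % 2:
        raise ValueError("expected an even number of bytes")
    order = ">" if source_is_big_endian else "<"
    fmt = order + "H" * (len(raw) // 2)
    return [str(v) for v in struct.unpack(fmt, raw)]

# The same value 2 stored in two byte orders yields the same literal text.
print(canonical_uint16_values(b"\x02\x00", source_is_big_endian=False))
print(canonical_uint16_values(b"\x00\x02", source_is_big_endian=True))
```

As detlef notes, going back from the decimal form to the original bytes still needs the original VR or byte order from somewhere else.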
ACTION: Erich to create issue for private data elements
erich: What about the <> ? We should say something about what that IRI should be. … I use the file URI for it.
ACTION: Erich to create an issue for what IRI to use instead of <> in option 10.
David Booth, Detlef Grittner, Erich Bremer, EricP, Gaurav Vaidya
Regrets
-
Chair
David Booth
Scribe
dbooth
Meeting minutes
DICOM
DBooth: Could have optional VR for all elements, but require the VR if needed to disambiguate.
ericP: Could push the private stuff off into another bit of graph, saying that the value of the prop is a tuple, that is the type and the list of values (w/o standoffs). That would factor out the types. … But can only do that w homogeneous lists. … or go directly into the list from the prop, but also saying which elements are which types. Stating that as a separate statement about the property. … That would assume homogeneity throughout the doc.
detlef: I think items in a list must have the same VR.
erich: In DICOM lists they are homogeneous.
detlef: There's a different construct for sequences, that can be heterogeneous.
detlef: Property range cannot change within a document.
ericP: Goal of factoring is to make the queries easier.
dbooth: Should we make VRs optional for properties that can be uniquely determined from the schema, and required for those that need disambiguation?
erich: One of the first things I needed was to move everything to native xsd types.
dbooth: For transport, if the VRs are kept, but then discarded locally when used (if not needed), would that be a good approach?
detlef: That's what's normally done.
ericP: If you have the optional VR triples in a separate file, then you can just concatenate files if you want them.
dbooth: How would someone partition their triples that way?
ericP: They'd already have to know which private data elements they're using.
erich: DICOM versions change. Need to retain the version in RDF.
dbooth: What about including all the VRs in the transport format RDF, but local users can toss out VRs they don't need.
ericP: Beginning of the instance data could say what schema instance (with version) they are using.
dbooth: But what about having the instance data (for transport) include all the VR codes, with a standoff, and then local user could discard the VR codes they don't need?
erich: Would be a simple sparql query to strip off the VR code and extra standoff.
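Illustration of stripping the VR code and standoff (not presented at the meeting): erich describes this as a simple SPARQL query; the minimal sketch below does the equivalent over plain Python tuples. The shape it assumes (a standoff node carrying `dcm:vr` and `rdf:value`) and both predicate names are hypothetical stand-ins for whatever the transport format ends up using.

```python
def strip_vr_standoff(triples):
    """Rewrite `s p node . node dcm:vr X ; rdf:value v .` as `s p v .`"""
    value_of = {s: o for s, p, o in triples if p == "rdf:value"}
    has_vr = {s for s, p, o in triples if p == "dcm:vr"}
    standoffs = has_vr & value_of.keys()
    out = []
    for s, p, o in triples:
        if s in standoffs and p in ("dcm:vr", "rdf:value"):
            continue  # drop the standoff's own VR and value triples
        # Point the original property straight at the value.
        out.append((s, p, value_of[o] if o in standoffs else o))
    return out

transport = [
    ("ex:img", "dcm:Rows", "_:r"),
    ("_:r", "dcm:vr", "US"),
    ("_:r", "rdf:value", 512),
]
print(strip_vr_standoff(transport))
```

Standoff nodes that lack either triple are left untouched, so elements whose VR is still needed can be kept by not matching them.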
gaurav: Noticed DICOM discussion on zulip. Trying to figure out how to map DICOM element codes to URLs.
erich: There was some RDF and/or IRI work done previously, that still seems to be maintained.
The comment in the official RDF DICOM OWL file: "DICOM PS3.16 DCMR Annex D DICOM Controlled Terminology Definitions; converted by 'extractdcmdefinitionsasowl.xsl'"
detlef: Last week we discussed whether we should use RDF lists. Should we make an issue for it? Erich: Yes.
ACTION: Detlef to make an issue on whether to use RDF lists in DICOM
erich: Most of the poly stuff maps into a well-known text string (using the latest), but that isn't necessarily performant. … Nor would binary necessarily be performant, because nothing is indexed.
erich: Torn between wanting it performant vs easier for exchange.
dbooth: I would lean toward ease of exchange, for standards purposes, since different use cases will usually need to tune their own data for performance anyway.
erich: How far do we want to pull DICOM into the RDF world?
dbooth: channeling EricP, I think he would argue for pulling toward RDF.
detlef: Web services don't look like DICOM at all. … One advantage of RDF is that you have not only transport format, but a format you can store and merge. … Should be a goal. … We had a problem of how to make private data elements unique, so they don't clash. … For ER diagrams, they require that the elements be consistent across multiple files. … If you merge them, they all need to go into one instance. … There's one concept that has a hierarchical structure. E.g., if you get a DVD. … One is directory of what is on that disc. … They use a hierarchical structure for that. … DICOM should be familiar concept. But hard to transport hierarchy in RDF. … In RDF we could do it with OIDs.
erich: The OID could be the DICOM OID instance, which refers to the file you're trying to represent.
detlef: A typical DICOM also contains others, at least 3 OIDs.
erich: I went that route with COVID images, and it went nicely. … You can reconstitute the hierarchy simply by merging the data.
erich: In the SPARQL WG, someone mentioned undef in a list. Should the SPARQL group own the whole undef problem?
David Booth, Detlef Grittner, Erich Bremer, EricP (last 15 minutes), Jim Balhoff
Regrets
-
Chair
David Booth
Scribe
dbooth
Meeting minutes
Next week
dbooth: I'll be out. Erich Bremer offered to chair.
DICOM
erich: Discussion about getting a namespace for mapping DICOM enums to URIs. Grahame enquired from David Clooney.
detlef: There's an ont that defines these URIs, though a few were not included. … Can use the official DICOM namespace? Not possible. … Some are describing units. … Another list is of clinical terms. … E.g., "US" means two different things, so they cannot use the same URI. … Need to stick to what DICOM has defined. If they defined URI we can use them, otherwise we use a string or your own URIs. … A few people think that David Clooney is right, and they'll have a hard time refactoring what DICOM has provided. … Clooney thinks the enums cannot just be turned into URIs.
erich: How much do we RDF-ize things?
erich: RDF work that was done on DICOM seems to be that one section, and then they stopped.
ACTION: Erich to email DBooth DICOM ont link, and DBooth to intro Dave Beckett
erich: Also David Clooney thinks if there isn't DICOM community push, nothing will happen. … Clooney shared his code w me. … I'm all for pushing this forward. … There was a group in California doing early DICOM work, and David Clooney was a part of it, but it stopped. … DICOM ont on bioportal keeps getting downloaded.
detlef: RDF lists are a problem in SPARQL for retaining the order of polygon segments.
erich: But they could now be represented as well-known text strings. … But they can't be indexed well.
erich: There's a binary version of well-known text, more array-like. … And could be sent in RDF. … But the problem: Unless your triplestore knows how to do spatial indexing, you won't get the performance. … I developed my own way of dealing w things. I sit on an image repo committee. Got invited w NIH person, and David Clooney sits on these mtgs. … My point: Although nice to do polygons in DICOM, they didn't think about spatial indexing. I convert them to Hilbert polygons, which allows my SPARQL to go quickly. … Invited to talk about the spatial indexing problem.
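Illustration of Hilbert-based spatial indexing (not presented at the meeting): the minutes do not describe how erich's conversion works, so the minimal Python sketch below only shows the classic building block, mapping a grid cell (x, y) to its position along a Hilbert curve. Cells that are close on the curve are close in space, which is what lets a one-dimensional index serve spatial lookups. The function name and grid size are assumptions.

```python
def hilbert_index(n, x, y):
    """Position of cell (x, y) along the Hilbert curve over an n x n grid.

    n must be a power of two; x and y must be in range(n).
    """
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if (x & s) else 0
        ry = 1 if (y & s) else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate or flip the quadrant so the sub-curve has standard orientation.
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s //= 2
    return d

# Order the cells of a 4 x 4 grid by their Hilbert position.
cells = sorted(((x, y) for x in range(4) for y in range(4)),
               key=lambda c: hilbert_index(4, *c))
print(cells[:4])
```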
dbooth: DICOM should deal w that.
erich: Would love a simple transfer syntax, but that won't be sufficient for efficient query.
erich: Re polygons, how about using well-known text?
detlef: It's an option.
erich: There is interest in the geo SPARQL group. … Jena and Virtuoso have spatial indexing. Well-known text might be a better choice than other things. … There's also well-known binary. … Open Spatial added curved polygon. … Seems like there is a well-known rep of each of the polygon types. … Something like HDT binary RDF, when you store binhex strings, it stores it as a string, not binary. … Will systems actually take advantage of the binary?
detlef: What if it's bitmaps instead of polygons? Lots are just huge blobs. … Gray scale values.
erich: Is it in RDF as a string, or as a file reference?
detlef: I prefer external http reference. … A JSON format offers both. … That one uses binary data. What's the point of storing it as a string in RDF? … The point of RDF is to link w other data.
erich: Have you looked at complex datatypes work in RDF? Might be a way to reference it.
detlef: Available yet?
erich: It's a proposal from Amazon group. … I'd focus on today's tech.
erich: Can the sys take advantage of the data? Does well-known text help? … What about raster data? … But RDF lists will not be a useful form for it.
erich: In my processing, want to annotate images and link them spatially w other data.
dbooth: image processing is generally very specialized. Most RDF linkage is of the metadata for the image.
eric: Agree. … Can we come up w concrete use cases we care about? I think we don't want the binary image data in RDF.
erich: Curious what a DICOM server does when asked for raster data?
detlef: I think there's a header for returning binary.
David Booth, Detlef Grittner, Erich Bremer, Jim Balhoff
Regrets
-
Chair
David Booth
Scribe
dbooth
Meeting minutes
DICOM
erich: Virtuoso can key off of the datatype string for polygons, so we don't need to use an RDF list.
erich: Support for geosparql is the most of what we can expect for support. … There's an add-on for jena tdb also. … I use this well-known string approach along with intersecting with Hilbert. … Choice of well-known text means there's already a lot of support.
detlef: Oracle claims support and also StarDog.
erich: GeoSparql work is still going on. … Seems like the best choice to me. … Want to try enumerating which tags contain polygon data, to convert from binary to well-known text. … It's a custom datatype that says it's formatted as a well-known text. … Spatial indexing question will arise if we bring it to the DICOM group. They are now looking at that issue.
erich: Planning on working on that next. … David Clooney said not to use dicom namespace, without a process.
detlef: Sounds like a good approach. Could go forward w that. … Need to convince DICOM group. … But they're looking for an abstract data rep -- could put arbitrary things in. … Electrocardiogram is a point also. … I think that works.
erich: You can go backward also.
detlef: Would be amazing to be able to find max X in a sparql query.
erich: Re FHIR group, they want to RDF-ize some DICOM terms. Why, if you can just use DICOM text strings?
ACTION: DBooth to put Erich in touch w Dave Beckett.
Errors in FHIR RDF example
ACTION: Jim to compare FHIR RDF diffs to see if Tim Prudhomme's PRs are good to merge
jim: SPARQL CDT proposal from Amazon was submitted to the SPARQL group.
ericp: Did some GPT experiments for composing queries given ShEX … ChatGPT … Llama 3.1 didn't do well.
erich: If you run it through Ollama, the context window is tiny.
ericP: I trimmed the shex first, to only the part needed. … But I was able to do some experiments.
erich: I'm using Ollama … I'm testing memory usage. I'm only running the 8-billion-parameter one. … Not surprised if ChatGPT does better, w a bigger model. … I'll see if we can give you access to our bigger machine.
DICOM
erich: Wanted to try adapting code, instead of RDF lists, to GeoSPARQL well-known text string. But didn't find sample data. … Looking in my data for sample.
ACTION: Detlef to see if he has some sample data having polygons
erich: Talking w Dave Beckett could help us close out some issues. … They used OWL. Would be good to write OWL for what we're doing
ericP: If you're maintaining the current plan, that would obviate my attempt at a meta issue.
ACTION: Erich Bremer to create issue for writing OWL, and also SHACL. Starting to learn ShEx.
DBooth: What's going to be our process for tying our DICOM RDF work back to FHIR RDF work?
ericp: Might demo that we can move data from one form to another.
dbooth: That would demo the value of RDF in a DICOM context
erich: I'll make my code public.
dbooth: How would DICOM RDF be connected to FHIR RDF data?
erich: FHIR RDF would reference the DICOM?
ericP: ImagingStudy resource?
erich: They're grouped and connected back to a parent Study
detlef: You define a Study, make a request to DICOM, and modality starts a series. … That needs StudyInstance ID. That should be the link.
ericp: Would FHIR also address into these DICOM things, so you're not pointing at the whole blob?
detlef: I don't think that can be done w the old approach. Could only be done w DICOM RDF approach.
ericp: Wondering if StudyInstance can point to something more specific in a DICOM report.
erich: There is stuff in DICOM that allows those annotations. … When spatial indexing is turned on, in Virtuoso, the geosparql well-known text takes advantage of it. … It's the same argument for pushing everything into XSD types, so SPARQL stores can use them. Doing the same thing for spatial data.
ericP: Would be fun to add that to QLever … If you want to point to a piece of a giant scan, do you abstract that portion first?
detlef: You can specify delineations … And you can do a structured report, diagnosis in a radiology image. Making statements about it. … These DICOM structures have their own modalities and their own instances.
ericP: Is that a good use case for us to chase?
detlef: I think so.
dbooth: Need to make sure we have the ability to do that kind of referencing in FHIR?
ACTION: ericP to check whether FHIR supports appropriate referencing of portions of a DICOM document
erich: It might change depending on the modality … You have xray type, CAT scan type, etc. Then within that you now have pathology, whole site imaging. They're single layer images. As this is normalized in RDF, how does it play together?
ericP: And if there's a tension between putting stuff in FHIR vs DICOM … DICOM folks have been pushing for adding pathology info. Annotations are more numerous w whole site images. … David Clooney has a paper on having both TIF and DICOM together. … But the tooling to look at whole slide imaging is immature. … TCI group is very interested in DICOM and RDF. … Cancer Imaging Warehouse -- lots of de-identified data available to use. … I'm nudging them toward RDF, to deal with disparate datasets. … DICOM meets continuously, and releases a new version every quarter.
David Booth, Detlef Grittner, Erich Bremer, EricP, Jim Balhoff
Regrets
-
Chair
David Booth
Scribe
dbooth
Meeting minutes
DICOM
erich: Snippet of converted file: w3c/hcls-fhir-rdf#149 … Will test this on a bunch of data
dbooth: they have the datatype indicated, like "POLYGON .... ))"^^geo:wktLiteral … How easy is it to convert from DICOM JSON to this RDF format?
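Illustration of producing such a literal (not presented at the meeting): a minimal Python sketch that turns a flat x, y coordinate array, similar in spirit to what DICOM JSON carries, into a WKT POLYGON typed as geo:wktLiteral. The function names are hypothetical, and erich's converter may work quite differently.

```python
WKT_LITERAL = "http://www.opengis.net/ont/geosparql#wktLiteral"

def polygon_wkt(flat_xy):
    """Turn [x0, y0, x1, y1, ...] into a WKT POLYGON with a closed outer ring."""
    if len(flat_xy) % 2 or len(flat_xy) < 6:
        raise ValueError("need at least three x,y pairs")
    points = list(zip(flat_xy[0::2], flat_xy[1::2]))
    if points[0] != points[-1]:
        points.append(points[0])  # WKT rings repeat the first point at the end
    ring = ", ".join(f"{x} {y}" for x, y in points)
    return f"POLYGON (({ring}))"

def wkt_literal(wkt):
    """Wrap WKT text as a typed literal in Turtle/N-Triples syntax."""
    return f'"{wkt}"^^<{WKT_LITERAL}>'

print(wkt_literal(polygon_wkt([0, 0, 10, 0, 10, 10])))
```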
erich: I convert the files directly -- similar to DICOM JSON, but conversion to xsd literals. … Also doing SPARQL Updates on the models I have in memory. Wrote my own magic property in jena to do it. … My program has an option called LongForm, which does bare minimum conversion to RDF
detlef: The conversion is pretty simple. … Using an ontology, you know it is a polygon and you can use that format.
ericp: I'm down with that.
erich: Now converting 300k files. Will load them into virtuoso and see if this data element is in there. … About 20B triples. … And I'll enable geosparql and see if it really works. … Mostly concerned about pathology, 1-2M polygons in one image. … Lots of labeling w deep learning pipeline. Want to see all tumor infiltrating sites within x distance of something. … Want to see that across the entire collection, for the pathologist to look at it. That will exercise the index.
ericp: Doing Mondo DB stuff, when does it become available for query after loading?
erich: IDK. Using our own sys for annotations and reviewing result sets. … Want to do spatial queries across multiple pipelines -- linking them all together … RDF is a great way to join all that data.
detlef: Individual items are descriptions of a region of interest. We wanted to add data there. 3 choices: 1. Add DICOM attribute if there's an official one, and you're altering the original file. Need to track the modifications in another attribute. … Option 2. Like a relational DB approach, foreign linking. Ref the number of that region in the other file. … Option 3. Direct linking approach, that requires a different structure -- not an RDF list.
dbooth: Direct linking == by a URI? Detlef: Yes.
detlef: Cannot do that w bnodes in an RDF list.
erich: Could use URIs in the RDF list. … Skolemized, or URNs for example
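Illustration of the Skolemization suggestion (not presented at the meeting): a minimal Python sketch that replaces each bnode label with an IRI under a `/.well-known/genid/` path, the pattern RDF 1.1 recommends for Skolem IRIs, so that list items can be linked to directly. The base `https://example.org/` and the `ex:` names are placeholders.

```python
def skolemize(triples, base="https://example.org/.well-known/genid/"):
    """Replace every bnode label `_:x` with the IRI `<base + x>`."""
    def sk(term):
        if isinstance(term, str) and term.startswith("_:"):
            return f"<{base}{term[2:]}>"
        return term
    return [tuple(sk(t) for t in triple) for triple in triples]

regions = [
    ("ex:annotations", "rdf:first", "_:roi1"),
    ("_:roi1", "ex:label", "region of interest 1"),
]
for triple in skolemize(regions):
    print(triple)
```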
ericP: How do JSON users access these things? Just go to the nth element?
Connecting DICOM RDF to FHIR RDF
dbooth: How do we write this up in the context of FHIR RDF work?
erich: Will see what work is already being done on DICOM in FHIR
dbooth: I haven't heard back from Dave Beckett.
erich: Reached out to Sean Bechhofer. Haven't heard back yet.
FHIR RDF update for HL7
dbooth: Need to do a semi-annual update to the ITS group
ACTION: EricP to run shex validation against new FHIR RDF examples
jim: I'll work on it in the next 6 weeks.
DICOM
erich: Played with Virtuoso geosparql. Started throwing errors when I tried my data. Think it is related to default coord sys. … Their flavor of well-known text is extended, but not part of the standard. … I reached out to address that. … Didn't see Blazegraph support, but it's in the code. … Updated issue 149. … Trying to demonstrate that it works in practice.
rob: Suggest Kevin ODonnel, interested in DICOM … About representations of imaging
eric: Do people mostly address these things by nth element of an array? … Is that good enough? … How do people make references to these things?
erich: In DICOM JSON it will be xy coordinates in an array. … But when you put it into an RDF list it isn't performant.
ericp: Depends on language binding. … In n3 I can look at it as a first-rest ladder. Or I can get it as an array in JSON. … If I use a python lib, I could ask for the RDF graph, but I can get the lists in a special object for lists, which allow indexing.
ericp: Two issues: 1. Can you efficiently get to nth element of a list. 2. Whether you have geosparql indexes. … The reason RDF is not performant is because there is no hint to use geosparql indexes on it.
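A rough illustration of the first issue, assuming a plain-Python model of an RDF list as a first-rest ladder (the dict layout and the `_:b…` node names are made up for this sketch, not any library's API). Reaching the nth element means following n rest links, whereas a binding that materializes the list as an array can index it directly.

```python
RDF_NIL = "rdf:nil"

def build_ladder(values):
    """Return (head, nodes); nodes maps each list cell to its first/rest pair."""
    nodes = {}
    head = RDF_NIL
    # Build back to front so each cell can point at the one after it.
    for i in range(len(values) - 1, -1, -1):
        cell = f"_:b{i}"
        nodes[cell] = {"first": values[i], "rest": head}
        head = cell
    return head, nodes

def nth_via_ladder(head, nodes, n):
    """Follow n rest links, then read first. The cost grows linearly with n."""
    cell = head
    for _ in range(n):
        if cell == RDF_NIL:
            raise IndexError(n)
        cell = nodes[cell]["rest"]
    if cell == RDF_NIL:
        raise IndexError(n)
    return nodes[cell]["first"]

def ladder_to_array(head, nodes):
    """Walk the ladder once and return a Python list that supports direct indexing."""
    out = []
    cell = head
    while cell != RDF_NIL:
        out.append(nodes[cell]["first"])
        cell = nodes[cell]["rest"]
    return out

coords = [10.0, 20.0, 30.0, 40.0]
head, nodes = build_ladder(coords)
print(nth_via_ladder(head, nodes, 2))   # walks two rest links first
print(ladder_to_array(head, nodes)[2])  # one walk up front, then direct indexing
```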
erich: They've added well-known text. Could add support for RDF lists.
ericp: Would be interesting to add that feature to communica.
erich: If it was added to geosparql it would help, but would need to be added to jena, etc.
jim: I'm currently adding some features to Communica
ericp: Dependency injection is complex
jim: I added the path querying from stardog to Communica, but trying to figure out how to deploy it.
jim: It was created as a framework for research -- very modular. … Also geared toward dynamic queries. … Can access other SPARQL servers, files, etc. all at once, and does the joining on client side.
dbooth: Federation framework? Jim: yes
detlef: I like the simplicity of well-known text. … Doesn't depend on other features. … I can literally grab the text and parse it in JS. … If you have geosparql plugged in then you have more possibilities.
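To illustrate detlef's point that well-known text can be parsed with nothing but string handling, here is a minimal sketch in Python (detlef mentions JS; the idea is the same). It assumes a single-ring `POLYGON` with 2D coordinates and no holes, and it does not handle the optional CRS IRI prefix that a GeoSPARQL wktLiteral may carry. The function name is hypothetical.

```python
def parse_wkt_polygon(wkt):
    """Parse 'POLYGON ((x y, x y, ...))' into a list of (x, y) float tuples."""
    text = wkt.strip()
    keyword = "POLYGON"
    if not text.upper().startswith(keyword):
        raise ValueError("not a POLYGON")
    body = text[len(keyword):].strip()
    if not (body.startswith("((") and body.endswith("))")):
        raise ValueError("expected a single ring in double parentheses")
    ring = body[2:-2]
    if "(" in ring or ")" in ring:
        raise ValueError("polygons with holes are not supported in this sketch")
    points = []
    for pair in ring.split(","):
        x, y = pair.split()  # raises ValueError if a point is not exactly 'x y'
        points.append((float(x), float(y)))
    return points

outline = parse_wkt_polygon("POLYGON ((0 0, 4 0, 4 3, 0 0))")
print(outline)
```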
erich: Might be good to suggest RDF list support to future version, but need to use current stds now.
Diffs of FHIR Examples from Tim Prudhomme's PR
Ericp: line 266 in new (empty fhir:binding ) is not in the old.
[DBooth opinion: they were verbose and not very useful, IMO]
Concept IRIs
Gaurav: HL7 group approved the MeSH IRI Stem. They closed the existing issue and made a new issue for implementing it. Next working on an IRI stem proposal that will bundle several IRI stem proposals into one consolidated proposal.
AGREED: We'll continue working on DICOM RDF, and bring it to the DICOM org when we feel like it is mature enough. Hopefully then an official DICOM RDF group could be created, and they could agree to allocate a DICOM IRI stem.
<scribe> ACTION: Eric Bremer to find out what other HL7 groups are doing regarding connecting DICOM to FHIR before the November 6th meeting.
DICOM issue 152: Handling malformed DICOM files
https://github.com/w3c/hcls-fhir-rdf/issues/152
AGREED: Closing as out of scope
Upcoming ITS meeting
I'll be giving an update on our FHIR RDF work to our sponsoring supergroup at HL7, ITS, at 3pm Nov 6. All are invited to join if available, though not crucial.
<scribe> ACTION: David to prepare slides summarizing the group's current work for the upcoming HL7 ITS meeting on November 6th.
[NEW] ACTION: David to prepare slides summarizing the group's current work for the upcoming HL7 ITS meeting on November 6th.
[NEW] ACTION: Eric Bremer to find out what other HL7 groups are doing regarding connecting DICOM to FHIR before the November 6th meeting.
[NEW] ACTION: Eric Prud'hommeaux to share tutorials on FHIR and RDF at the next meeting.
[End of minutes]
Minutes formatted by David Booth's
scribe.perl version 1.133 (CVS log)
jim: I'll have time to work on this after Nov 15, after ISWC in Baltimore. … But only going to the workshop part.
tim: Want our changes to be in R6. Looks like not many people are using R5, but they expect R6 to be in use a long time. … US Core Implementation Guide will skip R5.
jim: In the prov ont, prov:wasDerivedFrom is an object property … So this would involve punning. … But it's okay if you import prov, or declare the type of this property. If you open this in Protege, you wouldn't see this relation.
tim: Maybe use a dublin core property instead?
tim: How about rdfs:isDefinedBy ?
dbooth: Sounds appropriate to me.
gaurav: In the interest of more flexibility, I prefer dc:source, but ok w rdfs:isDefinedBy also
AGREED: Change from prov:wasDerivedFrom to rdfs:isDefinedBy
gaurav: Grahame disagrees with the proposed change. Wait and see.
... On vac the next 4 weeks.
... Back 5 wks from now.
DICOM
erich: Spoke to Lawrence Tarbucks (sp?) who is also in WG20 (DICOM)
... That's the group that is running conjointly w the FHIR imaging group.
... This is a long-standing friendly relationship. He made multiple suggestions for getting DICOM to take this on.
... DICOM likes things done in certain ways. Suggested presenting to WG20, though it might move to another group after.
... i.e., the strategic WG.
... They aren't keen on receiving a complete solution, but want to be involved in the process.
... Get their buy-in sooner.
... Wants me to do something before Nov 11, to speak about it.
... I'll contact the secretary for WG20. Their minutes are not up to date.
dbooth: Sounds like a great development. Suggest preparing only one slide, to start the conversation.
erich: I think my dept is a member. Want to make a good first impression, but very short notice and busy the next few days.
... Want to get a critical mass on it.
dbooth: If not Nov 11, when would be next opportunity?
erich: Prob after the holidays. Also want to do more recon in advance.
... I know Larry already.
... He's co-chair for WG23, which is AI group
detlef: What would convince them to adopt RDF?
erich: I've been hearing that end users spend a lot of time homogenizing and ETL-ing data before they can use the data. There's an effort for common vocabs. This is a use case for RDF.
... You want to connect the information to clinical info. how? Allow the data to connect.
detlef: If you want to build out a KG you'll need a lot of converters.
... RDF would help that.
... In my case, linking ICD10 codes
erich: Also, when people publish anything, how do you find the value? Look at citations. Want to include the publication data.
... To link it all together, there's so much effort in connecting it.
... In RDF you can bring that data together a lot faster.
detlef: Also, their data is hierarchical, which doesn't fit well with relational.
... And you can link directly from one doc to another.
... RDF is much better choice than JSON, for example.
erich: Gen vast amounts of data from deep learning pipelines. You can try to put that in JSON in MongoDB, but it's tricky to deal with at that scale.
... Want to pull out features at this scale, and search it. That's my argument for geosparql.
... Good luck w trying to do this w MongoDB!
... Throwing it all in a triplestore is hard at scale.
... You end up with an n log n speed problem.
... I split it apart into independently indexed files. But RDF allows me to link it back together quickly.
... Wouldn't try to attempt this w a relational DB.
... I'm implementing it.
... DICOM added polygon features, but they left out spatial indexing.
... A POC is not enough. Need to handle it at scale.
... This is why DICOM should use RDF: allows all the pieces to be brought together.
DICOM arrays
https://github.com/w3c/hcls-fhir-rdf/issues/149
erich: I think geospatial is the way to go.
... There's "grid spacing", but no clear guidance on what the grid is
... Sometimes the x and y spacing differs.
... They're proposing an RDF way to specify a spacing. I want to be sure that x and y spacing can differ.
... E.g. 0.43 microns per pixel.
... Still trying to figure out how to do that.
... Virtuoso is still at geosparql 1.0, and Jena is also.
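A small sketch of why separate x and y spacing matters, assuming pixel coordinates are converted to microns by a per-axis scale factor. The function and parameter names are hypothetical. The 0.43 microns per pixel figure is erich's example; the 0.50 value for y is invented here only to show the two axes differing.

```python
def pixels_to_microns(points_px, spacing_x_um, spacing_y_um):
    """Scale (x, y) pixel coordinates to microns with independent per-axis spacing."""
    return [(x * spacing_x_um, y * spacing_y_um) for x, y in points_px]

# A made-up polygon outline in pixel coordinates.
outline_px = [(0, 0), (100, 0), (100, 200), (0, 0)]

# x spacing from the discussion; y spacing is an invented value for illustration.
outline_um = pixels_to_microns(outline_px, spacing_x_um=0.43, spacing_y_um=0.50)
print(outline_um)
```

With a single shared spacing value, the y coordinates of this outline would be scaled incorrectly, which changes distances and areas computed from it.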
CURIEs in codes
https://github.com/w3c/hcls-fhir-rdf/issues/127
dbooth: Discussed w ITS group yesterday, and they agreed to say "don't put the prefix in the code", but pushed back on also saying "but accept the prefix if it's in there".
... So I took out the part about "but accept the prefix if it's in there".
... The ITS group voted to accept the amended proposal.
erich: Connected with Lawrence Tarbucks in DICOM group. … When you want to connect DICOM to other data, then RDF is helpful. … Want to understand the DICOM working groups more, before asking to do an RDF DICOM. … Should have an RDF OMOP.
ericp: I think there was a project to make a mapping from FHIR to OMOP, defined in spreadsheets.
rob: Also one of the HL7 Imaging Integration WG co-chairs is Brian Bialecki
Kevin Donnelly isn’t a co-chair (at least not currently), but he is definitely active with HL7 and DICOM and he is the main one that I’ve been in contact with (but I have been on some meetings also with Brian).
erich: Would be nice to have DICOM JSON-LD.
Polymorphic properties
dbooth: This is a non-substantive change, right?
AGREED: Non-substantive change, just adding more examples.
ericp: We don't need the "A great deal of effort has gone into unifying FHIR property names across resources and datatypes." sentence. … Suggest verbiage: FHIR R5 introduced unified property names across resources. … Also suggest changing wording to say "reusable and polymorphic".
AGREED: To dbooth's proposed wording, subject to the above editorial changes.
ADJOURNED
Minutes manually created (not a transcript), formatted by scribe.perl version 238 (Fri Oct 18 20:51:13 2024 UTC).
detlef: Suggest having instance IRI in the file. … The DICOM dir has relative path of files. … If we had a URI for each entity, then you can merge them easily. … I want this ability, regardless of what we use for subject URI.
erich: I like this idea … of the DICOM directory
ACTION: Detlef to share an example in the issue
erich: dcm3j has the ability to do the DICOM dir … Keeping metadata on the files, want to think about how to use linked web storage … If there's a concept of a DICOM dir, it might work well w linked web storage. … When you do a GET on SOLID storage for a particular container, I see DICOM files, and I GET those … And it could give the RDF conversion of the image as an enrichment.
ericp: You conneg to RDF
erich: If there's an official RDF DICOM representation, then linked web storage gives a way to get to it.
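A minimal sketch of the content negotiation ericp mentions, using Python's standard `urllib`. The URL is a placeholder, and whether any storage server would return an RDF conversion for a DICOM resource is an assumption of this sketch, not something the group has confirmed. The request is only constructed here, not sent.

```python
from urllib.request import Request

def rdf_request(url, media_type="text/turtle"):
    """Build a GET request that asks the server for an RDF serialization."""
    return Request(url, headers={"Accept": media_type})

# Placeholder resource URL; a real one would come from the storage container listing.
req = rdf_request("https://storage.example/dicom/image1.dcm")
print(req.get_method(), req.full_url, req.get_header("Accept"))
```

A server that does not support the requested media type would typically answer with the original representation or a 406 status, so a client would need to check the response Content-Type before parsing.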
ACTION: Erich to look into DICOM directories
How best to represent DICOM lists with missing elements, in FHIR RDF?