The scope and mission are determined by one of the goals of
the Healthcare and Life Sciences Special Interest Group (HCLSIG) specified in
the charter, viz., “Core vocabularies and ontologies
to support cross-community data integration and collaborative efforts”. The
main thrust of activities will be to develop best
practices for the definition, creation, evaluation and
maintenance of ontologies in the context of well-defined use cases that are
likely to be of interest to the broader HCLSIG community. Towards this end,
this group will collaborate with other working groups within HCLSIG and the
A set of use cases exemplifying the vision of the bench to
the bedside will be specified. A carefully selected subset of these use cases
will form the context for answering a set of questions that are likely to arise
in the mind of a healthcare practitioner or a life science or clinical
researcher who attempts to use ontologies and semantic web specifications
to address information and knowledge needs. These questions are:
1. What is an ontology? A very pragmatic definition is needed, one that
encompasses, among other things, terminologies (such as Snomed, GO) and
information models (such as HL7 RIM). A working definition with guidelines and
examples from various healthcare and life science applications needs to be
developed. The current definition of an ontology as enunciated by the W3C needs
to be examined and extended if required. Ontology as a model of use needs to be
emphasized in contrast to ontology as a model of meaning. The strategy will be
to assimilate current “ontology-like” artifacts and extend them to create
OWL-DL ontologies, demonstrating with use case examples the value achieved in
doing so.
2. What information should be represented in an ontology? The various
knowledge artifacts that could be represented using ontology-like artifacts
need to be enumerated. Candidates include terminologies such as Snomed and
Gene Ontology, genomic artifacts such as genes, variants and proteins, and
clinical artifacts such as clinical documentation templates and clinical
decision support rules. Ontologies can also be used to encode processes and
process models related to biological pathways, clinical care protocols,
clinical guidelines and web services annotation models. Other artifacts that need
to be designed and represented in an ontology could be namespaces, mappings of
ontological elements to underlying database schemas and other data structures,
and mappings across various identifier and value sets. Provenance information
about a knowledge artifact (such as “who”, “what”, “when”, etc.), versioning and
history information, and information about content dependencies could also be
captured in an ontology.
3. How should information be represented in an ontology? Candidate approaches
to representing information and knowledge in an ontology are based on
available standards such as RDF, OWL and SWRL. A set of best practice
guidelines needs to be identified for knowledge representation. Furthermore,
representing probabilistic knowledge is crucial in the HCLS areas. Sources of
uncertainty include: uncertainty in data (e.g., uncertainty in genotyping data
from the Affymetrix chip), uncertainty in evidence, uncertainty in hypotheses, and
quality/trust judgements (e.g., I trust HCM test results more from lab X than
from lab Y). Current standards (RDF/OWL/SWRL) need to be investigated to
determine whether they can support these requirements or whether the HCLSIG
should propose extensions to OWL/RDF.
4. How could ontologies be created? Collaborative approaches are needed to develop
ontologies with the involvement of subject matter experts, information
architects and modelers, and various application consumers (geneticists, clinicians).
The ontology created could be a by-product of performing a daily task (e.g.,
reporting on results of gene tests) and should have an immediate value (e.g.,
reporting templates). For instance, in the life sciences domain, the processes
of annotating data should be interleaved with the processes of creating the
ontology. In general, ontologies have been created in three ways: as part of a
social process involving a community effort or interested experts; through a
process in which the schema is designed by humans but the instances are
populated by automated and semi-automated techniques; and by automated means
through corpus analysis and subsequent curation. We may want to
identify successful cases of each of these. Distributed ontology development
encourages participation of domain experts. The resulting ontologies more
accurately reflect rich, well-contextualized knowledge, but this also increases
the challenge of global interoperability. This group should identify strategies
for ontology federation, including web-friendly mechanisms for cross-ontology
mapping, inferencing in the face of incomplete consistency, and distributed or
modular reasoning. A set of building blocks and templates for ontology building
that are specific to HCLS areas should be identified.
5. How should ontologies be accessed and used? Standards for accessing and
retrieving ontological information may need to be identified. There are
efforts in the healthcare informatics community to define web service standards
for accessing and manipulating terminological concepts. The suitability of this
standard for requirements of the HCLS areas could be examined and extensions
could be proposed. Ontology-based inference functionality that checks for
ontology consistency and subsumption knowledge
6. How should ontologies be maintained? Knowledge change and evolution is a
key issue in the HCLS areas. In particular, there is a need for the use of old data
against a new ontology and the use of new data against an ontology. As an
ontology evolves, so do the mappings of that ontology to the underlying database
schemas. Issues such as versioning,
history and diffs, provenance, dependency propagation and ontology lifecycles
are of critical importance in the HCLS areas.
7. How should ontologies be evaluated? Ontologies can be evaluated using
general principles of sound ontology design from the Knowledge Representation
literature and taxonomy design principles from the Library Sciences. The
quality of an ontology depends on the evaluation of its content and
its performance in an application context. These issues will become
increasingly important as ontologies see wider use in the HCLS areas.
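The versioning and diff concerns raised under question 6 can be made concrete with a small sketch. It is not a proposed tool or an NCBO technique: it assumes an ontology version can be reduced to a dictionary of terms with labels and direct parents, and every identifier (`EX:0001` and so on), label and function name below is hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Term:
    """One ontology term: a label plus the ids of its direct parents."""
    label: str
    parents: frozenset = frozenset()


def diff_versions(old, new):
    """Compare two ontology versions given as {term_id: Term} dicts."""
    old_ids, new_ids = set(old), set(new)
    return {
        "added": sorted(new_ids - old_ids),
        "removed": sorted(old_ids - new_ids),
        # A term counts as changed if its label or its parents differ.
        "changed": sorted(t for t in old_ids & new_ids if old[t] != new[t]),
    }


# Hypothetical term ids and labels, invented for illustration only.
v1 = {
    "EX:0001": Term("disease"),
    "EX:0002": Term("heart disease", frozenset({"EX:0001"})),
    "EX:0003": Term("cardiomyopathy", frozenset({"EX:0001"})),
}
v2 = {
    "EX:0001": Term("disease"),
    "EX:0002": Term("heart disease", frozenset({"EX:0001"})),
    "EX:0003": Term("cardiomyopathy", frozenset({"EX:0002"})),  # re-parented
    "EX:0004": Term("hypertrophic cardiomyopathy", frozenset({"EX:0003"})),
}

report = diff_versions(v1, v2)
print(report)
```

A report of this kind is only a starting point: the "changed" entries are the ones whose database mappings and dependent artifacts would need to be reviewed when the ontology evolves.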
BWG = BioRDF (Structured Data to RDF)
T2S = Text to Structured Data
KLWG = Knowledge LifeCycle Working Group
OWG = Ontology Best Practices Working Group
APWG = Adaptive Healthcare Protocols and Pathways Working Group
RWG = ROI Analysis Working Group
NCBO =
SWBP = Semantic Web Best Practices and Deployment Working Group
Task                                              | OWG         | NCBO                | SWBP        | BWG         | APWG        | KLWG        |
Use Case Document                                 | Co-Lead     |                     |             | Co-Lead     | Contributor | Contributor |
Ontology Definition and Best Practices Whitepaper | Lead        | Contributor         | Contributor |             | Contributor |             |
Ontology Access and Usage Practices White Paper   | Lead        | Contributor         |             | Contributor | Contributor |             |
Ontology Development Wikis                        | Co-Lead     | Co-Lead (BioPortal) |             |             |             |             |
Ontology Maintenance Report                       | Contributor | Lead                | Contributor |             |             |             |
Ontology Evaluation Report                        | Contributor | Lead                | Contributor |             |             |             |
Use Case Solution Design/Prototype                | Co-Lead     | Contributor         |             | Co-Lead     |             |             |
1. Use Case Document: 3 months – in consultation and collaboration with the other working
groups as illustrated above.
Task Coordinators: Vipul Kashyap, Mollie Ullman-Cullere
Collaborating Task Coordinators: Susie Stephens?, Helen Chen?
2. Ontology Definition and Best Practices Whitepaper: 1 year – in collaboration with the
groups illustrated above. This white paper will survey all “ontology-like”
artifacts that are currently being used in the HCLS areas and illustrate, in the
context of use case examples, the value of extending them into OWL-DL ontologies.
This report will also identify best practices for representing them, with
proposed extensions to current standards. A set of current ontology fragments
such as Snomed, MedDRA and GO will be represented using these standards in the
context of the use cases, and insights from the experience will be presented.
In particular, a set of pragmatic building blocks for the use cases at hand will
be proposed.
Task Coordinators: Alfredo Morales, Wangxiao, Robert Stevens
Collaborating Task Coordinators: Alan Rector?, Helen Chen?
3. Ontology Access and Usage Best Practices Report: 6 months – This report will survey current
practices, including standardized APIs and service interfaces for accessing and
manipulating ontologies currently in use in the HCLS areas, and propose possible
extensions to them.
Task Coordinators: Duncan Hall, Ray Hookaway
Collaborating Task Coordinators:
4. Creation of Wikis for Ontology Development: 6 months – in collaboration with the NCBO and other bodies such as Snomed
and OBO. These wikis will bring
together a set of subject matter experts, information architects and modelers
from various HCLS areas in an open and self-organizing manner to create
ontologies, information models and other knowledge artifacts.
Task Coordinators: John Madden, Wangxiao
Collaborating Task Coordinators:
5. Ontology Maintenance Report: 18 months – The NCBO is in the process of developing techniques and best practices
for creating ontologies. This report will describe an application of these techniques
and best practices to the HCLS areas. The NCBO
will take the lead in achieving this deliverable with feedback from the members
of HCLSIG.
Task Coordinators:
Collaborating Task Coordinators: Vinay Chaudhri, Alan Rector?
6. Ontology Evaluation Report: 18 months – The NCBO is in the process of developing tools and techniques for
evaluating ontologies. This report will describe an application of these
techniques and best practices to the HCLS areas. In particular, the HCLSIG
members could participate in the ontology evaluation network proposed by the NCBO. The NCBO will take the
lead in achieving this deliverable with feedback from the members of HCLSIG.
Task Coordinators:
Collaborating Task Coordinators: Amit Sheth, Alan Rector?
7. Solution Design for a particular Use Case: 2 years – The
solution design would involve activities such as conversion of a subset of
pre-existing ontologies/vocabularies such as Snomed, GO and MedDRA into the OWL
standard, and creation of mappings from the subset ontology to well-known databases
such as GenBank, Swiss-Prot and some clinical databanks. Queries will be
designed against the ontologies, and examples of their execution and final
results will be presented.
Task Coordinators: Vipul Kashyap, John Madden, Alfredo Morales
Collaborating Task Coordinators: Susie Stephens?
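
The solution design in deliverable 7 combines three ingredients: a subset ontology, mappings from its terms to database records, and queries over both. The sketch below illustrates how these could fit together in the simplest case. It is an illustration under stated assumptions, not the planned design: the is-a hierarchy, the term names, the record identifiers (`rec-03` and so on) and the function names are all hypothetical, and plain Python dictionaries stand in for OWL and for the real databases.

```python
# Hypothetical is-a hierarchy: child term -> set of direct parent terms.
IS_A = {
    "cardiomyopathy": {"heart disease"},
    "hypertrophic cardiomyopathy": {"cardiomyopathy"},
    "heart disease": {"disease"},
    "diabetes": {"disease"},
}

# Hypothetical mapping from ontology terms to database record ids.
TERM_TO_RECORDS = {
    "hypertrophic cardiomyopathy": ["rec-17"],
    "cardiomyopathy": ["rec-03", "rec-08"],
    "diabetes": ["rec-21"],
}


def ancestors(term):
    """Return every term reachable from `term` by following is-a links."""
    seen = set()
    stack = [term]
    while stack:
        for parent in IS_A.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def records_for(query_term):
    """Collect records mapped to `query_term` or to any term it subsumes."""
    hits = []
    for term, records in TERM_TO_RECORDS.items():
        if term == query_term or query_term in ancestors(term):
            hits.extend(records)
    return sorted(hits)


print(records_for("heart disease"))
```

The query for "heart disease" returns records annotated with more specific terms even though no record is mapped to "heart disease" itself, which is the kind of value an ontology-backed query adds over a flat vocabulary lookup.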