Re: Multiple itemtypes in microdata

From: Bradley Allen <bradley.p.allen@gmail.com>
Date: Thu, 13 Oct 2011 17:19:06 -0700
Message-ID: <CAKpM4LnKvF+TURcbs+PhfjMmVvDQ9=P1PoYrPh5tc4cgu9YJWQ@mail.gmail.com>
To: Ian Hickson <ian@hixie.ch>
Cc: Stéphane Corlosquet <scorlosquet@gmail.com>, public-html-data-tf@w3.org
Hixie- OK, trying this again...

Allow me to try and describe a real, existing use case.

There are a number of ongoing efforts to support the annotation of a
scientific article together with the ability to specify the rhetorical
structure of a given article. The purpose is to support the
evaluation of a given scientific work to determine whether or not it
is supported by the evidence, and is consistent with related work in
the field.

As a specific example of the motivation for this, consider the
following passage from
http://www.w3.org/TR/2009/NOTE-hcls-swan-20091020/:
 "Developing cures for highly complex diseases, such as
neurodegenerative disorders, requires extensive interdisciplinary
collaboration and exchange of biomedical information in context. Our
ability to exchange such information across sub-specialties today is
limited by the current scientific knowledge ecosystem’s inability to
properly contextualize and integrate data and discourse in
machine-interpretable form. This inherently limits the productivity of
research and the progress toward cures for devastating diseases such
as Alzheimer’s and Parkinson’s."

Vocabularies have been defined for this purpose, and they are gaining
acceptance within the community of workers in the bioinformatics
domain as legitimate ways to express sharable metadata about
scientific publications and statements made within them. Two such
vocabularies are SWAN and AO.

SWAN provides a vocabulary for describing scientific hypotheses; AO
provides a vocabulary for annotation of scholarly documents. They are
distinct vocabularies, developed for distinct purposes. Due to their
highly technical nature, they are unlikely to be specializations of any
meaningful class within schema.org. Furthermore, tools and workflows
have been created to
produce and consume content marked up with these vocabularies, to
provide support for peer review and collaborative research, for
example in the context of communities like the Alzheimer Research
Forum (http://www.alzforum.org).

As a publisher of scientific content, I think HTML5 with microdata
would be a valuable delivery format for scholarly content marked up
with such structured data. What I would like to do, in that case, is be
able to express the following as something that a subject matter expert
could insert into an article about Alzheimer's Disease:

<p itemscope itemtype="http://purl.org/ao/core/Annotation
http://swan.mindinformatics.org/ontologies/1.2/discourse-elements/ResearchStatement">
  Testosterone may play an important role in the prevention of
Alzheimer's Disease (AD) in men.
</p>

(Apologies to you and the ontology authors if I am mangling the
microdata syntax for multiple itemtypes and/or the use of the
vocabularies; but I'm sure you see what I'm trying to say here.)

The content of the <p> tag is both a ResearchStatement and an
Annotation. I am using emerging standard vocabularies that have been
developed for separate purposes in a succinct, clear manner. IMO, that
is the way in which most of the people at the workshop would have
assumed that support for multiple itemtypes would work. Duplicating
the statement in the manner you suggest above would give my editors
fits.
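
To make the reading I have in mind concrete, here is a minimal Python
sketch of how a consumer might pull both types off that one element. It
assumes, and this is exactly the point under discussion rather than
settled microdata behavior, that itemtype is treated as a
whitespace-separated list of URLs. The class name ItemTypeCollector and
the SNIPPET constant are made up for illustration; only the standard
library html.parser module is used.

```python
from html.parser import HTMLParser


class ItemTypeCollector(HTMLParser):
    """Hypothetical helper: records the itemtype tokens of each itemscope element."""

    def __init__(self):
        super().__init__()
        self.items = []  # list of (tag name, list of type URLs)

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        if "itemscope" in attr_map:
            # Assumption: itemtype is a whitespace-separated list of URLs,
            # so a line break between the two URLs is just a separator.
            types = (attr_map.get("itemtype") or "").split()
            self.items.append((tag, types))


# The markup from this message, reproduced as a string.
SNIPPET = """<p itemscope itemtype="http://purl.org/ao/core/Annotation
http://swan.mindinformatics.org/ontologies/1.2/discourse-elements/ResearchStatement">
  Testosterone may play an important role in the prevention of
Alzheimer's Disease (AD) in men.
</p>"""

collector = ItemTypeCollector()
collector.feed(SNIPPET)
collector.close()

for tag, types in collector.items:
    print(tag, types)
```

Under that assumption the single <p> yields one item carrying both the
AO and the SWAN type, with no duplication of the statement text.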

One could, I suppose, simply extend Thing to get them into the same
vocabulary, as in:

http://schema.org/Thing/Annotation
http://schema.org/Thing/ResearchStatement

But none of the tooling built to date would be prepared to consume
those classes without having to be changed to map the newly minted
schema.org classes to their equivalents in SWAN and AO.
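
As a rough illustration of that extra burden, every existing consumer
would need something like the following translation step. This is only a
sketch: the two schema.org URLs are the hypothetical ones from this
message, and MINTED_TO_ORIGINAL and normalize_types are names invented
here, not part of any existing SWAN or AO tool.

```python
# Hypothetical schema.org-style URLs (from this message) mapped back to the
# AO and SWAN class URLs that existing tools already understand.
MINTED_TO_ORIGINAL = {
    "http://schema.org/Thing/Annotation":
        "http://purl.org/ao/core/Annotation",
    "http://schema.org/Thing/ResearchStatement":
        "http://swan.mindinformatics.org/ontologies/1.2/discourse-elements/ResearchStatement",
}


def normalize_types(types):
    """Replace newly minted type URLs with their original equivalents.

    URLs that are not in the mapping are passed through unchanged.
    """
    return [MINTED_TO_ORIGINAL.get(t, t) for t in types]


minted = [
    "http://schema.org/Thing/Annotation",
    "http://schema.org/Thing/ResearchStatement",
]
print(normalize_types(minted))
```

A table like this would have to be maintained and shipped with each tool,
which is the cost I would like to avoid.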

Thoughts?

Bradley P. Allen
http://bradleypallen.org
Received on Friday, 14 October 2011 00:19:45 GMT