- From: Paolo Ciccarese <paolo.ciccarese@gmail.com>
- Date: Fri, 14 Oct 2011 15:34:13 -0400
- To: public-html-data-tf@w3.org
- Message-ID: <CAFPX2kBXirAPM2eBi052sbd5VbO+asdQpoww_iQPGHPQBdv=gw@mail.gmail.com>
Hixie- Responses inline.
>
> On Thursday, October 13, 2011, Ian Hickson <ian@hixie.ch> wrote:
> > On Thu, 13 Oct 2011, Bradley Allen wrote:
> >>
> >> SWAN provides a vocabulary for describing scientific hypotheses; AO
> >> provides a vocabulary for annotation of scholarly documents.
> >
> > Do you have any links to pages where I might learn more about these? (I
> > tried searching Google for [SWAN scientific hypotheses] and [AO scholarly
> > documents] but didn't get any useful results. :-( )
>
> Try these:
> SWAN: http://www.w3.org/TR/hcls-swan/
> AO: http://code.google.com/p/annotation-ontology/
>
> >> They are distinct vocabularies, developed for distinct purposes. Due to
> >> their highly technical nature, they are unlikely to be specializations
> >> of any meaningful class within schema.org.
> >
> > Yes, I wouldn't expect schema.org to have any relevance to this particular
> > use case.
>
> Agreed.
>
> >> Furthermore, tools and workflows have been created to produce and
> >> consume content marked up with these vocabularies, to provide support
> >> for peer review and collaborative research, for example in the context
> >> of communities like the Alzheimer Research Forum
> >> (http://www.alzforum.org).
> >
> > Do you have any links to documentation about these consuming tools? I
> > would love to be able to study them further.
>
> Check out http://www.alzforum.org/res/adh/default.asp for a discussion of
> the use of SWAN at www.alzforum.org. Additionally, Paolo Ciccarese's deck
> on AO talks about tools and integration across a number of
> ontologies/vocabularies:
> http://www.slideshare.net/paolociccarese/history-and-overview-of-ao-annotation-ontology.
>
> >> As a publisher of scientific content, IMO HTML5 with microdata would be
> >> a valuable delivery format for scholarly content marked up with such
> >> structured data.
> >> What I would like to do, in that case, is be able to
> >> express the following as something that subject matter expert could
> >> insert into a article about Alzheimer's Disease:
> >>
> >> <p itemscope itemtype="http://purl.org/ao/core/Annotation
> >> http://swan.mindinformatics.org/ontologies/1.2/discourse-elements/ResearchStatement">
> >> Testosterone may play an important role in the prevention of
> >> Alzheimer's Disease (AD) in men.
> >> </p>
> >>
> >> The content of the <p> tag is both a ResearchStatement and an
> >> Annotation.
> >
> > I assume those two itemtypes are supposed to be examples (they both 404).
> > I'm not sure what an "Annotation" is supposed to be here. I would presume
> > a research statement is a property of a scholarly document.
>
> An annotation is a statement added to a document post-publication, with the
> intent of providing commentary or gloss on the original text.
>
> A research statement is an assertion, for example, of an observation or
> hypothesis, intended to advance a viewpoint relevant to a line of research.
>
> Annotations can be research statements, and first-class objects distinct
> from scholarly documents. Scholarly documents can contain research
> statements, and scholarly documents can be annotated with annotations. Not
> all research statements related to a document are annotations.
>
> > Assuming that there is an HTML page that is a scholarly document and that
> > contains scientific hypotheses, you can use microdata today to mark such a
> > page up with no problems, as far as I can tell. No single item would be
> > both a scholarly document and a scientific hypothesis; instead you would
> > mark them up, something like this:
> >
> > ...
> > <body itemscope itemtype="http://swan.example.org/scholarly-doc">
> >  <h1>A study of birds</h1>
> >  <section>
> >   <h1>Abstract</h1>
> >   <p itemprop="abstract">We look at birds and see if they have
> >   wings or lips.</p>
> >  </section>
> >  <section>
> >   <h1>Hypotheses</h1>
> >   <p itemscope itemtype="http://example.net/ao/hypothesis">
> >    <span itemprop="description">We hypothesise that birds have
> >    wings.</span> <span itemprop="reason">We base this on the
> >    circumstancial evidence that birds have wings according to the
> >    dictionary.</span>
> >   </p>
> >   <p itemscope itemtype="http://example.net/ao/hypothesis">
> >    We also consider lips. <span itemprop="description">We presume
> >    that birds have lips.</span> You may ask why we think that. <span
> >    itemprop="reason">We are just guessing.</span>
> >   </p>
> >  </section>
> >  ...
> > </body>
>
> The nature of research communication is changing from being purely focused
> on print--centric containers of information such as journal article and
> books chapters (i.e., the traditional notion of a scholarly document), to a
> much finer-grained representation of statements derived
> from experimental data that can be aggregated into scholarly documents. The
> move from print to digital has enabled this new freedom of expression. We
> must be careful not to constrain ourselves to thinking and working in the
> old metaphors.
>
> > This is a microdata document with three items. As JSON, this data looks
> > like this:
> >
> > {
> >   "items": [
> >     {
> >       "type": "http://swan.example.org/scholarly-doc",
> >       "properties": {
> >         "abstract": [
> >           "We look at birds and see if they have\n wings or lips."
> >         ]
> >       }
> >     },
> >     {
> >       "type": "http://example.net/ao/hypothesis",
> >       "properties": {
> >         "description": [
> >           "We hypothesise that birds have\n wings."
> >         ],
> >         "reason": [
> >           "We base this on the\n circumstancial evidence that birds have wings according to the\n dictionary."
> >         ]
> >       }
> >     },
> >     {
> >       "type": "http://example.net/ao/hypothesis",
> >       "properties": {
> >         "description": [
> >           "We presume \n that birds have lips."
> >         ],
> >         "reason": [
> >           "We are just guessing."
> >         ]
> >       }
> >     }
> >   ]
> > }
> >
> > (See http://goo.gl/OgF8C for a live view of this.)
>
> Useful to see. But the specific use case is one where someone has provided
> a statement, in the context of an existing document, that adds
> an additional statement to it.
>
> >> I am using emerging standard vocabularies that have been developed for
> >> separate purposes in a succinct, clear manner. IMO, that is the way in
> >> which most of the people at the workshop would have assumed that support
> >> for multiple itemtypes would work.
> >
> > I don't see any reason in this case that any one item would have multiple
> > types, but maybe that's because I don't understand the vocabularies in
> > question. I would be very interested in studying the vocabularies and the
> > software that consumes them.
>
> Individual items can have different senses. We can represent these
> different senses as distinct types. Those types can be obtained from
> different vocabularies.

There are many use cases in the biomedical domain where multiple types from
different vocabularies could be useful. The first one I can think of is very
simple. In a scientific publication I can have a paragraph that contains the
word 'protein', and I can certainly attribute a type to it. For instance, I
can pick the term 'Protein' from the PRotein Ontology (PRO):
http://purl.obolibrary.org/obo/PR_000000001 . However, I also want to record
that it represents an amino acid sequence, using a term provided by the
BioTop ontology: http://purl.org/biotop/biotop.owl/AminoAcidSequence .
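
As a rough sketch of the markup this would imply: the fragment below assumes that itemtype accepts a space-separated list of types drawn from different vocabularies, which is the behaviour being asked for in this thread rather than something the microdata draft is known to permit. The property name "label" and the surrounding sentence are invented purely for illustration and are not defined by PRO or BioTop.

```html
<!-- Hypothetical: one item typed with terms from two vocabularies.
     Assumes itemtype may mix vocabularies; "label" is an invented property. -->
<p>
  This paragraph mentions a
  <span itemscope
        itemtype="http://purl.obolibrary.org/obo/PR_000000001
                  http://purl.org/biotop/biotop.owl/AminoAcidSequence">
    <span itemprop="label">protein</span>
  </span>.
</p>
```

Keeping both types on the same item keeps the two assertions together, rather than splitting them across nested or sibling items.
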
Sure, one day you could get machines to assert that a PRO:Protein is a
BioTop:AminoAcidSequence (the converse is not always true); however, for
many purposes I might still want to declare that explicitly. Most likely
this will happen with terms from different vocabularies, as most of today's
ontologies are modular and deal with different aspects and different levels
of granularity.

I can think of another example. I can have a text saying "John Doe
unfortunately eats lots of bacon. Bacon is not good for people with high
cholesterol". In this specific use case I could say that the second
occurrence of 'bacon' is both a 'food' and a 'health risk', and the terms
are probably coming from different vocabularies. As I am the user saying
that for this context and at a specific point in time, I would probably like
to keep those two types together.

In the annotation tool named Domeo (screencasts here:
http://code.google.com/p/domeo/ ) we already do this. We allow our curators
to annotate the same text span with multiple types. The annotation can then
be used, after the approval process, to enrich the content of the original
page. As the type assertions have the same provenance, we would like to see
them in the HTML microdata in the very same item to avoid extra mark-up.

With Domeo we can do more complicated things with Annotations and
Claims/Hypotheses as well. However, the two examples above are already
enough for me to agree with Brad. Multiple types coming from multiple
vocabularies make perfect sense to me.

>
> > Incidentally, note that you can't just take, say, an RDF vocabulary, or a
> > Microformats vocabulary, and just use it in microdata directly. A
> > microdata vocabulary has to define processing rules that are often not
> > provided for RDF and Microformats vocabularies, and has to use the terms
> > defined in the HTML specification to describe how the terms work.
> > You can see examples of how to define vocabularies in the HTML standard:
> >
> > http://www.whatwg.org/specs/web-apps/current-work/multipage/microdata.html#mdvocabs
>
> Fair enough. Given that, I'm going to want subject matter experts to
> provide me with different microdata vocabularies to cover the different
> senses that I'd like to capture in my structured data markup. Expecting
> those all to be provided in a single vocabulary is unrealistic. That's the
> motivation for supporting multiple itemtypes without the constraint that
> they all be from the same vocabulary.
>
> > HTH,
> > --
> > Ian Hickson               U+1047E                )\._.,--....,'``.    fL
> > http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
> > Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'

--
Dr. Paolo Ciccarese
http://www.paolociccarese.info/
Biomedical Informatics Research & Development
Instructor of Neurology at Harvard Medical School
Assistant in Neuroscience at Mass General Hospital
Received on Friday, 14 October 2011 19:36:27 UTC