Re: Multiple itemtypes in microdata

From: Ian Hickson <ian@hixie.ch>
Date: Fri, 14 Oct 2011 22:18:52 +0000 (UTC)
To: Bradley Allen <bradley.p.allen@gmail.com>, Jeni Tennison <jeni@jenitennison.com>
cc: Stéphane Corlosquet <scorlosquet@gmail.com>, "public-html-data-tf@w3.org" <public-html-data-tf@w3.org>
Message-ID: <Pine.LNX.4.64.1110142133230.27449@ps20323.dreamhostps.com>

On Thu, 13 Oct 2011, Bradley Allen wrote:
> >
> > Do you have any links to pages where I might learn more about these? 
> > (I tried searching Google for [SWAN scientific hypotheses] and [AO 
> > scholarly documents] but didn't get any useful results. :-( )
> 
> Try these:
> SWAN: http://www.w3.org/TR/hcls-swan/
> AO: http://code.google.com/p/annotation-ontology/
> 
> Check out http://www.alzforum.org/res/adh/default.asp for a discussion 
> of the use of SWAN at www.alzforum.org. Additionally, Paolo Ciccarese's 
> deck on AO talks about tools and integration across a number of 
> ontologies/vocabularies: 
> http://www.slideshare.net/paolociccarese/history-and-overview-of-ao-annotation-ontology 

Thanks.


> An annotation is a statement added to a document post-publication, with 
> the intent of providing commentary or gloss on the original text.

Can you elaborate on how such an annotation would make its way onto the 
page that contains the document? By "post-publication" do you mean after 
the HTML document is put on the Web, or something else?

Microdata doesn't really help with post-publication annotation, since you 
have to be able to edit the document to add microdata.


> But the specific use case is one where someone has provided a statement, 
> in the context of an existing document, that adds an additional 
> statement to it.

I don't really understand what that means. Can you show me an example? (A 
live example of a real case would be ideal.)


> Individual items can have different senses. We can represent these 
> different senses as distinct types. Those types can be obtained from 
> different vocabularies.

I'm not sure we are using the word "item" in the same way here. An "item" 
is just a self-contained group of name-value pairs, such as a particular 
instance of movie metadata, or a particular instance of the description of 
a hypothesis, or some such.


> A research statement is an assertion, for example, of an observation or 
> hypothesis, intended to advance a viewpoint relevant to a line of 
> research.
> 
> Annotations can be research statements, and first-class objects distinct 
> from scholarly documents. Scholarly documents can contain research 
> statements, and scholarly documents can be annotated with annotations. 
> Not all research statements related to a document are annotations.

> I'm going to want subject matter experts to provide me with different 
> microdata vocabularies to cover the different senses that I'd like to 
> capture in my structured data markup. Expecting those all to be provided 
> in a single vocabulary is unrealistic. That's the motivation for 
> supporting multiple itemtypes without the constraint that they all be 
> from the same vocabulary.

You can use multiple vocabularies on one page without any trouble today. 
You would not typically have a single item that uses multiple vocabularies 
in the cases you've described. An instance of an annotation is not also an 
instance of a research statement -- you might annotate a research 
statement, but they are not one and the same. Right?

To put it another way, you could annotate a research statement twice, 
right? And the annotations wouldn't be the same annotation.


> The nature of research communication is changing from being purely 
> focused on print-centric containers of information such as journal 
> articles and book chapters (i.e., the traditional notion of a scholarly 
> document), to a much finer-grained representation of statements derived 
> from experimental data that can be aggregated into scholarly documents. 
> The move from print to digital has enabled this new freedom of 
> expression. We must be careful not to constrain ourselves to thinking 
> and working in the old metaphors.

It sounds as though, rather than wanting to put hypotheses and annotations 
and so forth in Web pages that are primarily prose, what you are 
describing, and what I've seen in the documents you cited above, is more 
of a database that would be filled in directly, in which case microdata 
really has no bearing on the discussion. You wouldn't want to use 
microdata unless the document you are annotating is primarily prose -- 
articles, book chapters, and the like. If the data is primarily this 
structured information, HTML isn't the right place to put it. It should 
just be put straight into its native form in the database.


On Fri, 14 Oct 2011, Jeni Tennison wrote:
> 
> Could you please clarify what is meant by "same vocabulary" in microdata 
> terms?

A vocabulary is a set of property names, the semantics of those property 
names, the processing rules for properties that use those names, the error 
handling for items that use these properties incorrectly, the meaning of 
the itemid="" value in the context of this vocabulary, potentially the 
"sub"vocabularies of other untyped items that are the values of properties 
whose names are defined by the vocabulary, and the set of one or more 
types that identify that vocabulary.

In MIME terms, a vocabulary is the syntax and semantics of a particular 
format associated with one or more MIME types.

In XML terms, a vocabulary is a namespace. (In XML, a vocabulary only ever 
has one "type".)


> Could I define a microdata vocabulary that included types that were also 
> used in other vocabularies

No, that wouldn't make any sense.

You could define a new type as using the same vocabulary as another type, 
but it wouldn't be a superset. It would have to be the same vocabulary.


> for example "http://schema.org/Person" and 
> "http://xmlns.com/foaf/0.1/Person" or do all types within a particular 
> microdata vocabulary have to share a common base URI?

The URL is opaque, so "http://example.org/foo", "mailto:bar@invalid", and 
"uuid:171d010f-9aea-4f4d-af9e-30758eeb221e" could all be types defined to 
use the same vocabulary.
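
To make the "opaque" point concrete, here is a minimal Python sketch. It is 
only an illustration, not anything the spec defines: the lookup table, the 
vocabulary labels, the function name vocabulary_for(), and the type 
"http://other.example/thing" are all hypothetical. The idea is that a 
consumer recognises types by exact string match, never by base URI, and 
that every type on one item has to resolve to the same vocabulary.

```python
# Hypothetical registry: opaque type URL -> label of the vocabulary it uses.
# Nothing is inferred from the URL's scheme, host, or path.
TYPE_TO_VOCABULARY = {
    "http://example.org/foo": "example-vocabulary",
    "mailto:bar@invalid": "example-vocabulary",
    "uuid:171d010f-9aea-4f4d-af9e-30758eeb221e": "example-vocabulary",
    "http://other.example/thing": "other-vocabulary",  # made up for contrast
}


def vocabulary_for(item_types):
    """Return the single vocabulary shared by all of an item's types.

    Raises KeyError for a type that is not in the registry, and ValueError
    if the list is empty or the types belong to different vocabularies.
    """
    vocabularies = set()
    for type_url in item_types:
        if type_url not in TYPE_TO_VOCABULARY:
            raise KeyError(f"unrecognised type: {type_url}")
        vocabularies.add(TYPE_TO_VOCABULARY[type_url])
    if len(vocabularies) != 1:
        raise ValueError("an item's types must all use the same vocabulary")
    return vocabularies.pop()


# Two types with nothing in common textually, but one vocabulary.
print(vocabulary_for(["http://example.org/foo", "mailto:bar@invalid"]))
```

Mixing "http://example.org/foo" with "http://other.example/thing" on one 
item would raise ValueError in this sketch, which is the constraint being 
discussed.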


> It would also be really helpful to document the use cases that led you 
> to add support for multiple item types from the same vocabulary to 
> microdata, especially if we can use them as examples.

Yeah, I plan on adding examples soonish.

An example use case is a vocabulary that describes the products of a model 
railway manufacturer. There's one vocabulary that describes all the 
properties; in brief, one could imagine something like:

  product-code  - an integer that names the product
  name          - a brief description
  scale         - one of "HO", "1", "Z" after whitespace trimming.
  digital       - if present, indicates the product has a digital decoder;
                  the value gives the type of decoder, one of "Digital",
                  "Delta", and "Systems" after whitespace trimming.
  track-type    - for track-specific products, one of "K", "M", "C",
                  after whitespace trimming.
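
The rules in that list could be checked along the following lines. This is 
a rough Python sketch under stated assumptions, not part of the vocabulary: 
the names ALLOWED_VALUES and validate_item() are invented here, trimming 
product-code before testing it is an assumption, and so is treating an 
unlisted property name as a problem (the list above doesn't say what the 
error handling would be).

```python
# Enumerated values taken from the property list; compared after trimming.
ALLOWED_VALUES = {
    "scale": {"HO", "1", "Z"},
    "digital": {"Digital", "Delta", "Systems"},
    "track-type": {"K", "M", "C"},
}


def validate_item(properties):
    """Check a dict of property name -> raw text; return a list of problems."""
    problems = []
    for name, raw in properties.items():
        value = raw.strip()  # whitespace trimming
        if name == "name":
            continue  # free-form brief description, nothing to check
        if name == "product-code":
            # Assumption: "integer" means a run of ASCII digits.
            if not (value.isascii() and value.isdigit()):
                problems.append(f"product-code is not an integer: {value!r}")
        elif name in ALLOWED_VALUES:
            if value not in ALLOWED_VALUES[name]:
                problems.append(f"{name} has unexpected value: {value!r}")
        else:
            # Assumption: names outside the vocabulary are reported.
            problems.append(f"unknown property: {name!r}")
    return problems


print(validate_item({"product-code": "33041", "scale": " HO\n", "digital": "Delta"}))
print(validate_item({"scale": "N"}))
```

The first call reports no problems; the second reports one, because "N" 
is not among the listed scales.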

Now this vocabulary might have several types:

  http://md.example.com/loco - for rolling stock with an engine
  http://md.example.com/passengers - for passenger rolling stock
  http://md.example.com/track - for track pieces
  http://md.example.com/lighting - for equipment with lighting

So a locomotive might be marked up as:

   <dl itemscope itemtype="http://md.example.com/loco 
                           http://md.example.com/lighting">
    <dt>Name:
    <dd itemprop="name">Tank Locomotive (DB 80)
    <dt>Product code:
    <dd itemprop="product-code">33041
    <dt>Scale:
    <dd itemprop="scale">HO
    <dt>Digital:
    <dd itemprop="digital">Delta
   </dl>
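
For what it's worth, a consumer would see the itemtype value above as a 
whitespace-separated list of two types. A minimal sketch of that, using 
Python's standard html.parser module (the class ItemTypeCollector and the 
helper item_types() are names made up for this illustration, and the 
sketch ignores everything about an item except its types):

```python
from html.parser import HTMLParser


class ItemTypeCollector(HTMLParser):
    """Collect the itemtype tokens of every element that has itemscope."""

    def __init__(self):
        super().__init__()
        self.items = []  # one list of type URLs per item, in document order

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if "itemscope" in attributes:
            # itemtype is split on whitespace, so the line break inside
            # the attribute value is just another separator.
            self.items.append((attributes.get("itemtype") or "").split())


def item_types(markup):
    """Return the list of type lists found in a fragment of markup."""
    collector = ItemTypeCollector()
    collector.feed(markup)
    collector.close()
    return collector.items


LOCO = """
<dl itemscope itemtype="http://md.example.com/loco
                        http://md.example.com/lighting">
 <dt>Name:
 <dd itemprop="name">Tank Locomotive (DB 80)
</dl>
"""

print(item_types(LOCO))
```

An item with itemscope but no itemtype comes back as an empty list in 
this sketch.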

A turnout lantern retrofit kit might be marked up as:

   <dl itemscope itemtype="http://md.example.com/track
                           http://md.example.com/lighting">    
    <dt>Name:
    <dd itemprop="name">Turnout Lantern Kit
    <dt>Product code:
    <dd itemprop="product-code">74470
    <dt>Purpose:
    <dd>For retrofitting 2 <span itemprop="track-type">C</span> Track 
    turnouts. <meta itemprop="scale" content="HO">
   </dl>

A passenger car with no lighting might be marked up as:

   <dl itemscope itemtype="http://md.example.com/passengers">
    <dt>Name:
    <dd itemprop="name">Express Train Passenger Car (DB Am 203)
    <dt>Product code:
    <dd itemprop="product-code">8710
    <dt>Scale:
    <dd itemprop="scale">Z
   </dl>

Hm. Having written this, I guess I'll just put that in the spec!

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'
Received on Friday, 14 October 2011 22:22:51 GMT
