Linked data, non-human processors and Microdata vocabularies from Tim van Oostrom on 2011-12-13 (public-html-data-tf@w3.org from December 2011)

From: Tim van Oostrom <tim@depulz.nl>
Date: Tue, 13 Dec 2011 11:40:17 +0100
To: HTML Data Task Force WG <public-html-data-tf@w3.org>
Message-ID: <4EE72B91.3010403@depulz.nl>

Hello,

Since 2009 I have been actively interested in the "semantic web" and its 
related technologies.
I've been trying to create a linked data CMS for a while now. It uses 
microdata to give meaning to every important item in the system.

I am developing a vocabulary (tool) to be able to re-use HTML templates, 
scripts, styles etc.
The vocabulary itself is marked up with Microdata too, a bit like 
data-vocabulary.org and the way RDF is structured. Microdata, however, 
does not provide a (standardized) markup structure for building your 
own vocabulary. Does this mean you can create a vocabulary any which way 
you want? Can you just write the definitions as plain text?

Data-vocabulary.org is a sort-of adequately structured version of a 
Microdata vocabulary, but schema.org, imo, isn't. A human is able to 
"read" the vocabulary and find semantic meaning, but how would a robot 
that has no specific scraper for that vocabulary do the same?

Should there be a standardized way of marking up a Microdata vocabulary? 
For example:

<article itemscope itemtype="http://w3.org/MicrodataVocabulary" 
itemid="http://schema.org">
<h1 itemprop="name">Schema.org</h1>
<dl itemprop="itemDefinition" itemscope 
itemtype="http://w3.org/MicrodataItem" itemid="http://schema.org/Thing">
<dt><dfn itemprop="name">Thing</dfn></dt>
<dd>
<dl itemprop="propertyDefinition" itemscope 
itemtype="http://w3.org/MicrodataProperty">
<dt><dfn itemprop="name">name</dfn></dt>
<dd itemprop="description"> The name of the item.</dd>
</dl>
<dl itemprop="propertyDefinition" itemscope 
itemtype="http://w3.org/MicrodataProperty">
<dt><dfn itemprop="name">image</dfn></dt>
<dd itemprop="description"> URL of an image of the item.</dd>
</dl>
</dd>
</dl>
</article>

When a robot with a standardized microdata processor encounters an 
unknown vocabulary, it could at least:

- Show a list of the properties defined by http://schema.org/Thing
- Show the description of the image property
- Show all itemDefinitions of the MicrodataVocabulary http://schema.org

Just like an RDF processor should be able to do.
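
As a rough illustration of the three lookups above, here is a minimal 
Python sketch. It is not a conforming microdata processor: it only 
handles itemscope, itemtype, itemid and itemprop on ordinary elements, 
and takes a property's value from its text content. The 
MicrodataVocabulary, MicrodataItem and MicrodataProperty types are the 
hypothetical ones from the example, the class and helper names are made 
up for this sketch, and the embedded sample assumes that every property 
definition marks its label with itemprop="name".

```python
from html.parser import HTMLParser

# Sample vocabulary page; the w3.org item types are hypothetical.
SAMPLE = """
<article itemscope itemtype="http://w3.org/MicrodataVocabulary" itemid="http://schema.org">
<h1 itemprop="name">Schema.org</h1>
<dl itemprop="itemDefinition" itemscope itemtype="http://w3.org/MicrodataItem" itemid="http://schema.org/Thing">
<dt><dfn itemprop="name">Thing</dfn></dt>
<dd>
<dl itemprop="propertyDefinition" itemscope itemtype="http://w3.org/MicrodataProperty">
<dt><dfn itemprop="name">name</dfn></dt>
<dd itemprop="description"> The name of the item.</dd>
</dl>
<dl itemprop="propertyDefinition" itemscope itemtype="http://w3.org/MicrodataProperty">
<dt><dfn itemprop="name">image</dfn></dt>
<dd itemprop="description"> URL of an image of the item.</dd>
</dl>
</dd>
</dl>
</article>
"""


class MicrodataParser(HTMLParser):
    """Simplified extractor: nested items plus text-valued properties."""

    def __init__(self):
        super().__init__()
        self.top_items = []  # items that are not a property of another item
        self.stack = []      # open elements: (tag, item or None, text property or None)
        self.items = []      # currently open items, innermost last
        self.capture = []    # currently open text properties: [owner, name, parts]

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        prop = a.get("itemprop")
        new_item = text_prop = None
        if "itemscope" in a:
            new_item = {"type": a.get("itemtype"), "id": a.get("itemid"), "props": {}}
            if prop and self.items:
                # Nested item: it becomes a property value of the enclosing item.
                self.items[-1]["props"].setdefault(prop, []).append(new_item)
            else:
                self.top_items.append(new_item)
            self.items.append(new_item)
        elif prop and self.items:
            text_prop = [self.items[-1], prop, []]
            self.capture.append(text_prop)
        self.stack.append((tag, new_item, text_prop))

    def handle_endtag(self, tag):
        if tag not in [entry[0] for entry in self.stack]:
            return  # stray end tag, ignore it
        while self.stack:
            open_tag, item, text_prop = self.stack.pop()
            if item is not None:
                self.items.pop()
            if text_prop is not None:
                self.capture.pop()
                owner, name, parts = text_prop
                owner["props"].setdefault(name, []).append("".join(parts).strip())
            if open_tag == tag:
                break

    def handle_data(self, data):
        for _, _, parts in self.capture:
            parts.append(data)


def first(item, prop):
    """Return the first value of a property, or None if it is absent."""
    values = item["props"].get(prop, [])
    return values[0] if values else None


parser = MicrodataParser()
parser.feed(SAMPLE)
parser.close()

vocab = parser.top_items[0]
definitions = vocab["props"]["itemDefinition"]
thing = next(d for d in definitions if d["id"] == "http://schema.org/Thing")
properties = {first(p, "name"): first(p, "description")
              for p in thing["props"]["propertyDefinition"]}

print(list(properties))                # properties defined by Thing
print(properties["image"])             # description of the image property
print([d["id"] for d in definitions])  # item definitions in the vocabulary
```

The point of the sketch is that none of the three queries needs any 
schema.org-specific knowledge: the code only relies on the generic 
itemDefinition, propertyDefinition, name and description properties of 
the proposed meta-vocabulary.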

If there is not going to be a standardized way to structure a 
vocabulary, there should at least be some encouragement to provide the 
vocabulary in other formats (like schema.rdfs.org). In that case, 
wouldn't the "linking" of "data", and giving meaning to those links, 
always require the involvement of a human who understands the 
structure of the vocabulary?

Thanks,
Tim van Oostrom

Received on Tuesday, 13 December 2011 10:40:48 UTC