Re: The harm that can come if the W3C supports publication of competing specs from Toby Inkster on 2010-01-17 (public-html@w3.org from January 2010)

From: Toby Inkster <tai@g5n.co.uk>
Date: Sun, 17 Jan 2010 19:47:29 +0000
To: Philip Jägenstedt <philipj@opera.com>
Cc: Graham Klyne <GK@ninebynine.org>, Shelley Powers <shelley.just@gmail.com>, HTMLWG WG <public-html@w3.org>
Message-ID: <1263757649.18556.263.camel@ophelia2.g5n.co.uk>

On Sun, 2010-01-17 at 13:25 +0100, Philip Jägenstedt wrote:
> I agree that browsers will be minority consumers of microdata, but also  
> feel that by having a model that is possible to write good DOM APIs for  
> which are possible (desirable rather) to implement in browsers, the  
> metadata becomes much more useful (likely to be used) by others than the  
> traditional semantic web community. Since the RDF model is a graph, it is  
> hard to see how it could be represented using HTMLCollection-like  
> interfaces (you would need a query language for it to be useful) or mapped  
> to JavaScript (you can construct the objects with some effort, but can't  
> serialize them as JSON if the graph has loops).

The RDF/JSON spec <http://n2.talis.com/wiki/RDF_JSON_Specification>
(non-W3C, draft, but quite stable) provides a serialisation of RDF as
JSON. It is capable of serialising arbitrary RDF graphs with (as far as
I'm aware) no exceptions.

It's not an especially pretty format -- intended for ease of
serialisation and parsing rather than human readability -- but it shows
that RDF can be serialised to JSON completely.
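As a rough illustration of why loops are not a problem, here is a minimal
Python sketch that builds the subject/predicate/object-list layout RDF/JSON
uses and dumps it with the standard json module. The to_rdf_json helper and
the triples list are my own illustrative names, not part of the spec, and the
sketch only handles URI and blank node objects (no literals). Because nodes
refer to each other by name rather than by nesting, two blank nodes that
point at each other serialise without trouble.

```python
import json

FOAF_KNOWS = "http://xmlns.com/foaf/0.1/knows"

# Hypothetical input: (subject, predicate, object) triples.
# A leading "_:" marks a blank node; anything else is treated as a URI.
triples = [
    ("_:bob", FOAF_KNOWS, "_:jon"),
    ("_:jon", FOAF_KNOWS, "_:bob"),
]


def to_rdf_json(triples):
    """Build the flat subject -> predicate -> [objects] structure."""
    doc = {}
    for subject, predicate, obj in triples:
        kind = "bnode" if obj.startswith("_:") else "uri"
        objects = doc.setdefault(subject, {}).setdefault(predicate, [])
        objects.append({"type": kind, "value": obj})
    return doc


rdf_json = to_rdf_json(triples)

# The structure is a plain tree of dicts and lists, so json.dumps
# never encounters a circular reference even though the graph has a loop.
serialised = json.dumps(rdf_json, indent=2, sort_keys=True)
print(serialised)
```

The loop lives in the values ("_:bob" names "_:jon" and vice versa), not in
the object structure, which is why the JSON encoder is happy with it.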

It's also worth noting rdfquery <http://code.google.com/p/rdfquery>, an
extension to jQuery which provides an in-browser JavaScript API for RDFa
data -- albeit one that's implemented by a script rather than internally
by the browser itself. So RDFa APIs are certainly feasible, and I
believe the people setting up the RDFa Working Group intend to
standardise such an API.

> > The mediawiki thread cited by Shelley notes that there is some ambiguity  
> > in the semantics of the microdata presentation, but that's relatively  
> > easily fixed, I think (just ensure the unqualified properties are mapped  
> > implicitly to a full URI, which in turn is described by an RDF schema or  
> > OWL).
> 
> If itemtype is not used, then the data has no semantics outside of the  
> page, and using it is as unsafe as e.g. scraping HTML tables. The RDF  
> extraction algorithm doesn't include untyped items, as it shouldn't. I  
> wouldn't really call this ambiguity, but possibly the spec could be more  
> explicit about this. In the extreme one could even require itemtype to be  
> used, but I think that would harm useful site-private use of microdata.

IIRC, the following has a mapping to RDF:

 <div item>
   <span itemprop="http://purl.org/dc/terms/title">Foo</span>
 </div>

> I would agree, RDF is a well established model and there's nothing much  
> wrong with it. Presently, the only RDF concept I'm aware of that can't be  
> expressed using microdata is XML Schema Datatypes (XSD). I would argue  
> that the datatypes should be defined in the vocabulary and not by the author,  
> so I consider this restriction quite sensible.

Cyclical references amongst blank nodes cannot be represented in
Microdata. In Turtle an example might be:

 @prefix foaf: <http://xmlns.com/foaf/0.1/> .
 _:bob foaf:knows _:jon .
 _:jon foaf:knows _:bob .

In RDFa it can be expressed quite simply:

 <p xmlns:knows="http://xmlns.com/foaf/0.1/knows"
    about="[_:jon]" rel="knows:" rev="knows:" resource="[_:bob]">
   Jon and Bob know each other.
 </p>

To express the same semantics in Microdata would require assigning a URI
to at least one of the people. Certainly it's possible for a script to
assign a URI on the fly, but committing to maintaining the meaning of
that URI long-term is harder, which is why blank nodes are so frequently
used in RDF.

I believe this could be addressed by allowing @itemid to contain a
blank node name, and providing a way for @itemprop to specify a blank
node as its value.

> This only seems to matter  
> if you're trying to embed RDF data verbatim which you have no control  
> over, in which case I would argue that you shouldn't bother with either  
> microdata or RDFa and simply link to an external N3/Turtle representation.  
> However, if a use case other than "express arbitrary RDF" requires XSD, it  
> certainly wouldn't be too late to add it to microdata as itemproptype or  
> something. (I would be interested in hearing about such use cases.)

Sorting is a use case for datatypes, albeit not an especially important
one. Sorting as strings, "100" comes before "99"; as numbers, 99 before
100. Sorting as strings, "2010-01-17T19:46:00+0000" comes before
"2010-01-17T19:46:01+0100"; sorting as datetimes, they're the other way
around.
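
A quick Python sketch of both cases, for anyone who wants to check. The
variable names and the parse helper are just for illustration; the only
assumption is that the timestamps follow the format shown above, which
strptime's %z directive accepts.

```python
from datetime import datetime

numbers = ["100", "99"]
stamps = ["2010-01-17T19:46:00+0000", "2010-01-17T19:46:01+0100"]


def parse(stamp):
    """Parse a timestamp with a numeric UTC offset into an aware datetime."""
    return datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%S%z")


# Untyped values compare character by character.
numbers_as_strings = sorted(numbers)
stamps_as_strings = sorted(stamps)

# Typed values compare by what they denote.
numbers_as_numbers = sorted(numbers, key=int)
stamps_as_datetimes = sorted(stamps, key=parse)

print(numbers_as_strings, numbers_as_numbers)
print(stamps_as_strings, stamps_as_datetimes)
```

The second timestamp is 18:46:01 in UTC, so it sorts first once the offset
is taken into account.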

-- 
Toby A Inkster
<mailto:mail@tobyinkster.co.uk>
<http://tobyinkster.co.uk>
Received on Sunday, 17 January 2010 19:48:33 UTC