Re: The harm that can come if the W3C supports publication of competing specs from Graham Klyne on 2010-01-17 (public-html@w3.org from January 2010)

From: Graham Klyne <GK@ninebynine.org>
Date: Sun, 17 Jan 2010 09:27:00 +0000
To: Shelley Powers <shelley.just@gmail.com>
CC: HTMLWG WG <public-html@w3.org>
Message-ID: <4B52D7E4.6020909@ninebynine.org>

Having skimmed the messages in this thread to date, I feel the focus on 
*browser* implementations is ignoring wider concerns of non-browser applications 
that consume the embedded data, and that the focus on syntax is distracting from 
what really matters, the underlying semantic model.

The mediawiki thread cited by Shelley notes that there is some ambiguity in the 
semantics of the microdata presentation, but that's relatively easily fixed, I 
think (just ensure the unqualified properties are mapped implicitly to a full 
URI, which in turn is described by an RDF schema or OWL).
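
As a rough illustration of that kind of fix, the sketch below qualifies bare 
property names against a vocabulary URI so that every property ends up as a 
full URI.  This is only a sketch under stated assumptions: the vocabulary URI 
and the function names (expand_property, item_to_triples) are hypothetical, and 
the logic is not taken from the Microdata draft.

```python
from urllib.parse import urlparse

# Hypothetical vocabulary base URI, used only for illustration.
VOCAB = "http://example.org/vocab#"


def expand_property(name, vocab=VOCAB):
    """Return name unchanged if it is already an http(s) URI, else qualify it."""
    if urlparse(name).scheme in ("http", "https"):
        return name
    return vocab + name


def item_to_triples(subject, properties, vocab=VOCAB):
    """Turn a flat {property: value} item into (subject, predicate, object) triples."""
    return [(subject, expand_property(p, vocab), v) for p, v in properties.items()]


triples = item_to_triples(
    "http://example.org/people#gk",
    {"name": "Graham", "http://purl.org/dc/terms/title": "Dr"},
)
```

Once every predicate is a full URI, an RDF schema or OWL description of that 
URI can supply the meaning, which is the point of the suggestion above.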

So while I agree that it is very unfortunate that there are two competing 
proposals, I don't think it's fatal *provided* that there is a clear mapping to 
underlying common semantics.  In this case, RDF is firmly established in W3C as 
a semantic model for metadata, so I argue that all proposals should provide a 
clear mapping from their particular syntax to the RDF abstract syntax.  The 
existing semantics of RDF can take care of the rest.

I have, in the past, seen similar situations in standards bodies (I'm thinking 
of instant messaging in the IETF) where there are competing approaches that 
cannot be reconciled (either for technical or political reasons or both), and it 
is next-to-impossible to get the community to put weight behind a common effort. 
It's very unfortunate, and can waste a lot of time, but it seems there's little 
that can be done about that.  Once the lines have hardened, or if the underlying 
approaches are truly incompatible, it's hard to see how any course of action 
other than letting all efforts proceed can avoid fracturing the community.

In this case, I think we see opposing lines forming between the web data 
communities and the browser and web document communities.  In my perception RDFa 
already has a good head of steam (implementation and data) among a large 
grass-roots web data community, which it seems is now finding itself set against 
a smaller but individually more established group of browser makers and web 
publishing interests.  Microdata may be easier for browser implementers and web 
site designers, but (apparently - I haven't read the spec in detail) lacks some 
of the flexibility of RDFa to express arbitrary RDF.  In this respect, I imagine 
Microdata will never completely satisfy the web data community.

So where does all this lead me?  Two possibilities I see:

1. By far the best outcome would be for the two sides to get together and work 
out how to make a single standard for embedding data in HTML5.  Technically, I 
don't think this should be that hard:  one approach might be to start with RDFa 
and then devise short cuts to make certain common forms more microdata-like. 
Maintaining compatibility with RDFa would be highly desirable because of the 
existing base of RDFa data.  Maybe one could even allow a Microdata "profile" of 
the combined spec that browsers commonly understand - I don't think it's so 
harmful if browsers don't recognize all of RDFa, as long as they don't choke on it.

2. Allow both specs to proceed independently, but *require* both to define a 
mapping to the RDF abstract model (and hence semantics).  While this would 
create more work for implementers (especially of applications or tools that 
*consume* embedded data), it would at least prevent the Microdata and RDFa 
worlds from becoming completely disconnected.  (This mapping is clearly a done 
deal for RDFa;  for Microdata, it doesn't require that browsers and other 
Microdata consumers actually implement any RDF processing, just that what 
processing they do remains consistent with interpreting data according to RDF 
semantics.)
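
To make option 2 concrete, here is a minimal sketch of what a shared mapping 
buys a consumer.  It assumes the markup has already been parsed into simple 
dictionaries; the two functions are hypothetical stand-ins, not the RDFa 
processing rules or anything defined for Microdata, and the subject URI is made 
up.  The only real identifier is the FOAF namespace URI.

```python
FOAF = "http://xmlns.com/foaf/0.1/"


def rdfa_like_to_triples(subject, prefixes, curie_props):
    """Expand prefix:local names (in the style of RDFa CURIEs) into full URIs."""
    triples = set()
    for curie, value in curie_props.items():
        prefix, local = curie.split(":", 1)
        triples.add((subject, prefixes[prefix] + local, value))
    return triples


def microdata_like_to_triples(subject, vocab, props):
    """Qualify bare property names against a single vocabulary URI."""
    return {(subject, vocab + name, value) for name, value in props.items()}


subject = "http://example.org/people#gk"  # hypothetical subject URI
from_rdfa = rdfa_like_to_triples(subject, {"foaf": FOAF}, {"foaf:name": "Graham Klyne"})
from_microdata = microdata_like_to_triples(subject, FOAF, {"name": "Graham Klyne"})

# A consumer working on triples need not know which syntax produced them.
merged = from_rdfa | from_microdata
```

In this sketch both inputs reduce to the same single triple, so data from 
either syntax can be merged without special cases.  Neither function does any 
RDF processing beyond producing consistent (subject, predicate, object) values.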

To allow both efforts to proceed without a common underlying model would, I 
think, be as harmful as Shelley suggests, because it would potentially create 
two non-interoperable webs of data, and all the confusion and paralysis that 
would entail.

#g
Received on Sunday, 17 January 2010 10:08:42 UTC