Re: The Complexity Argument from Ian Hickson on 2009-10-16 (public-html@w3.org from October 2009)

From: Ian Hickson <ian@hixie.ch>
Date: Fri, 16 Oct 2009 09:59:48 +0000 (UTC)
To: Manu Sporny <msporny@digitalbazaar.com>
Cc: RDFa mailing list <public-rdf-in-xhtml-tf@w3.org>, HTMLWG WG <public-html@w3.org>
Message-ID: <Pine.LNX.4.62.0910160935282.25383@hixie.dreamhostps.com>

On Tue, 22 Sep 2009, Manu Sporny wrote:
> Ian Hickson wrote:
> > On Fri, 18 Sep 2009, Manu Sporny wrote:
> >> RDFa is more complex than Microformats and Microdata. It is more 
> >> complex because the set of use cases are more complex. 
> >> Follow-your-nose, vocabulary validation, data typing, and inferencing 
> >> are just a few of the design goals for RDFa, based on the 
> >> requirements in the use cases.
> > 
> > What are these use cases? I took into account every use case that was 
> > mentioned anywhere I could find, for microdata, and microdata handles 
> > every one of those that RDFa handles -- the only ones I didn't solve 
> > are also not solved by RDFa.
> 
> I'm going to pick one and be terse because I don't have the time to make 
> an exhaustive list right now. Microdata doesn't support this use case:
> 
> http://rdfa.info/wiki/rdfa-use-cases#Publishing_an_RDF_Vocabulary_and_Validating_Usage

As far as I can tell, the RDF produced by the example in that section is 
isomorphic with that produced by this Microdata snippet:

   <dl itemscope
       itemtype="http://www.w3.org/2002/07/owl#DatatypeProperty"
       itemid="http://purl.org/media#position">
    <dt><a href="http://purl.org/media#position">media:position</a></dt>
    <dd>
     <table>
      <tr>
       <td>Status</td>
       <td itemprop="http://www.w3.org/2003/06/sw-vocab-status/ns#term_status">stable</td>
      </tr>
      <tr>
       <td>Description</td>
       <td itemprop="http://www.w3.org/2000/01/rdf-schema#comment">The 
       position of the audio recording in an album, LP, playlist, top 10 
       list, podcast history or other ordered list of audio recordings.</td>
      </tr>
      <tr>
       <td>Datatype:</td>
       <td>
        <a itemprop="http://www.w3.org/2000/01/rdf-schema#range"
           href="http://www.w3.org/2001/XMLSchema#integer">xsd:integer</a>
       </td>
      </tr>
     </table>
    </dd>
   </dl>
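
To make the comparison concrete, here is a minimal sketch in Python of how a 
consumer might read triples out of a snippet like the one above. It is not 
the conversion algorithm from any specification: it assumes a single, 
un-nested item, that itemid names the subject, and that itemtype becomes an 
rdf:type triple, and it drops the distinction between URL and literal 
objects. The class name MicrodataTriples and the shortened sample are made 
up for illustration.

```python
from html.parser import HTMLParser

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
# Elements whose property value is taken from a URL attribute (partial list).
URL_ATTRS = {"a": "href", "link": "href", "img": "src"}

SAMPLE = """
<dl itemscope
    itemtype="http://www.w3.org/2002/07/owl#DatatypeProperty"
    itemid="http://purl.org/media#position">
 <dt><a href="http://purl.org/media#position">media:position</a></dt>
 <dd>
  <table>
   <tr><td>Status</td>
       <td itemprop="http://www.w3.org/2003/06/sw-vocab-status/ns#term_status">stable</td></tr>
   <tr><td>Datatype:</td>
       <td><a itemprop="http://www.w3.org/2000/01/rdf-schema#range"
              href="http://www.w3.org/2001/XMLSchema#integer">xsd:integer</a></td></tr>
  </table>
 </dd>
</dl>
"""


class MicrodataTriples(HTMLParser):
    """Collect (subject, predicate, object) tuples from one flat item."""

    def __init__(self):
        super().__init__()
        self.subject = None
        self.triples = []
        self.prop = None       # itemprop whose text is being collected
        self.prop_tag = None   # tag name that opened that property
        self.text = []

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if "itemscope" in a:
            # Assumption: itemid is the subject, itemtype maps to rdf:type.
            self.subject = a.get("itemid")
            if "itemtype" in a:
                self.triples.append((self.subject, RDF_TYPE, a["itemtype"]))
        if "itemprop" in a:
            if tag in URL_ATTRS:
                value = a.get(URL_ATTRS[tag])
                self.triples.append((self.subject, a["itemprop"], value))
            else:
                self.prop, self.prop_tag, self.text = a["itemprop"], tag, []

    def handle_data(self, data):
        if self.prop is not None:
            self.text.append(data)

    def handle_endtag(self, tag):
        # Sketch only: does not handle the same tag nested inside a property.
        if self.prop is not None and tag == self.prop_tag:
            value = " ".join("".join(self.text).split())
            self.triples.append((self.subject, self.prop, value))
            self.prop = None


parser = MicrodataTriples()
parser.feed(SAMPLE)
parser.close()
triples = parser.triples
for triple in triples:
    print(triple)
```

Each printed tuple shares the subject http://purl.org/media#position, which 
is the property that makes the result comparable to the RDFa example.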



> Specifically, it doesn't support datatyping... you can't do something
> like this in Microdata:
> 
> <span xmlns:measure="http://example.org/measure#"
>       about="#patient" property="measure:weight"
>       datatype="measure:kilograms">72</span>
> 
> which would generate this triple:
> 
> <#patient> measure:weight "72"^^measure:kilograms

Indeed. It is expected that vocabularies that use microdata will define 
the units of their properties, so that we don't repeat the mistake we 
made with <meta scheme>.


> There are two important pieces of functionality here:
> 
> 1. The author can specify a datatype for the object literal because they
>    want to be very specific about the data. This is important in most
>    scientific fields, such as medicine. As an anesthesiologist, am I
>    setting up the operation for a 72 pound patient, or a 159 pound
>    patient?

Use vocabularies that define if it's kilograms or pounds, and then don't 
change from one to the other.
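
As a rough sketch of what that looks like to a consumer, written in Python: 
the property names below (measure#weightKg, measure#weightLb) and the 
helper weight_in_kg are hypothetical, not part of any real vocabulary. The 
point is only that the unit comes from the property's definition, not from 
a per-value datatype.

```python
# Hypothetical vocabulary: each property name fixes its own unit.
WEIGHT_KG = "http://example.org/measure#weightKg"
WEIGHT_LB = "http://example.org/measure#weightLb"
VOCAB_UNITS = {WEIGHT_KG: "kg", WEIGHT_LB: "lb"}

KG_PER_LB = 0.45359237  # definition of the international avoirdupois pound


def weight_in_kg(prop, literal):
    """Interpret a plain string value using the unit the vocabulary defines."""
    unit = VOCAB_UNITS[prop]  # KeyError means the property is not in the vocabulary
    value = float(literal)
    return value if unit == "kg" else value * KG_PER_LB


print(weight_in_kg(WEIGHT_KG, "72"))
print(round(weight_in_kg(WEIGHT_LB, "159"), 2))
```

With this arrangement the literal "72" is never ambiguous on its own, since 
it always arrives attached to a property whose unit is fixed.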


> 2. Tools can verify that the data is valid by validating it
>    against a vocabulary specification. For example, if the vocabulary
>    restricts all "measure:weight" values to be in "measure:pounds",
>    then the validator can provide vocabulary usage errors to
>    authors that don't provide a datatype for the measure as well
>    as those that use the wrong datatype for the measure. Applications
>    consuming the data may also flag the data as erroneous since they
>    don't validate against a single vocabulary specification.

The same is true for Microdata -- a vocabulary-specific validator can do 
datatype checking to verify that the data is in the expected and allowed 
ranges. (Nothing can stop an author from writing data in the wrong units, 
of course, just like nothing can stop an author from specifying the wrong 
units in RDFa.)
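
A vocabulary-specific validator of that kind might be sketched as follows 
in Python. Everything here is an assumption made for illustration: the 
property URL, the RULES table, and the allowed range are invented, and a 
real vocabulary would define its own.

```python
import re

# Hypothetical vocabulary definition: lexical form plus allowed range.
WEIGHT_KG = "http://example.org/measure#weightKg"
RULES = {
    WEIGHT_KG: {
        "pattern": re.compile(r"\d+(\.\d+)?"),  # plain decimal number
        "min": 0.5,
        "max": 650.0,
    },
}


def validate(prop, literal):
    """Return a list of error strings; an empty list means the value passes."""
    rule = RULES.get(prop)
    if rule is None:
        return [f"unknown property: {prop}"]
    if not rule["pattern"].fullmatch(literal.strip()):
        return [f"{literal!r} is not a decimal number"]
    value = float(literal)
    if not rule["min"] <= value <= rule["max"]:
        return [f"{value} is outside the allowed range for {prop}"]
    return []


print(validate(WEIGHT_KG, "72"))     # passes
print(validate(WEIGHT_KG, "heavy"))  # wrong lexical form
print(validate(WEIGHT_KG, "7200"))   # out of range
print(validate(WEIGHT_KG, "159"))    # passes, even if the author meant pounds
```

The last call shows the limit noted in the parenthetical: a value written in 
the wrong units can still look valid to any checker.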


> Unless I'm missing something, Microdata doesn't support #1 or #2. So, 
> there are use cases that are solved by RDFa that are not solved by 
> Microdata. Did I make a mistake in understanding datatyping in 
> Microdata?

#1 is supported, as far as I can tell.

#2's use case is supported, just without the same flexibility in the 
syntax. That is, it's possible to specify fields that accept pounds, or 
kilograms, or whatever. I don't see, however, what the use case is for a 
single field that accepts values in several different units. In a future 
version, we could extend the syntax to have typed literals, but without a 
very clear use case for it, it's not 
a very compelling feature. (It's certainly a lot more complexity in the 
language than I'm happy with, and it exposes a part of RDF that isn't 
really as clean as the main "everything's a triple" model.)


> >> It's fairly clear that RDFa is more complex than Microformats and 
> >> Microdata, and I would say that is true because it solves a larger 
> >> set of problems.
> > 
> > What problems? Could you list the concrete user problems that RDFa 
> > addresses that make it more complex, and which microdata doesn't 
> > address?
> 
> I've outlined one such problem above, but there are others.

#1 seems to be possible.

#2 was more of a feature than a use case.

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'
Received on Friday, 16 October 2009 09:48:26 UTC