Re: RDFa and Web Directions North 2009 from Ian Hickson on 2009-02-13 (public-rdfa@w3.org from February 2009)

From: Ian Hickson <ian@hixie.ch>
Date: Fri, 13 Feb 2009 21:49:45 +0000 (UTC)
To: Kjetil Kjernsmo <kjetil@kjernsmo.net>, Michael Hausenblas <michael.hausenblas@deri.org>
Cc: Dan Connolly <connolly@w3.org>, Sam Ruby <rubys@intertwingly.net>, Dan Brickley <danbri@danbri.org>, Michael Bolger <michael@michaelbolger.net>, public-rdfa@w3.org, RDFa mailing list <public-rdf-in-xhtml-tf@w3.org>, Tim Berners-Lee <timbl@w3.org>, Jeremy Carroll <jeremy@topquadrant.com>
Message-ID: <Pine.LNX.4.62.0902132117010.952@hixie.dreamhostps.com>

On Fri, 13 Feb 2009, Kjetil Kjernsmo wrote:
> 
> I can't speak for the RDFa community, but the reason you can't see a lot 
> of problem descriptions separate from technical solution is probably 
> that the community feels that RDF is a well established technology, and 
> so the focus is on showing how it is used rather than abstract 
> speculation on how it could be used.

It's certainly well-established in certain circles, but it's unfortunately 
the case that technologies have to rejustify themselves each time they 
enter new areas. RDFa isn't currently that well established as a general 
authoring language, and most authors haven't interacted with RDF knowingly 
at all.

RDF is not unique in this regard, by the way. HTML itself has had to 
reprove itself many times; currently there are many developers who are 
trying to decide what language to use to develop their next application, 
and HTML is a new contender in that race. Ten years ago, few people 
outside of the cutting-edge browser space would have thought to make an 
application in HTML, but after making its case to developers, it is now a 
seriously considered option.

> As for RDF use cases, please see e.g. 
> http://www.w3.org/2001/sw/sweo/public/UseCases/

There is a significant difference between case studies (examples of actual 
usage) and problem descriptions (examples of actual problems that might 
lead or might have led to usage). To be blunt, the existence of something 
using a technology is not an indication that the technology was a good 
solution. It can, however, lead to very useful experience: do any of the 
case studies listed above have frank evaluations of whether Semantic Web 
technologies have been successful? Most interesting would be reports from 
failed experiments -- the Semantic Web, like any technology, is not going 
to be right for everything; to what has it been found to _not_ be well 
suited? (The existence of reports showing failure increases the 
credibility of reports showing success.)

On Fri, 13 Feb 2009, Michael Hausenblas wrote:
> 
> In [1] you asked, quite rightly, for 'problem statements' re RDFa. I've 
> pointed out two (IMHO important) ones at [1] which you *might* have 
> overlooked. I'd be happy to learn from you if you think these are 
> 'acceptable':
> 
> 1. Service and product provider can't include the meaning of the things 
> they publish in HTML. For example, how do you find out where the price 
> of a book is located in, say, a page from Amazon? Now, people that want 
> to use this data are forced to perform *screen scraping*, that is, there 
> is a need for publisher-push rather than consumer-pull semantics.
>
> 2. People doing data mash-ups need to learn a plethora of APIs/formats 
> while all they would likely want is *one data model* + and a bunch of 
> vocabularies covering the domain.
> 
> [1] http://realtech.burningbird.net/semantic-web/semantic-markup/stop-justifying-rdfa

I hadn't seen your comment (though I had noted some of the other comments 
from that blog entry), but I have now added it to my list, thanks.

It should be noted that there are pretty simple solutions to both of the 
above, though. For example, for case 1 Amazon could just say "anything 
with class=price indicates the price for the item described by the nearest 
ancestor block with class=item" or some such, or they could expose the 
information in a much simpler way by having a "&format=json" mode for 
their pages that is purely machine-readable data. Or they could do what 
they in fact do do, which is expose this using a dedicated API:

   http://docs.amazonwebservices.com/AWSEcommerceService/2006-05-17/ApiReference/ItemLookupOperation.html

What we find is that in fact RDFa would not solve their problem here, 
since they apparently feel they need (for whatever reason) to track 
per-developer usage of this information. Thus, they require a unique URI 
to be used for each developer obtaining the information, and would 
presumably therefore not _want_ to expose it on their main product page.
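
To make the class=price suggestion above concrete, here is a minimal sketch 
of a consumer for such a convention, using only Python's standard library. 
The convention itself is the hypothetical one described above, not anything 
Amazon publishes; the sample markup and the names PriceExtractor and 
extract_prices are assumptions made for illustration.

```python
from html.parser import HTMLParser

# Elements that never have an end tag, so they are never pushed on the stack.
VOID = {"area", "base", "br", "col", "embed", "hr", "img", "input",
        "link", "meta", "source", "track", "wbr"}


class PriceExtractor(HTMLParser):
    """Collects the text of class=price elements, keyed by the nearest
    ancestor with class=item (hypothetical convention)."""

    def __init__(self):
        super().__init__()
        self.stack = []      # currently open elements
        self.item_count = 0  # items are numbered in document order
        self.prices = []     # (item index or None, price text)

    def handle_starttag(self, tag, attrs):
        if tag in VOID:
            return
        classes = (dict(attrs).get("class") or "").split()
        entry = {"tag": tag, "item": None, "owner": None, "text": None}
        if "price" in classes:
            # The nearest open ancestor carrying class=item owns this price.
            for open_el in reversed(self.stack):
                if open_el["item"] is not None:
                    entry["owner"] = open_el["item"]
                    break
            entry["text"] = []
        if "item" in classes:
            entry["item"] = self.item_count
            self.item_count += 1
        self.stack.append(entry)

    def handle_data(self, data):
        # Text belongs to the innermost open price element, if there is one.
        for open_el in reversed(self.stack):
            if open_el["text"] is not None:
                open_el["text"].append(data)
                break

    def handle_endtag(self, tag):
        if tag in VOID:
            return
        # Close everything up to the matching open tag (lenient about
        # unclosed elements); ignore stray end tags.
        for i in range(len(self.stack) - 1, -1, -1):
            if self.stack[i]["tag"] == tag:
                for closed in reversed(self.stack[i:]):
                    if closed["text"] is not None:
                        text = "".join(closed["text"]).strip()
                        self.prices.append((closed["owner"], text))
                del self.stack[i:]
                break


def extract_prices(html):
    parser = PriceExtractor()
    parser.feed(html)
    parser.close()
    return parser.prices


SAMPLE = """
<div class="item"><h2>Book A</h2><span class="price">$10.00</span></div>
<div class="item"><h2>Book B</h2><p>Now <b class="price">$7.50</b></p></div>
"""

if __name__ == "__main__":
    for item, price in extract_prices(SAMPLE):
        print(item, price)
```

A price that appears outside any class=item block is reported with None as 
its item, so the caller can decide whether to ignore it.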

With case 2, I don't see how forcing all data into one data model 
actually helps anybody. If you want to merge file system metadata, then 
you want a tree structure. If you want to merge family history data, you 
want a directed graph. If you want to perform a scripted operation on a 
set of binary files, then a script object and a dictionary mapping 
filenames to binary blobs is probably most useful.
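
As a rough illustration of the three shapes just mentioned, here is a small 
sketch in Python. All of the data and the function names (total_size, 
ancestors, apply_to_all) are invented for the example; the point is only 
that each task reads most naturally against its own structure.

```python
# Tree: directories are dicts, files are leaf values (here, a size in bytes).
fs_tree = {"home": {"notes.txt": 120, "photos": {"cat.jpg": 48213}}}


def total_size(node):
    """Sum file sizes by walking the tree recursively."""
    if isinstance(node, int):
        return node
    return sum(total_size(child) for child in node.values())


# Directed graph: each person points at their known parents.
parents = {
    "alice": ["carol", "dave"],
    "bob": ["carol", "dave"],
    "carol": ["erin"],
}


def ancestors(person):
    """Follow parent edges transitively, visiting each person once."""
    seen = set()
    todo = list(parents.get(person, []))
    while todo:
        current = todo.pop()
        if current not in seen:
            seen.add(current)
            todo.extend(parents.get(current, []))
    return seen


# Dictionary of blobs: filenames mapped to raw bytes, plus a callable.
blobs = {"a.bin": b"\x00\x01", "b.bin": b"\xff"}


def apply_to_all(files, operation):
    """Run one scripted operation over every file's contents."""
    return {name: operation(data) for name, data in files.items()}
```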

The difficulty with dealing with data from multiple sources is rarely the 
data format (a problem not solved by RDF anyway) or the data model, it's 
usually with the semantics of the vocabularies involved. For example, 
merging MP3/ID3 data (dedicated vocabulary with dedicated format embedded 
in MP3 files) with an iTunes library data dump (dedicated vocabulary with 
XML format) would not be easier if they were both expressed as RDF using 
different vocabularies. If anything, frankly, the problem would get 
harder. One is reminded of jwz's infamous quip about regular expressions.
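
A small sketch of where the work in such a merge actually goes, assuming 
both sources have already been parsed into plain dictionaries. The record 
contents are invented, and from_id3 and from_itunes are hypothetical 
helpers; the field names follow the ID3v2 frame identifiers and the keys 
used in iTunes library exports as I understand them.

```python
def from_id3(frames):
    """Map ID3v2 text frames onto a common record.

    TRCK holds the track position as "number" or "number/count", so it
    has to be split before it can be compared with anything else.
    """
    number, _, count = frames.get("TRCK", "").partition("/")
    return {
        "title": frames.get("TIT2"),
        "artist": frames.get("TPE1"),
        "track_number": int(number) if number else None,
        "track_count": int(count) if count else None,
    }


def from_itunes(entry):
    """Map one track entry from an iTunes library dump onto the same record.

    Here the track position is already two separate integer fields.
    """
    return {
        "title": entry.get("Name"),
        "artist": entry.get("Artist"),
        "track_number": entry.get("Track Number"),
        "track_count": entry.get("Track Count"),
    }


id3_record = {"TIT2": "Example Song", "TPE1": "Example Artist", "TRCK": "3/12"}
itunes_record = {
    "Name": "Example Song",
    "Artist": "Example Artist",
    "Track Number": 3,
    "Track Count": 12,
}

if __name__ == "__main__":
    print(from_id3(id3_record) == from_itunes(itunes_record))
```

None of the mapping code above is about syntax. Someone still has to know 
that TRCK packs two values into one string, and that knowledge would be 
needed just the same if both sources were RDF in different vocabularies.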

This isn't to say that RDF doesn't have its uses; of course it does. The 
question is what are they, and do they justify adding syntax to HTML.

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'

Received on Friday, 13 February 2009 21:50:31 UTC