- From: Manu Sporny <msporny@digitalbazaar.com>
- Date: Wed, 18 Mar 2009 02:32:07 -0400
- To: RDFa <public-rdf-in-xhtml-tf@w3.org>
- CC: "public-rdfa@w3.org" <public-rdfa@w3.org>
Ben Adida wrote:
> Yahoo has launched even more RDFa coolness: embed RDFa on your site to
> describe your flash games and videos, and they show up embedded in Yahoo
> search results *for everyone*, *by default*.

Overall, this is great news. Very nice to see Yahoo! adopting RDFa this
deeply into their search service... I do have some gripes about the
SearchMonkey vocabularies, however...

> PS: the only thing that's a bit unfortunate is that they didn't reuse
> Digital Bazaar's media vocabulary. I hope we can find a way to create
> equivalences at some point... that's the goal of RDF, after all.

I've had a bit of time to look at Yahoo's published vocabularies and I'm
quite concerned by them and by Yahoo!'s general direction with vocabulary
design. Here's a list of the issues that I was able to find... I found
many more than are outlined here. It would be good to talk with whoever
designed their vocabularies.

You can find an overview of Yahoo!'s vocabularies here:

http://developer.yahoo.com/searchmonkey/smguide/profile_vocab.html

Issues specific to Yahoo's Media vocabulary:

Vocabulary is not machine-readable, not validate-able
-----------------------------------------------------

Yahoo's SearchMonkey media vocabulary, defined here:

http://search.yahoo.com/searchmonkey/media/

is not machine-readable. There are no RDF ranges, subClassOf, comments,
or types specified. New RDF vocabularies, especially ones from large
companies like Yahoo, should be machine-readable; otherwise it's going to
be nearly impossible to validate against them.

Monolithic Vocabulary Design
----------------------------

Rather than break Media out into multiple different vocabularies, Yahoo
has taken audio, video, text, photos, thumbnails, and re-invented sets,
and shoved them into one monolithic vocabulary, which will surely get
more and more bloated as the years go by.
Rather than create a nice vocabulary stack (like what we've been doing
for the past several years):

+--------------+
|Music Ontology|
+--------------+-------+
|    Audio     | Video |
+--------------+-------+
|        Media         |
+----------------------+

they've instead created a mega vocabulary that doesn't seem to be backed
up by any usage data... or rather, it certainly isn't backed up by the
data we collected on the subjects of audio and video. Perhaps I'm missing
some sort of grand architecture, but when you have media:Article and
media:Text (neither of which subclasses the other), then it shows that
not a great deal of design work went into your vocabularies.

Confounding Media with Media Format
-----------------------------------

Yahoo defines the following properties in media:

* media:bitrate
* media:channels
* media:duration
* media:fileSize
* media:framerate
* media:height
* media:samplingrate
* media:type
* media:width

Most of these are quite specific to web-based media formats and have
nothing to do with media in the physical world (not the Web). Many of
these can't be used to describe media:Text or media:Article. These
attributes really have nothing to do with media and should be separated
out into a different media format vocabulary.

* media:views

This one has more to do with social news sites than media.

Specification of medium using both class and property
-----------------------------------------------------

Yahoo defines both this:

media:Image, media:Audio, media:Video

and this:

media:medium - The type of object: image | audio | video | document |
executable.

What's the point of having both a 'medium' property and classes that
define the medium? media:medium shouldn't exist at all - use one or the
other, not both. Using both is confusing and will inevitably lead to more
pain for Yahoo down the line when you have to look at not only @typeof
information, but also medium information.
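
To make the class-versus-property problem concrete, here is a minimal
Python sketch of the extra check a consumer ends up needing once the
medium can be stated in two places. This is not Yahoo's code. The class
and property names come from the vocabulary above, but the example
subjects (ex:clip1, ex:clip2), the class-to-medium mapping, and the
medium_conflicts helper are all made up for illustration.

```python
# Hypothetical illustration: triples are plain (subject, predicate, object)
# tuples. Subjects, the mapping, and the helper are assumptions of this sketch.

RDF_TYPE = "rdf:type"

# Assumed mapping from each medium class to the media:medium value it implies.
CLASS_TO_MEDIUM = {
    "media:Image": "image",
    "media:Audio": "audio",
    "media:Video": "video",
}


def medium_conflicts(triples):
    """Return the subjects whose class and media:medium value disagree."""
    by_class = {}
    by_property = {}
    for s, p, o in triples:
        if p == RDF_TYPE and o in CLASS_TO_MEDIUM:
            by_class.setdefault(s, set()).add(CLASS_TO_MEDIUM[o])
        elif p == "media:medium":
            by_property.setdefault(s, set()).add(o)
    # Only subjects that state the medium both ways can contradict themselves.
    both = by_class.keys() & by_property.keys()
    return sorted(s for s in both if by_class[s] != by_property[s])


triples = [
    ("ex:clip1", RDF_TYPE, "media:Video"),
    ("ex:clip1", "media:medium", "audio"),  # contradicts the class
    ("ex:clip2", RDF_TYPE, "media:Audio"),
    ("ex:clip2", "media:medium", "audio"),  # consistent, but redundant
]

print(medium_conflicts(triples))  # prints ['ex:clip1']
```

If the medium were stated only one way, this reconciliation step would
not be needed at all.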

Naming conflict, right off of the bat
-------------------------------------

Yahoo has defined the following prefixes:

commerce, media

These conflict directly with ones that we've already created, which isn't
that big of a deal - in fact, it shows that RDFa is resilient even in
these scenarios. However, it also means that almost all of the solutions
that have been proposed for addressing the "cut-paste fragility" issue
that the WHATWG has raised are now much more difficult to implement
correctly. Which commerce and which media vocabularies do we resolve to?
I'm afraid that since Yahoo is the 300lb gorilla in the room, there will
be no place for good vocabulary designs.

These vocabularies will hurt RDFa adoption in the long run
----------------------------------------------------------

My real fear is that while Yahoo adopting RDFa will help in the short
term, these badly designed vocabularies will hurt RDFa adoption in the
long run. The worst-case scenario is seeing wide adoption of Yahoo's
media vocabulary as it currently stands, which will eventually come under
much harsher and less constructive criticism than I've outlined above.

As I stated earlier, there are many more issues with what Yahoo has done
with their SearchMonkey vocabularies that should be fixed for the benefit
of this community. We are more than glad to help them work through the
issues, as long as Yahoo is willing to have an open dialog with the RDF
vocabulary creation community.

-- manu

--
Manu Sporny
President/CEO - Digital Bazaar, Inc.
blog: Absorbing Costs Considered Harmful
http://blog.digitalbazaar.com/2009/02/27/absorbing-costs-harmful
Received on Wednesday, 18 March 2009 06:32:47 UTC