
Re: Recipe for Shops: Showing up in Yahoo and in the Web of Data in One Turn

From: Martin Hepp (UniBW) <martin.hepp@ebusiness-unibw.org>
Date: Wed, 22 Jul 2009 15:26:10 +0200
Message-ID: <4A671372.8080301@ebusiness-unibw.org>
To: bnowack@semsol.com
CC: semantic-web at W3C <semantic-web@w3c.org>
Hi Benjamin, all:
First, thanks for the feedback!
Second, I have just completely updated the page. Among other things, I 
added datatypes to xsd:string literals and reduced the complexity of the 
linking between visible content and meta-data (e.g. opening hours specs).

All examples and content at

   * http://www.ebusiness-unibw.org/wiki/GoodRelations_and_Yahoo_SearchMonkey

and the sample files at

   * http://www.heppnetz.de/searchmonkey/company.html and
   * http://www.heppnetz.de/searchmonkey/product.html

validate now with Yahoo! tools, W3C Markup Validation tools, and the RDF 
Validator (after extraction via pyRDFa).
As to your points:


Benjamin Nowack wrote:
> Interesting. I guess this is another argument/example pro Hugh 
> Glaser's idea of simply conflating resource IDs for the sake of
> "deployability". The example types <#business> as Vcard, Business
> and also as BusinessEntity which would usually be considered wrong
> RDF, but, as argued before, is more intuitive for HTML authors, 
> especially if they found their way to the SemWeb through pragmatic
> solutions like microformats. We should really give this contextual
> semantics idea another thought.
>   
Actually, I disagree completely.

Conflating multiple resources under one URI is deadly, because it 
compromises reuse and recombination of data.

In my example, commerce:Business and gr:BusinessEntity are practically 
equivalent classes, so this pair is rather a schema alignment than using 
one URI for distinct things.
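
To make the distinction concrete, here is a minimal Python sketch (an
illustration only, not part of the recipe). It treats triples as plain
tuples and assumes, purely for the example, that the two classes have
been declared equivalent; the EQUIVALENT_CLASSES list and the
expand_types helper are hypothetical names. Typing one resource with
both classes then adds nothing that the alignment would not already
imply, which is different from letting one URI stand for two distinct
things.

```python
# Triples are plain (subject, predicate, object) tuples; no RDF library is used.
# ASSUMPTION: this equivalence is declared here for illustration only.
EQUIVALENT_CLASSES = [("commerce:Business", "gr:BusinessEntity")]


def expand_types(triples, equivalences=EQUIVALENT_CLASSES):
    """Return the triples plus the rdf:type triples implied by the equivalences."""
    result = set(triples)
    changed = True
    while changed:  # repeat until no new type triple can be derived
        changed = False
        for a, b in equivalences:
            for s, p, o in list(result):
                if p != "rdf:type":
                    continue
                for src, dst in ((a, b), (b, a)):
                    implied = (s, "rdf:type", dst)
                    if o == src and implied not in result:
                        result.add(implied)
                        changed = True
    return result


# One resource, typed with only one of the two aligned classes.
data = {("#business", "rdf:type", "gr:BusinessEntity")}
expanded = expand_types(data)
for triple in sorted(expanded):
    print(triple)
```

After expansion, <#business> carries both types, exactly as in the
example page, and still denotes a single thing.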

The business entity is also typed as vcard:VCard only because the 
upcoming vCard2006 cleansing is not yet available. In that revision, the 
domain of vcard:adr is likely to be widened from vcard:VCard to a 
broader set of classes, because most locations, persons, and legal 
entities can have addresses, not only vCards. (It is you who has an 
address, not your business card.)

So again, this was only a work-around introduced by Yahoo! to make the 
whole thing fly now, not later. And the ontological nature of 
vcard:VCard is now understood pretty broadly, subsuming commerce:Business 
and gr:BusinessEntity.

> (I fear you'll lose a significant chunk of the possible audience at 
> "change your DTD" and "add ... to the head tag", these sort of tweaks
> are not necessarily easy to do in CMS-based or commercial publishing 
> tools unless there is a dedicated plugin that is not erased with the 
> next site upgrade. For root/head-level changes, the content *authors*
> have to coordinate their tasks with the tech/site *admins*, which 
> leads to non-trivial friction loss and hence lowers the deployment 
> probability.)
>   
Well, there is nothing I can do about that; it is simply an important 
technical requirement. If you omit it, the content will no longer 
validate, and data extraction and reuse turn from a predictable 
computational operation into probabilistic guesswork: it may work, or it 
may not. Then you are back in the realm of pure NLP (almost ;-)).
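
As a rough illustration of what a publishing tool could check before
serving a page, the following Python sketch tests whether a document's
DOCTYPE names the XHTML+RDFa 1.0 DTD. The function name
declares_rdfa_doctype and the two sample strings are made up for this
example, and the check is a plain string test, not a substitute for
running the validators mentioned above.

```python
import re

# Public identifier of the XHTML+RDFa 1.0 DTD.
RDFA_PUBLIC_ID = "-//W3C//DTD XHTML+RDFa 1.0//EN"


def declares_rdfa_doctype(html: str) -> bool:
    """Return True if the first DOCTYPE declaration names the XHTML+RDFa 1.0 DTD."""
    match = re.search(r"<!DOCTYPE[^>]*>", html, flags=re.IGNORECASE)
    return bool(match) and RDFA_PUBLIC_ID in match.group(0)


# Hypothetical page fragments, used only to exercise the check.
rdfa_page = (
    '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML+RDFa 1.0//EN" '
    '"http://www.w3.org/MarkUp/DTD/xhtml-rdfa-1.dtd">\n<html></html>'
)
plain_page = (
    '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" '
    '"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">\n<html></html>'
)

print(declares_rdfa_doctype(rdfa_page))
print(declares_rdfa_doctype(plain_page))
```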

Note that Drupal now has a mode that activates an automatic DOCTYPE 
replacement for serving RDFa. More info at:

http://drupal.org/node/391372

I think that at least such basic RDFa support will soon be a mandatory 
feature for any CMS on the market.
> A general suggestion would be to keep the added markup at a minimum,
> until GR is more deployed and people start asking for more on their
> own. Remove as many non-mandatory descriptions as possible, at least
> if the recipes are targeted at newcomers. Stuff like
> "ProductOrServicesSomeInstancesPlaceholder" or
> "LocationOfSalesOrServiceProvisioning" is probably not very 
> attractive to web dev people who are only just getting acquainted 
> with structured markup and want to check out if/how it works.
>   
Well - some of the element names in GoodRelations may be a bit long, but 
it was initially important to convey the precise meaning. Some could be 
shortened, but at this stage of quick adoption, I think a few characters 
more or less are not worth risking additional incompatibilities between 
evolving applications and data.
Also note that a typical shop may have just a few HTML templates, e.g. 
for the company and the product detail pages. Ten lines of additional 
markup may be worth it.
> I *personally* think that RDF-in-HTML snippets are most convincing 
> when the amount of additional RDF markup does not outweigh the 
> human-oriented content. 
IMO, there is a dangerous tendency in part of current Web of Data 
research: after the frustration about the complexity (and limited 
impact) of logic-centric work, many researchers now want to keep things 
deadly simple. I am afraid that if you want really powerful meta-data, 
things will be more complex than adding "dc:title".

> Otherwise it becomes hard to track the 
> initial meaning of the page and examples become less illustrative.
> Maybe drop some of the @typeofs which repeat the @rel values (e.g. 
> as in
>    <div rel="gr:hasOpeningHoursSpecification">
>       <div typeof="gr:OpeningHoursSpecification">
> ), 
Maybe I did not get it, but I do not see how you can drop any of those 
without compromising the data: the typeofs are important for typing the 
nodes, and the rels are important for typing the relationships.
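
A small Python sketch may help here. The two triples below are written
by hand to stand in for what an RDFa extractor would produce from the
quoted snippet; the subject "#shop", the blank node label "_:hours",
and the helper functions are hypothetical. Removing either attribute
removes exactly one of the two triples, and with it one kind of query
stops returning results.

```python
# Hand-written stand-ins for the extracted triples (not real extractor output).
# "#shop" and "_:hours" are hypothetical identifiers chosen for this example.
link = ("#shop", "gr:hasOpeningHoursSpecification", "_:hours")   # from @rel
typing = ("_:hours", "rdf:type", "gr:OpeningHoursSpecification")  # from @typeof

full = {link, typing}
without_typeof = full - {typing}  # the node is linked but untyped
without_rel = full - {link}       # the node is typed but unconnected


def instances_of(triples, cls):
    """Subjects that are explicitly typed with the given class."""
    return {s for s, p, o in triples if p == "rdf:type" and o == cls}


def objects_of(triples, subject, predicate):
    """Objects reachable from the subject via the given predicate."""
    return {o for s, p, o in triples if s == subject and p == predicate}


print(instances_of(full, "gr:OpeningHoursSpecification"))
print(instances_of(without_typeof, "gr:OpeningHoursSpecification"))
print(objects_of(without_rel, "#shop", "gr:hasOpeningHoursSpecification"))
```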
> or cheat visually by picking some shorter, less Cyc/AI-like 
> predicate names, perhaps? 
>   
As I said, there may be some cleansing of element IDs in the future, but 
none of the current GoodRelations updates will invalidate any 
pre-existing data or applications.


Note that in the LOD cloud, there are already 1 million instances of 
gr:ProductOrServiceModel and some 45,000 instances of gr:BusinessEntity. 
Neither figure yet includes the vast amount of data from the new RDF 
Book Mashup at

http://www4.wiwiss.fu-berlin.de/bizer/bookmashup/

which exposes a great many book offers on the Web as GoodRelations data.
> Just some thoughts,
> Benji
>   

Again, thanks for the feedback!

Best

Martin

> --
> Benjamin Nowack
> http://bnode.org/
> http://semsol.com/
>
> On 21.07.2009 19:42:00, Martin Hepp (UniBW) wrote:
>   
>> Dear all:
>>
>> I just completed a recipe meant for larger audiences (Web developers,
>> SEO companies) on how a business can enrich its pages using
>> RDFa+GoodRelations so that the data
>> - shows up in Yahoo AND
>> - it at the same time useful for comprehensive RDF applications.
>>
>> The recipe is at
>>
>> http://tr.im/rAbN
>>
>> It tries to combine pure recipes from the RDF world with the "Web
>> developer's" how-tos provided by Yahoo.
>>
>> Any feedback is very welcome.
>>
>> Best
>>
>> Martin Hepp
>>
>>     
>   

-- 
--------------------------------------------------------------
martin hepp
e-business & web science research group
universitaet der bundeswehr muenchen

e-mail:  mhepp@computer.org
phone:   +49-(0)89-6004-4217
fax:     +49-(0)89-6004-4620
www:     http://www.unibw.de/ebusiness/ (group)
         http://www.heppnetz.de/ (personal)
skype:   mfhepp 
twitter: mfhepp

Check out the GoodRelations vocabulary for E-Commerce on the Web of Data!
========================================================================

Webcast:
http://www.heppnetz.de/projects/goodrelations/webcast/

Talk at the Semantic Technology Conference 2009: 
"Semantic Web-based E-Commerce: The GoodRelations Ontology"
http://tinyurl.com/semtech-hepp

Tool for registering your business:
http://www.ebusiness-unibw.org/tools/goodrelations-annotator/

Overview article on Semantic Universe:
http://tinyurl.com/goodrelations-universe

Project page and resources for developers:
http://purl.org/goodrelations/

Tutorial materials:
Tutorial at ESWC 2009: The Web of Data for E-Commerce in One Day: A Hands-on Introduction to the GoodRelations Ontology, RDFa, and Yahoo! SearchMonkey

http://www.ebusiness-unibw.org/wiki/GoodRelations_Tutorial_ESWC2009





Received on Wednesday, 22 July 2009 13:26:57 UTC
