- From: Brian Suda <brian.suda@gmail.com>
- Date: Mon, 4 Sep 2006 22:42:02 +0000
- To: public-grddl-wg <public-grddl-wg@w3.org>
This is a first pass at the guitar review scenario for the GRDDL Primer.

http://suda.co.uk/sandbox/GRDDL/Primer.htm

Any suggestions, advice or ideas are more than welcome. I've never written a primer, so I'm not exactly sure what needs to be in or out. Some of the SPARQL examples still need to be created, but if there is something you don't understand or that I need to explain more, please let me know. Hopefully we can discuss this further on Wednesday.

On a side note, I am also trying to get all the software installed on a web server to actually replicate what is being described.

Primer: Using GRDDL & Microformats to Aggregate Data

Stephan wishes to buy a guitar, so he decides to check reviews. There are various special interest publications online which feature musical instrument reviews. There are also blogs which contain reviews by individuals. Among the reviewers there may be friends of Stephan and people whose opinion Stephan values (e.g. well-known musicians and people whose reviews Stephan has found useful in the past). There may also be reviews planted by instrument manufacturers, which offer very biased views.

First, Stephan needs to get a list of people he considers trusted sources into some sort of machine-readable document. FoaF and vCard-RDF are both suitable sources to extract the data from. The question is how to get these values. Microformats define simple formats that can easily be converted from HTML to RDF through the use of GRDDL. To extract vCard-RDF from HTML you can use (hCard2vcardrdf.xsl ???), which will transform an hCard-encoded HTML document.
    <address class="vcard" id="smith-stephan">
      <a href="http://example.org" class="fn url">Stephan Smith</a>
    </address>

This snippet of HTML is converted into RDF with the use of the XSLT:

    <?xml version="1.0" encoding="utf-8"?>
    <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
             xmlns:vCard="http://www.w3.org/2001/vcard-rdf/3.0#">
      <rdf:Description rdf:about="http://example.org/">
        <vCard:FN>Stephan Smith</vCard:FN>
        <vCard:URL>http://example.org/</vCard:URL>
      </rdf:Description>
    </rdf:RDF>

Another microformat that allows more information to be gleaned from the document is XFN, the XHTML Friends Network. Using values in the rel attribute, it is possible to assert the types of relationships between the site owner and their friends, colleagues, co-workers, etc. Since XFN values are found on 'a' elements, each one gives us another resource to follow and search for more hCards and more XFN values. This allows us to widen the circle of trust from our direct friends to the friends of our friends.

    <ul>
      <li><a href="http://" rel="met friend colleague">Peter Smith</a></li>
      <li><a href="http://" rel="met">John Doe</a></li>
      <li><a href="http://" rel="met">Paul Revere</a></li>
    </ul>

Given a seed URL with XFN data, a GRDDL transformation can extract FoaF data about all of these people. That FoaF file will then give us an additional list of URLs that can be spidered for GRDDL vCard-RDF data about each friend.

Another property in XFN is 'me', which is used for identity consolidation. With this value it is possible to say that the data on another site is also about me and should be treated as if it came from my own site. This extends the range of resources we can draw on.
For instance:

    <ul>
      <li><a href="http://del.icio.us/guitar-rocker45" rel="me">My Del.icio.us Link</a></li>
      <li><a href="http://claimid.com/guitar-rocker" rel="me">Me on ClaimID</a></li>
      <li><a href="http://guitar-rocker.com" rel="me">I love guitars</a></li>
    </ul>

The power of rel="me" and identity consolidation is that it allows us to glean data from multiple sources and merge it all into a single RDF document about a single individual. The Del.icio.us links could be encoded into RDF and associated with a user "guitar-rocker45", but because of the rel="me" links and any reciprocal link back to "example.org", assertions can be made that the bookmarks have an owner "Stephan Smith" who has an RDF-vCard at "example.org" and has data in other places on other services such as claimid.com and guitar-rocker.com. All of these can be merged to form a bigger picture of "Stephan Smith" at "example.org".

On the guitar site, there are product reviews for each guitar. The guitars are also marked up with microformats, so it is possible to extract machine-readable data about each item. Along with manufacturer data, each member of the site can also leave feedback about the item in the form of a review.

Stephan's friend Peter Smith has written several reviews of new guitars. Each review has a link to the reviewer, which in this case is a link back to Peter's profile page on the guitar site. We know by visual inspection that the profile page belongs to Stephan's friend Peter, but a machine does not. Luckily, Peter's profile page on the guitar site allows him to link back to his own personal site. This link has a rel="me" value. Now a machine can assert that the Peter on the guitar site is the same Peter that is listed in Stephan's XFN list, which was converted to FoaF, because the URLs resolve to the same resource.

With all of these tools it is possible to find Stephan's friends and to find additional resources that we know those friends created.
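The matching step described above can be sketched in a few lines of Python. This is only an illustration, not part of any GRDDL toolchain: it reads the rel values directly from the HTML with the standard library instead of going through an XSLT transformation and FoaF, and it compares URLs as normalised strings rather than actually resolving them. The example.net URLs, the sample markup and the helper names (rel_links, normalize, trusted_urls, is_trusted_profile) are all assumptions made up for this sketch.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit


class RelLinkParser(HTMLParser):
    """Collect (href, set of rel values) for every <a> that has both attributes."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr = dict(attrs)
        href, rel = attr.get("href"), attr.get("rel")
        if href and rel:
            # rel is a space-separated list of values, e.g. "met friend colleague"
            self.links.append((href, set(rel.lower().split())))


def rel_links(html):
    """Return all (href, rel values) pairs found in an HTML fragment."""
    parser = RelLinkParser()
    parser.feed(html)
    parser.close()
    return parser.links


def normalize(url):
    """Crude comparison key: lowercase scheme and host, ignore a trailing slash.
    A stand-in for really checking that two URLs resolve to the same resource."""
    parts = urlsplit(url)
    return (parts.scheme.lower(), parts.netloc.lower(), parts.path.rstrip("/") or "/")


def trusted_urls(xfn_html, wanted=frozenset({"friend"})):
    """Normalised URLs of contacts whose rel includes any of the wanted XFN values."""
    return {normalize(href) for href, rels in rel_links(xfn_html) if rels & wanted}


def is_trusted_profile(profile_html, trusted):
    """True if any rel="me" link on the profile page points at a trusted URL."""
    return any("me" in rels and normalize(href) in trusted
               for href, rels in rel_links(profile_html))


# Hypothetical XFN list on Stephan's site (hrefs invented for this sketch).
STEPHAN_XFN = """
<ul>
  <li><a href="http://peter.example.net/" rel="met friend colleague">Peter Smith</a></li>
  <li><a href="http://john.example.net/" rel="met">John Doe</a></li>
</ul>
"""

# Hypothetical profile page for Peter on the guitar site.
GUITAR_PROFILE = """
<address class="vcard">
  <a href="http://PETER.example.net" rel="me" class="fn url">Peter Smith</a>
</address>
"""

trusted = trusted_urls(STEPHAN_XFN)
print(is_trusted_profile(GUITAR_PROFILE, trusted))  # prints True
```

Only Peter is marked "friend" in the sample list, so John's URL never enters the trusted set. A real implementation would also want to check for the reciprocal link from Peter's own site back to the profile page before trusting a rel="me" claim.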
Using GRDDL it is possible to glean information about the guitar in the form of product specifications supplied by the manufacturer and reviews from site members. Once we have this data as RDF, it can be passed into a SPARQL engine and queries can be run on it. If Stephan is looking for a guitar in a specific price range, by a certain manufacturer, with a specific review rating or higher, from a selected group of friends, we now have enough data in RDF to do just that.

EXAMPLE SPARQL QUERY HERE

The first restriction on the data can be a pass on manufacturer data such as price, type, etc. Once we have all the matching guitars, we can then restrict them based on the reviews written by Stephan's friends. Using a seeded list of XFN URLs given by Stephan that are converted to FoaF, we can match those URLs against any URL from the vCard-RDF generated from the profile pages of the guitar site's members. Now we have a list of members that Stephan trusts on the guitar site. We can also get a list of reviews that those trusted members have written. We can then execute a UNION on the original data restricted on manufacturer specs and the data from Stephan's friends' reviews. The resulting set is a SPARQL result matching our original question.

EXAMPLE SPARQL QUERY HERE

This SPARQL result is in XML or JSON and can easily be consumed by another application. That application can display the results on screen, email them to Stephan, or search the web for the best prices on the short list of guitars.

Brian Suda, $Id: Primer.html,v 0.01 2006/09/04 $

--
brian suda
http://suda.co.uk
Received on Monday, 4 September 2006 22:42:18 UTC