- From: Manu Sporny <msporny@digitalbazaar.com>
- Date: Sun, 28 Sep 2008 23:25:27 -0400
- To: RDFa Community <public-rdfa@w3.org>
If you take a look at the last part of the XHTML+RDFa and triples generated for this URL:

http://rdfa.digitalbazaar.com/demos/digg/stemcells.html

you will notice three additional concepts for each story submitted to Digg: Threads, Users and Posts. Each Digg story is effectively an sioc:Thread concept, with each User reply being an sioc:Post concept. Threads effectively contain Posts. Each user could be marked up using FOAF, but I tried to keep things simple and to see what could be done with pure SIOC and Dublin Core.

Each Thread entry looks something like the following:

<h3 id="title">
  <a href="http://www.technologyreview.com43/Biotech/210/"
     rel="dcterms:source" property="dcterms:title">
    At Last-Stem Cells without Side Effects!
  </a>
</h3>
<p>
  <em class="url">technologyreview.com</em>
  <span about="" typeof="sioc:Thread" property="dcterms:abstract">
    Researchers at Harvard University, the Harvard Stem Cell Institute,
    and the MGH Center for Regenerative Medicine have found a way to
    create healthy stem cells from adult cells--no embryo
    required--using an adenovirus.
  </span>
</p>
...
<a rel="dcterms:creator" rev="sioc:creator_of"
   href="http://digg.com/users/JusTuring">
  <div about="http://digg.com/users/JusTuring" typeof="sioc:User">
    <span rel="sioc:avatar">
      <img src="http://digg.com/users/JusTuring/s.png" alt="JusTuring"
           class="user-photo" height="16" width="16" />
    </span>
    <span property="sioc:name">JusTuring</span>
  </div>
</a>

Digg was already marking up the source, title and date; the markup above goes a bit further and states that the page is a Thread. It also marks up the person who posted the story as a sioc:User and identifies their avatar image (sioc:avatar) and name (sioc:name).

Here are the triples that are generated from the markup above (<> is short-hand for "current page"):

<> dcterms:creator <http://digg.com/users/JusTuring> .
<http://digg.com/users/JusTuring> sioc:creator_of <> .
<> dcterms:source <http://www.technologyreview.com43/Biotech/210/> .
<> dcterms:title "At Last-Stem Cells without Side Effects!" .
<> rdf:type sioc:Thread .
<> dcterms:abstract "Researchers at Harvard University, the..." .
<http://digg.com/users/JusTuring> rdf:type sioc:User .
<http://digg.com/users/JusTuring> sioc:avatar <http://digg.com/users/JusTuring/s.png> .
<http://digg.com/users/JusTuring> sioc:name "JusTuring" .

Comments on each thread are marked up as sioc:Post concepts with markup that looks like the following:

<li about="http://digg.com/posts/129387" typeof="sioc:Post" resource=""
    class="l0" id="c19182108">
  <div about="http://digg.com/users/VegasKill" typeof="sioc:User"
       rel="sioc:creator_of" resource="http://digg.com/posts/129387"
       property="sioc:name">VegasKill</div>,
  <a about="" rev="sioc:reply_to" href="http://digg.com/posts/129387">
    <span property="dcterms:dateSubmitted"
          content="2008-09-26T13:24:31+01:00">
      36 minutes ago
    </span>
  </a>,
  -3/+1
  <span property="sioc:content">
    Scripts ahoy. But otherwise progress made.
  </span>
</li>

I had to fudge the post URL a bit to make this easier to read, but basically, each post has an associated sioc:User who is the creator of the post, and each post is a reply to the parent thread. Each post also has a dcterms:dateSubmitted value as well as the content of the post. This results in the following triples:

<http://digg.com/posts/129387> rdf:type sioc:Post .
<http://digg.com/posts/129387> dcterms:dateSubmitted "2008-09-26T13:24:31+01:00" .
<http://digg.com/posts/129387> sioc:content "Scripts ahoy. But otherwise progress made." .
<http://digg.com/posts/129387> sioc:reply_to <> .
<http://digg.com/users/VegasKill> rdf:type sioc:User .
<http://digg.com/users/VegasKill> sioc:creator_of <http://digg.com/posts/129387> .
<http://digg.com/users/VegasKill> sioc:name "VegasKill" .

All of this is a first cut and there are many more relationships that you can mark up once this first cut is implemented. Some relationships, such as number of Diggs/votes, don't exist in the SIOC vocabulary.
Perhaps somebody else knows about a vocabulary that has the concept of social news digging/voting? If not, feel free to create the vocabulary and put it online somewhere... perhaps after talking with some of the folks from Newsvine, Reddit, Delicious.

There are other concepts in here that are arguable - such as the concept that the page is an sioc:Thread and the sioc:creator_of of that page/thread is the person that posted the story. A reasoning agent could mistakenly assume that the sioc:creator_of the page is also the creator of the ads and other items in the page, which isn't true. So, some care should be taken in making these decisions. In other words, ask yourself how the triples could be mis-understood by reasoning agents and UIs that need to display the triples to a data mining agent.

Hope this helps, and please do ask questions if any of this doesn't make sense :)

-- manu

--
Manu Sporny
President/CEO - Digital Bazaar, Inc.
blog: Bitmunk 3.0 Website Launches
http://blog.digitalbazaar.com/2008/07/03/bitmunk-3-website-launches
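
As a rough illustration of how the triples in this message might be consumed, here is a minimal Python sketch. It is not part of the demo page and does not use any RDF library: the triples are copied by hand into plain tuples, the string "<>" stands in for the current page as in the listings above, and the helper names (objects, subjects, replies_with_authors) are hypothetical.

```python
# A hand-copied subset of the triples from the message.
# "<>" is a placeholder for the current page (the sioc:Thread).
PAGE = "<>"
POST = "http://digg.com/posts/129387"
JUSTURING = "http://digg.com/users/JusTuring"
VEGASKILL = "http://digg.com/users/VegasKill"

TRIPLES = [
    (PAGE, "rdf:type", "sioc:Thread"),
    (PAGE, "dcterms:creator", JUSTURING),
    (JUSTURING, "sioc:creator_of", PAGE),
    (JUSTURING, "rdf:type", "sioc:User"),
    (JUSTURING, "sioc:name", "JusTuring"),
    (POST, "rdf:type", "sioc:Post"),
    (POST, "sioc:reply_to", PAGE),
    (POST, "sioc:content", "Scripts ahoy. But otherwise progress made."),
    (VEGASKILL, "rdf:type", "sioc:User"),
    (VEGASKILL, "sioc:creator_of", POST),
    (VEGASKILL, "sioc:name", "VegasKill"),
]


def objects(triples, subject, predicate):
    """Return every object o such that (subject, predicate, o) is present."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def subjects(triples, predicate, obj):
    """Return every subject s such that (s, predicate, obj) is present."""
    return [s for s, p, o in triples if p == predicate and o == obj]


def replies_with_authors(triples, thread):
    """List (author name, post content) for each post replying to thread."""
    result = []
    for post in subjects(triples, "sioc:reply_to", thread):
        content = objects(triples, post, "sioc:content")
        for user in subjects(triples, "sioc:creator_of", post):
            names = objects(triples, user, "sioc:name")
            # Fall back to the user URL or an empty string if data is missing.
            result.append((names[0] if names else user,
                           content[0] if content else ""))
    return result


if __name__ == "__main__":
    for name, content in replies_with_authors(TRIPLES, PAGE):
        print(f"{name}: {content}")
```

Note that subjects(TRIPLES, "sioc:creator_of", PAGE) returns JusTuring for the page as a whole, which is exactly the kind of statement a reasoning agent could over-interpret in the way described above.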
Received on Monday, 29 September 2008 03:26:05 UTC