Re: Digg RDFa SIOC - Part 2 - Community and Forums

From: Kingsley Idehen <kidehen@openlinksw.com>
Date: Tue, 30 Sep 2008 17:14:59 -0400
Message-ID: <48E296D3.2040301@openlinksw.com>
To: Manu Sporny <msporny@digitalbazaar.com>
CC: RDFa Community <public-rdfa@w3.org>

Manu Sporny wrote:
> Kingsley Idehen wrote:
>> Manu,
>> Here is an example of how we RDFize Digg:
>> http://tinyurl.com/3jtysp
> Neat - first time I've used OpenLink Data Explorer. :)
> I found the "What" section most useful and the "When" and "SVG Graph"
> sections neat. I think there is a great amount of untapped usefulness in
> the "When" timeline view for news and comments.
Yes, lots :-)

Remember, these are all OAT components [1] and part of the OAT Open 
Source project.
> The data was interesting to look at as well, it has a couple of things
> about Digg that I didn't think of marking up - namely:
> sioc:has_reply (for posts)
> sioc:container_of (which could be reasoned given the rest of the
>                    information on the page, but is good that it's marked
>                    up - no need to make the reasoning agents guess.)
> sioc_types:Comment (instead of sioc:Post - it's more accurate)
> sioc:topic (is this directly related to sioc:Forum?)
> Is this stuff extracted automatically, Kingsley? 
Yes, what we call "RDF-ization" on the fly.
Note how we also use proxies to fashion dereferenceable URIs for the 
entities gleaned from these data spaces.
> Is there a
> Digg-specific data crawler? 
Yes, we have a Digg Cartridge amongst our growing collection of 
Cartridges [2].
> Curious as to how the crawler determines the
> data as it's quite accurate.
Long story short: we see information resources as data containers, much 
like DBMS engines, and the associated web services as each container's 
call-level interface (the thing that exposes the container's data 
model). We've always developed data access drivers (ODBC, JDBC, OLE DB, 
ADO.NET, XMLA) for the major DBMS engines, and we see RDFization as just 
another aspect of the same thing. I covered some of this in my Linked 
Data Planet presentation [3].
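
To make the driver analogy concrete, here is a minimal Python sketch of 
the idea. It is not OpenLink's cartridge code: the names Cartridge, 
ExampleNewsCartridge, and rdfize_with, the news.example.org host, and 
the "replies" record field are all hypothetical, chosen only to show one 
"driver" per data source behind a common interface.

```python
from abc import ABC, abstractmethod
from urllib.parse import urlparse

SIOC = "http://rdfs.org/sioc/ns#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


class Cartridge(ABC):
    """Hypothetical per-source 'driver': maps one site's records to triples."""

    @abstractmethod
    def matches(self, url: str) -> bool:
        """Return True if this cartridge understands the given resource."""

    @abstractmethod
    def rdfize(self, url: str, record: dict) -> list:
        """Return (subject, predicate, object) triples for the record."""


class ExampleNewsCartridge(Cartridge):
    """Assumed example source; the record layout is invented for this sketch."""

    def matches(self, url: str) -> bool:
        return urlparse(url).hostname == "news.example.org"

    def rdfize(self, url: str, record: dict) -> list:
        # Type the story as a sioc:Post and link each reply to it.
        triples = [(url, RDF_TYPE, SIOC + "Post")]
        for reply_uri in record.get("replies", []):
            triples.append((url, SIOC + "has_reply", reply_uri))
        return triples


def rdfize_with(cartridges: list, url: str, record: dict) -> list:
    """Dispatch to the first cartridge that claims the URL."""
    for cartridge in cartridges:
        if cartridge.matches(url):
            return cartridge.rdfize(url, record)
    raise LookupError(f"no cartridge registered for {url}")
```

The caller only ever sees rdfize_with, much as an application sees one 
data access API regardless of which DBMS sits behind the driver.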
>> I'll have our omissions fixed so that we have a complete RDF based graph
>> of Digg which should ultimately aid RDFa renditions of the original
>> (X)HTML resources from Digg.
> Hopefully you won't have to update your omissions if Digg starts
> publishing more SIOC RDFa :)
> Is there some other vocabulary that you have seen apply to other sites
> that you think Digg might also use?
Also note, I made and contributed specific extensions to SIOC itself to 
enable coherent output from this kind of crawling, extraction, and 
mapping via spaces, containers, and items (plus specific spaces covering 
discussions, bookmarks, photo galleries, etc.). Thus, SIOC covers all 
that is required, and where additional specificity is available in some 
ontology, simply use "rdf:type" to set the types of the sioc:Items for 
the relevant sioc:Container.
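
As a rough illustration of that typing pattern, the Python sketch below 
emits N-Triples for one container and one item, then adds a more 
specific rdf:type to the item. The function names and the example.org 
URIs are assumptions made for this sketch; only the SIOC terms 
themselves (sioc:Container, sioc:Item, sioc:container_of, 
sioc:has_container, sioc_types:Comment) come from the vocabulary.

```python
SIOC = "http://rdfs.org/sioc/ns#"
SIOC_TYPES = "http://rdfs.org/sioc/types#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def describe_item(container_uri, item_uri, extra_types=()):
    """Build triples linking an item to its container, plus extra rdf:types."""
    triples = [
        (container_uri, RDF_TYPE, SIOC + "Container"),
        (container_uri, SIOC + "container_of", item_uri),
        (item_uri, SIOC + "has_container", container_uri),
        (item_uri, RDF_TYPE, SIOC + "Item"),
    ]
    # Any more specific class from another ontology is just one more rdf:type.
    for type_uri in extra_types:
        triples.append((item_uri, RDF_TYPE, type_uri))
    return triples


def to_ntriples(triples):
    """Serialize URI-only triples as N-Triples, one statement per line."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in triples)


if __name__ == "__main__":
    # Hypothetical URIs, for illustration only.
    print(to_ntriples(describe_item(
        "http://example.org/forum/1",
        "http://example.org/forum/1/comment/7",
        extra_types=[SIOC_TYPES + "Comment"],
    )))
```

The item stays a sioc:Item throughout; the extra rdf:type only narrows 
what kind of item it is, so consumers that know nothing beyond SIOC can 
still follow the container structure.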

I ultimately want to make discourse graphs across Web data spaces much 
easier to exploit and discover. We are getting closer by the second.


1. http://oat.openlinksw.com
(* see RDFizer section *)
3. http://tinyurl.com/6gzelr  (* this presentation is RDFa based so 
viewing it via ODE is a nice RDFa utility demo*)
> -- manu


Kingsley Idehen	      Weblog: http://www.openlinksw.com/blog/~kidehen
President & CEO 
OpenLink Software     Web: http://www.openlinksw.com
Received on Tuesday, 30 September 2008 21:15:42 UTC
