- From: Kingsley Idehen <kidehen@openlinksw.com>
- Date: Tue, 30 Sep 2008 17:14:59 -0400
- To: Manu Sporny <msporny@digitalbazaar.com>
- CC: RDFa Community <public-rdfa@w3.org>
Manu Sporny wrote:
> Kingsley Idehen wrote:
>
>> Manu,
>>
>> Here is an example of how we RDFize Digg:
>> http://tinyurl.com/3jtysp
>>
>
> Neat - first time I've used OpenLink Data Explorer. :)
>
> I found the "What" section most useful and the "When" and "SVG Graph"
> sections neat. I think there is a great amount of untapped usefulness in
> the "When" timeline view for news and comments.
>
Yes, lots :-) Remember, these are all OAT components [1] and part of the OAT Open Source project.

> The data was interesting to look at as well, it has a couple of things
> about Digg that I didn't think of marking up - namely:
>
> sioc:has_reply (for posts)
> sioc:container_of (which could be reasoned given the rest of the
>     information on the page, but is good that it's marked
>     up - no need to make the reasoning agents guess.)
> sioc_types:Comment (instead of sioc:Post - it's more accurate)
> sioc:topic (is this directly related to sioc:Forum?)
>
> Is this stuff extracted automatically, Kingsley?

Yes, what we call "RDF-ization" on the fly. Note how we also use proxies to fashion dereferenceable URIs for the entities gleaned from these data spaces.

> Is there a
> Digg-specific data crawler?

Yes, we have a Digg Cartridge among our growing collection of Cartridges [2].

> Curious as to how the crawler determines the
> data as it's quite accurate.
>
It's a long story, but in short: we see information resources as data containers, like DBMS engines, and the associated web services as the data container's call-level interface (which exposes the container's data model). We've always developed data access drivers (ODBC, JDBC, OLE-DB, ADO.NET, XMLA) for major DBMS engines, and we see RDFization as just another aspect of the same thing; I covered some of this in my Linked Data Planet presentation [3].
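
To make the driver analogy concrete, here is a minimal Python sketch of what mapping one story record to SIOC terms could look like. It is an illustration only and not the actual Digg Cartridge: the function names, the input record layout, and the proxy base URI are all hypothetical. Only the SIOC terms themselves (sioc:container_of, sioc:has_reply, sioc:topic, sioc_types:Comment) come from the discussion above.

```python
# Hypothetical sketch: the record layout, URI patterns, and function names
# are assumptions made for illustration, not OpenLink's implementation.
SIOC = "http://rdfs.org/sioc/ns#"
SIOC_TYPES = "http://rdfs.org/sioc/types#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def rdfize_story(story, base):
    """Map one story record (a plain dict) to a list of (s, p, o) URI triples.

    `base` stands in for a proxy URI prefix used to mint dereferenceable
    identifiers for the extracted entities.
    """
    forum = f"{base}/forum/{story['topic']}"
    post = f"{base}/story/{story['id']}"
    triples = [
        (forum, RDF_TYPE, SIOC + "Forum"),
        (post, RDF_TYPE, SIOC + "Post"),
        # State containment explicitly so consumers need not infer it.
        (forum, SIOC + "container_of", post),
        (post, SIOC + "topic", f"{base}/topic/{story['topic']}"),
    ]
    for comment_id in story["comments"]:
        comment = f"{base}/comment/{comment_id}"
        # Comments get the more specific sioc_types:Comment class.
        triples.append((comment, RDF_TYPE, SIOC_TYPES + "Comment"))
        triples.append((post, SIOC + "has_reply", comment))
    return triples


def to_ntriples(triples):
    """Serialize URI-only triples as N-Triples lines."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in triples)


if __name__ == "__main__":
    record = {"id": "42", "topic": "tech", "comments": ["c1", "c2"]}
    print(to_ntriples(rdfize_story(record, "http://example.org/proxy")))
```

The point of the sketch is the shape of the mapping: the web service plays the role of the call-level interface, and the mapping function plays the role of the driver that exposes the container's data model as RDF.
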

>> I'll have our omissions fixed so that we have a complete RDF based graph
>> of Digg which should ultimately aid RDFa renditions of the original
>> (X)HTML resources from Digg.
>>
>
> Hopefully you won't have to update your omissions if Digg starts
> publishing more SIOC RDFa :)
>
Sure.

> Is there some other vocabulary that you have seen apply to other sites
> that you think Digg might also use?
>
Also note, I made and contributed specific extensions to SIOC itself to enable coherent output from this kind of crawling, extraction, and mapping via spaces, containers, and items (and specific spaces covering discussions, bookmarks, photo galleries, etc.). Thus, SIOC covers all that is required, and where additional specificity is available in some ontology, simply use "rdf:type" to set the types of the sioc:Items for the relevant sioc:Container.

I ultimately want to make discourse graphs across Web data spaces much easier to exploit and discover. We are getting closer by the second.

Links:

1. http://oat.openlinksw.com
2. http://esw.w3.org/topic/TaskForces/CommunityProjects/LinkingOpenData/DataSets (* see RDFizer section *)
3. http://tinyurl.com/6gzelr (* this presentation is RDFa based, so viewing it via ODE is a nice RDFa utility demo *)

> -- manu
>

--

Regards,

Kingsley Idehen       Weblog: http://www.openlinksw.com/blog/~kidehen
President & CEO
OpenLink Software     Web: http://www.openlinksw.com
Received on Tuesday, 30 September 2008 21:15:42 UTC