Re: RSS data transience and the semantic web

From: Danny Ayers <danny666@virgilio.it>
Date: Thu, 26 Aug 2004 12:43:34 +0200
Message-ID: <412DBED6.1020807@virgilio.it>
To: Dan Zambonini <dan.zambonini@boxuk.com>
CC: "DuCharme, Bob (LNG-CHO)" <bob.ducharme@lexisnexis.com>, www-rdf-interest@w3.org

>>From: www-rdf-interest-request@w3.org [mailto:www-rdf-interest-request@w3.org] On Behalf Of DuCharme, Bob (LNG-CHO)

>>What role can RSS 1.0 play in the semantic web considering the transience of the data? 
>>Most data in RSS files today won't be there a month from now, as the files get updated until today's items fall off the list. Can any connections that would be useful for a web of information get built from such data? 


I do think CMS/blogging tool vendors should be encouraged to retain 
machine-readable versions of their output, even if just the metadata 
with pointers to content, along the lines suggested by the old name 'RDF 
Site Summary'. Hopefully some of the bigger search setups (Technorati 
etc) will be aggregating stuff for posterity already.

But you do raise a broader question regarding the transience of Semantic 
Web data. Some of the FOAF tinkerers have been talking about "MeNow" 
documents [1], rapidly changing RDF data that contains presence-like 
information - e.g. what music someone is listening to at this point in 
time. Put this alongside the recurring suggestion of RDF over Jabber 
(new approach maybe, merged into Atom? [2]) and I think you're looking 
at data for which persistence would be well down the list of priorities. 
(I don't know of any good reasons not to preserve everything possible, 
aside from effort and cost ;-)

If I understand my light reading correctly, the current SW languages 
and architecture aren't well tuned for temporal reasoning (though I seem 
to remember seeing some work on this around description logics, DLs). 
Long term, I don't know whether more logical infrastructure is needed 
underneath, or whether everything we are likely to need in the 
foreseeable future can be covered using the current layers ("Climb every 
mountain, datestamp every statement..."). Interesting stuff, whatever.
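
To make the "datestamp every statement" idea a little more concrete, 
here is a rough Python sketch, not tied to any real RDF toolkit. Each 
statement carries the time it was observed, and a lookup returns the 
latest value known at a given moment. The class name, the as_of() 
helper, and the example.org URIs are all made up for illustration; they 
are not MeNow or FOAF terms.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

# Hypothetical identifiers, for illustration only.
ME = "http://example.org/people#someone"
LISTENING = "http://example.org/presence#listeningTo"


@dataclass(frozen=True)
class StampedStatement:
    """A triple plus the time at which it was observed."""
    subject: str
    predicate: str
    obj: str
    seen: datetime


def as_of(statements, when):
    """Return the latest object per (subject, predicate) seen at or before `when`."""
    latest = {}
    # Walk the statements oldest first so later observations overwrite earlier ones.
    for st in sorted(statements, key=lambda s: s.seen):
        if st.seen <= when:
            latest[(st.subject, st.predicate)] = st.obj
    return latest


def at_hour(hour):
    """Helper: a UTC timestamp on the day of this thread."""
    return datetime(2004, 8, 26, hour, tzinfo=timezone.utc)


log = [
    StampedStatement(ME, LISTENING, "Track A", at_hour(9)),
    StampedStatement(ME, LISTENING, "Track B", at_hour(11)),
]

snapshot = as_of(log, at_hour(10))
print(snapshot[(ME, LISTENING)])
```

Nothing is ever deleted here; old statements stay in the log, and 
"now" is just one more point in time to query. Whether that scales, or 
says anything useful about proper temporal reasoning, is a separate 
question.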

Cheers,
Danny.

[1] http://schema.peoplesdns.com/menow/
[2] http://www.ietf.org/internet-drafts/draft-saintandre-atompub-notify-00.txt

Dan Zambonini wrote:

>Very interesting question...
>
>From my perspective, it's not the transience of the RSS data that reduces its usefulness (for the semantic web), but the lack of URIs.  The exciting part of the semantic 'web' (for me!) is the inter-relationships of data/values - whereas most RSS data contains largely string literals for the title/description, and no statements that contain purely URIs for all three parts of the statement (unless extending RSS with, say, Dublin Core).  This doesn't allow for much of a 'web' of data to be constructed (although RSS does demonstrate the usefulness of web metadata - maybe we can use it as a stepping stone to getting people to create richer metadata?).
>
>Just my 2p.
>
>--------------------------------------
>Dan Zambonini
>Box UK
>Internet Development and Consultancy
>
>t: +44 (0)29 2022 8822
>f: +44 (0)29 2022 8820
>e: dan.zambonini@boxuk.com
>w: www.boxuk.com
>--------------------------------------
>-----Original Message-----
>From: www-rdf-interest-request@w3.org [mailto:www-rdf-interest-request@w3.org] On Behalf Of DuCharme, Bob (LNG-CHO)
>Sent: 25 August 2004 19:21
>To: 'www-rdf-interest@w3.org'
>Subject: RSS data transience and the semantic web
>
>What role can RSS 1.0 play in the semantic web considering the transience of the data? Most data in RSS files today won't be there a month from now, as the files get updated until today's items fall off the list. Can any connections that would be useful for a web of information get built from such data? 
> 
>It would seem sensible for sites to offer archives of their own RSS feeds, but I don't know of any that do. (I tried searching for a few at archive.org, and it never had more than 6 snapshots per year for any feed. A surprising number of the ones I tried couldn't be stored there because of a robots.txt exclusion.) Is the application developer expected to archive all data harvested from crawls? Does anyone know of any applications that are doing this with RSS data? 
> 
>Or do we just consider RSS 1.0 to be an RDF application that's independent of the semantic web? 
> 
>just curious,
> 
>Bob DuCharme   www.snee.com/bob   <bob@snee.com>
>weblog on linking-related topics: 
>http://www.oreillynet.com/pub/au/1191
>
>
>  
>


-- 

Raw
http://dannyayers.com
Received on Thursday, 26 August 2004 10:47:59 UTC