Re: RSS data transience and the semantic web

Sorry to come late to the party; I've been away.

As I recall, the PulpFiction newsreader (Mac) stores every retrieved 
news item, just like an email client; this is actually quite intuitive, 
as one doesn't expect an older item to disappear just because there's 
more recent news. I'm sure the RSS->mail converters do the same. In 
principle, this data could be made available through an RDF query 
interface.
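
To make the "keep everything" idea concrete, here is a minimal sketch
in Python of an archive that never forgets an item. It is not how
PulpFiction or any RSS->mail converter actually works; the table
layout and the names open_store and archive_feed are made up for
illustration, and it assumes each RSS 1.0 item carries an rdf:about
URI that can serve as its key.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Namespaces used by RSS 1.0 documents.
NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rss": "http://purl.org/rss/1.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}
RDF_ABOUT = "{http://www.w3.org/1999/02/22-rdf-syntax-ns#}about"


def open_store(path=":memory:"):
    """Open (or create) the archive; the schema is a hypothetical one."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS items ("
        " uri TEXT PRIMARY KEY, title TEXT, link TEXT, date TEXT)"
    )
    return db


def archive_feed(db, rss_xml):
    """Store every item of an RSS 1.0 document; return how many were new."""
    root = ET.fromstring(rss_xml)
    before = db.total_changes
    for item in root.findall("rss:item", NS):
        uri = item.get(RDF_ABOUT)
        if uri is None:
            continue  # no stable identifier, so nothing to key on
        row = (
            uri,
            item.findtext("rss:title", default="", namespaces=NS),
            item.findtext("rss:link", default="", namespaces=NS),
            item.findtext("dc:date", default="", namespaces=NS),
        )
        # Items already seen are left alone; nothing is ever deleted,
        # so an item survives after it falls off the feed.
        db.execute("INSERT OR IGNORE INTO items VALUES (?, ?, ?, ?)", row)
    db.commit()
    return db.total_changes - before
```

Calling archive_feed on each fetch of the same feed accumulates
items, so the store keeps growing while the feed itself only ever
shows the latest few.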

Non-RSS feeds could be downloaded, parsed, and saved into the same 
store, then exposed as RDF (obviously using a more restricted 
vocabulary than RSS 1.0).

In a more careful world, authors would expose an RSS feed of all their 
posts for posterity (some already do). However, I would prefer a more 
intelligent approach to news serialisation, where the client could 
query with a date range (or a proper query) and get the right results. 
In an all-RDF system, this could be OWL-QL, RDQL, or some other RDF 
query language.
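
As a rough illustration of the date-range idea, here is a minimal
Python sketch. It uses plain SQL over a table of triples as a
stand-in for a real RDF query language such as RDQL, so the query
syntax is not what an RDF store would accept. The names make_store
and items_between are hypothetical, and it assumes every dc:date is
an ISO 8601 string in the same format, so that string comparison
matches date order.

```python
import sqlite3

# Property URIs from the RSS 1.0 and Dublin Core vocabularies.
DC_DATE = "http://purl.org/dc/elements/1.1/date"
RSS_TITLE = "http://purl.org/rss/1.0/title"


def make_store(triples):
    """Load (subject, predicate, object) triples into an in-memory table."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE triples (s TEXT, p TEXT, o TEXT)")
    db.executemany("INSERT INTO triples VALUES (?, ?, ?)", triples)
    return db


def items_between(db, start, end):
    """Return (uri, title, date) for items with start <= dc:date < end."""
    return db.execute(
        "SELECT d.s, t.o, d.o FROM triples d"
        " JOIN triples t ON t.s = d.s AND t.p = ?"
        " WHERE d.p = ? AND d.o >= ? AND d.o < ?"
        " ORDER BY d.o",
        (RSS_TITLE, DC_DATE, start, end),
    ).fetchall()
```

A client asking for "everything from August 2004" would then call
items_between(db, "2004-08-01", "2004-09-01") rather than fetching a
whole feed and hoping the items are still on it.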

None of these approaches solves the problem of poor description: 
inaccurate titles, bad markup, missing author or category metadata, 
and so on.

Now I ought to practise what I preach :)

-R

PS. I'm noodling away on a persistent, system-wide RDF store (based on 
Redland currently). I'm thinking that a persistent newsreader would be 
an interesting application.

> What role can RSS 1.0 play in the semantic web considering the 
> transience of the data? Most data in RSS files today won't be there a 
> month from now, as the files get updated until today's items fall off 
> the list. Can any connections that would be useful for a web of 
> information get built from such data?

Received on Monday, 30 August 2004 16:52:37 UTC