Re: Is there real world RDF-S/OWL instance data?

Harry Halpin wrote:
> This is absolutely the type of empirical data the SemWeb needs more of!
>   However - quick question - does this RDF data include RSS 1.0? Could you
> refactor the graphs without RSS 1.0??
>   Why exclude RSS 1.0? The fact that a triple is an "item" or a "title" to
> me doesn't really count as "instance" data, as these could easily have been
> phrased in a non-RDF XML notation. To me interesting RDF vocabularies
> are ones that do identify things like "places" and "people" that XML
> doesn't do at all. While interesting graph merging can be done over RSS
> 1.0, I do think this sort of thing is done almost as well (minus
> merging) by XML, and *lots* of auto-generated RSS 1.0 data
> seems to vastly skew almost all statistics on the Semantic Web.
> Another question - how much RDF data on the Web is RSS 1.0?
> How much is FOAF?

Good questions.  We're not able to easily recognize an RSS
document from the metadata we've put in our database.  I can
tell you that 318,908 documents in our collection use the
RSS namespace, but some of these may be richer documents.
Some RSS 1.0 documents are richer than others and use dc:
and other vocabularies to encode more information.  Also,
some documents may not be intended for syndication but will
still use some RSS 1.0 vocabulary.

But, it's likely that the vast majority of these 319K
documents are indeed RSS syndication documents and are also
100% data.

What we've been wanting to do is to work up a classifier
that can do a good job of recognizing a 'simple' RSS
document and a 'simple' FOAF document.  We'd include these
features along with the other metadata in our database.
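
To make the idea concrete, here is a rough sketch of one way such a
classifier might start out.  It is only an illustration, not something
we have built: the function name, the labels and the 0.9 threshold are
all assumptions, and it takes as input nothing more than the list of
predicate URIs found in a document.

```python
# Hypothetical sketch: label a document by the share of its triples whose
# predicate falls in the RSS 1.0 or FOAF namespace.
# The labels and the default threshold are illustrative assumptions.
RSS_NS = "http://purl.org/rss/1.0/"
FOAF_NS = "http://xmlns.com/foaf/0.1/"
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def classify(predicates, threshold=0.9):
    """Return 'simple-rss', 'simple-foaf' or 'other' for a list of predicate URIs."""
    # rdf: terms (rdf:type, rdf:_1, ...) show up in any RDF document,
    # so they are left out of the count.
    counted = [p for p in predicates if not p.startswith(RDF_NS)]
    if not counted:
        return "other"
    rss_share = sum(p.startswith(RSS_NS) for p in counted) / len(counted)
    foaf_share = sum(p.startswith(FOAF_NS) for p in counted) / len(counted)
    if rss_share >= threshold:
        return "simple-rss"
    if foaf_share >= threshold:
        return "simple-foaf"
    return "other"
```

A feed that mixes in a good deal of dc: or other vocabulary would fall
below the threshold and come out as 'other', which is roughly the
distinction between 'simple' and 'richer' documents described above.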

Tim

-- 
  Tim Finin, Computer Science & Electrical Engineering, Univ of Maryland
  Baltimore County, 1000 Hilltop Cir, Baltimore MD 21250. finin@umbc.edu
  http://ebiquity.umbc.edu 410-455-3522 fax:-3969 http://umbc.edu/~finin

Received on Saturday, 12 August 2006 20:11:18 UTC