- From: Harry Halpin <hhalpin@ibiblio.org>
- Date: Sat, 12 Aug 2006 13:33:45 +0100
- To: Tim Finin <finin@cs.umbc.edu>, 'Semantic Web' <semantic-web@w3.org>
This is absolutely the type of empirical data the SemWeb needs more of!

However - quick question - does this RDF data include RSS 1.0? Could you refactor the graphs without RSS 1.0?

Why exclude RSS 1.0? The fact that a triple is an "item" or a "title" to me doesn't really count as "instance" data, as these could easily have been phrased in a non-RDF XML notation. To me, interesting RDF vocabularies are ones that do identify things like "places" and "people", which XML doesn't do at all. While interesting graph merging can be done over RSS 1.0, I do think this sort of thing can be done almost as well (minus merging) by XML, and *lots* of auto-generated RSS 1.0 data seems to vastly skew almost all statistics on the Semantic Web.

Another question - how much RDF data on the Web is RSS 1.0? How much is FOAF?

cheers,
harry

Tim Finin wrote:
>
> Tim Finin wrote:
> > Swoogle [1] has a collection of over 1M error-free RDF
> > documents collected from the Web and an additional ~700K
> > ...
> > Only about 5% of these documents contain *any* triples that
> > contribute to a definition. The rest consist of all data.
> > ...
>
> I posted some fresh data extracted from Swoogle on our research blog:
>
> http://ebiquity.umbc.edu/blogger/2006/08/12/is-there-real-world-rdf-sowl-instance-data/
>
> The data show the number of Semantic Web documents broken down by
> schema vs. data and the percentage of classes and properties that
> have been used to encode data. Tim
>

--
-harry

Harry Halpin, University of Edinburgh
http://www.ibiblio.org/hhalpin 6B522426
Received on Saturday, 12 August 2006 17:33:49 UTC