Re: Is there real world RDF-S/OWL instance data?

From: Tim Finin <finin@cs.umbc.edu>
Date: Fri, 11 Aug 2006 18:31:27 -0400
Message-ID: <44DD053F.4040303@cs.umbc.edu>
To: semantic-web@w3.org
CC: Li Ding <dingli1@gl.umbc.edu>

Swoogle [1] has a collection of over 1M error-free RDF
documents collected from the Web and an additional ~700K
documents that have embedded RDF, are malformed but appear
to be RDF, or are no longer accessible.  We've intentionally
limited the number of simple RSS and FOAF documents in the
current collection.

Only about 5% of these documents contain *any* triples that
contribute to a definition.  The rest consist entirely of
data.  We've determined that most of the 5% that contain
definitional triples do so incorrectly and should consist
only of data.  Of the remaining ones, many are duplicates or
copies.  We estimate that only about 1% of Swoogle's
collection consists of proper 'ontologies' that are intended
to (partially) define at least one named term.
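
To make "triples that contribute to a definition" concrete, here
is a rough sketch in Python.  It is not Swoogle's actual test:
the sets DEFINING_TYPES and DEFINING_PREDICATES and the function
names are assumptions chosen for illustration, and triples are
modeled as plain (subject, predicate, object) string tuples.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
OWL = "http://www.w3.org/2002/07/owl#"

# Assumption: asserting one of these types defines a class or property.
DEFINING_TYPES = {
    RDFS + "Class",
    OWL + "Class",
    RDF + "Property",
    OWL + "ObjectProperty",
    OWL + "DatatypeProperty",
}

# Assumption: these predicates contribute to a term's definition.
DEFINING_PREDICATES = {
    RDFS + "subClassOf",
    RDFS + "subPropertyOf",
    RDFS + "domain",
    RDFS + "range",
}


def is_definitional(triple):
    """Return True if the triple looks like part of a term definition."""
    _subject, predicate, obj = triple
    if predicate == RDF + "type":
        return obj in DEFINING_TYPES
    return predicate in DEFINING_PREDICATES


def definitional_share(triples):
    """Fraction of triples in a document that look definitional."""
    triples = list(triples)
    if not triples:
        return 0.0
    return sum(is_definitional(t) for t in triples) / len(triples)


# A FOAF-style document: instance data only.
data_doc = [
    ("http://example.org/alice", RDF + "type", "http://xmlns.com/foaf/0.1/Person"),
    ("http://example.org/alice", "http://xmlns.com/foaf/0.1/name", "Alice"),
]

# A small schema document that also carries one instance.
schema_doc = [
    ("http://example.org/Dog", RDF + "type", RDFS + "Class"),
    ("http://example.org/Dog", RDFS + "subClassOf", "http://example.org/Animal"),
    ("http://example.org/rex", RDF + "type", "http://example.org/Dog"),
]

print(definitional_share(data_doc))
print(definitional_share(schema_doc))
```

Under these assumptions a document with a share of zero would
count as all data, and anything above zero would fall into the
roughly 5% that contain at least one definitional triple.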

That said, the vast majority of defined classes have *no*
immediate instances and the vast majority of properties have
*never* been used to assert a value.  Most defined RDF terms
have not been used.
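
One way to check this kind of claim against a set of triples is
sketched below in Python.  Again this is only an illustration
under stated assumptions, not how Swoogle computes its
statistics: CLASS_TYPES, PROPERTY_TYPES and unused_terms are
hypothetical names, and only direct rdf:type assertions count as
"immediate instances" (no subclass inference).

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
OWL = "http://www.w3.org/2002/07/owl#"
RDF_TYPE = RDF + "type"

# Assumption: a term is "defined" when it is typed as one of these.
CLASS_TYPES = {RDFS + "Class", OWL + "Class"}
PROPERTY_TYPES = {
    RDF + "Property",
    OWL + "ObjectProperty",
    OWL + "DatatypeProperty",
}


def unused_terms(triples):
    """Return (classes with no direct instances, properties never used).

    `triples` is an iterable of (subject, predicate, object) string tuples.
    """
    classes, properties = set(), set()
    instantiated, used_predicates = set(), set()
    for subject, predicate, obj in triples:
        # Every predicate that appears has been used to assert a value.
        used_predicates.add(predicate)
        if predicate == RDF_TYPE:
            # The object of rdf:type has the subject as a direct instance.
            instantiated.add(obj)
            if obj in CLASS_TYPES:
                classes.add(subject)
            elif obj in PROPERTY_TYPES:
                properties.add(subject)
    return classes - instantiated, properties - used_predicates


EX = "http://example.org/"
sample = [
    (EX + "Dog", RDF_TYPE, RDFS + "Class"),
    (EX + "Cat", RDF_TYPE, RDFS + "Class"),
    (EX + "name", RDF_TYPE, RDF + "Property"),
    (EX + "age", RDF_TYPE, RDF + "Property"),
    (EX + "rex", RDF_TYPE, EX + "Dog"),
    (EX + "rex", EX + "name", "Rex"),
]

unused_classes, unused_properties = unused_terms(sample)
print(sorted(unused_classes))
print(sorted(unused_properties))
```

In the sample, ex:Cat is defined but has no direct instance and
ex:age is defined but never used as a predicate, so both are
reported as unused.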

[1] http://swoogle.umbc.edu/

  Tim Finin, Computer Science & Electrical Engineering, Univ of Maryland
  Baltimore County, 1000 Hilltop Cir, Baltimore MD 21250. finin@umbc.edu
  http://umbc.edu/~finin 410-455-3522 fax:-3969 http://ebiquity.umbc.edu
Received on Friday, 11 August 2006 22:29:50 UTC
