- From: Dan Brickley <danbri@w3.org>
- Date: Mon, 19 Aug 2002 09:23:02 -0400 (EDT)
- To: Andreas Eberhart <andreas.eberhart@i-u.de>
- cc: Danny Ayers <danny666@virgilio.it>, <www-rdf-interest@w3.org>
On Mon, 19 Aug 2002, Andreas Eberhart wrote:

> Hi Danny,
>
> > The paper (2.4) states that "RDF subjects, predicates and most objects are
> > URLs themselves..." - errm, not!
> > However it's interesting that you did get a good number of links
> > using this assumption.
>
> oops, you're right. It basically was a (desperate) attempt to find more RDF.

Have you looked at having your crawler traverse rdfs:seeAlso references?

Hunting for RDF in the general Web is like looking for the proverbial
'needle in a haystack'. If you start in a Web of interconnected RDF
documents, a large part of the discovery problem vanishes. That's the
approach I've been taking anyhow.

See for example, http://rdfweb.org/people/danbri/rdfweb/danbri-foaf.rdf
which contains markup such as...

[[
...
<knows>
  <Person foaf:name="Edd Dumbill" foaf:nick="edd">
    <rdfs:seeAlso web:resource="http://heddley.com/edd/foaf.rdf" />
    <foaf:mbox web:resource="mailto:edd@usefulinc.com" />
    <foaf:mbox web:resource="mailto:edd@xml.com" />
    <foaf:mbox web:resource="mailto:edd@xmlhack.com" />
    <foaf:homepage web:resource="http://heddley.com/edd/" />
  </Person>
</knows>
]]

...and if you dereference http://heddley.com/edd/foaf.rdf you'll similarly
find more cross-references to other RDF documents in the Web (hence the
name, "RDFWeb").

There are currently only a few hundred such documents in the RDFWeb/FOAF
testbed, but it's been enough to convince me that the general approach has
merit, and that a Web of small, independently maintained RDF documents can
be more useful than having a few huge KBs dumped out in RDF/XML format.

I really don't think finding RDF will be a problem. The trick will be in
finding the _relevant_ RDF. For this, I think we need a combination of
using rdfs:seeAlso, of mentioning the type of the thing described in the
referenced document (eg. Person, Company etc), and of mentioning a type of
the referenced document (eg. CV, Bibliography).
Such hints make it easier to build smarter crawlers that won't be
overwhelmed by the -- as yet nonexistent ;) -- mass of RDF data out there.

Dan
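
The rdfs:seeAlso traversal described above can be sketched roughly as
follows. This is a minimal illustration under stated assumptions, not the
RDFWeb crawler itself: the names see_also_links, http_fetch and crawl are
hypothetical, the documents are assumed to be RDF/XML that writes
rdfs:seeAlso as an empty element with an rdf:resource attribute (whatever
prefix is bound to the RDF namespace, such as "web" in the markup quoted
earlier), and relative URIs are not resolved.

```python
import xml.etree.ElementTree as ET
from collections import deque
from urllib.request import urlopen

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS_NS = "http://www.w3.org/2000/01/rdf-schema#"


def see_also_links(rdf_xml):
    """Return the rdf:resource target of every rdfs:seeAlso element.

    Matching is done on namespace URIs, so the prefixes used in the
    document do not matter.
    """
    root = ET.fromstring(rdf_xml)
    links = []
    for elem in root.iter("{%s}seeAlso" % RDFS_NS):
        target = elem.get("{%s}resource" % RDF_NS)
        if target:
            links.append(target)
    return links


def http_fetch(url):
    """Fetch a document over HTTP (hypothetical default fetcher)."""
    with urlopen(url, timeout=10) as resp:
        return resp.read()


def crawl(seeds, fetch=http_fetch, max_docs=100):
    """Breadth-first walk over rdfs:seeAlso links, starting from seeds.

    Returns the URLs that were fetched and parsed successfully, in
    visiting order. Documents that fail to load or parse are skipped.
    """
    seen = set(seeds)
    queue = deque(seeds)
    visited = []
    while queue and len(visited) < max_docs:
        url = queue.popleft()
        try:
            links = see_also_links(fetch(url))
        except Exception:
            continue  # unreachable or not well-formed XML: skip it
        visited.append(url)
        for link in links:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return visited
```

Passing fetch as a parameter keeps the traversal separate from the
network, so it can be tried against a handful of in-memory documents
first. The type hints mentioned above (Person, CV and so on) would slot in
as a filter on which links get queued.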
Received on Monday, 19 August 2002 09:23:06 UTC