- From: Richard Cyganiak <richard@cyganiak.de>
- Date: Mon, 4 Dec 2006 00:23:11 +0100
- To: Marc <marc@geonames.org>
- Cc: Chris Bizer <chris@bizer.de>, Semantic Web <semantic-web@w3.org>
Marc,

On 3 Dec 2006, at 23:04, Marc wrote:
> We are already working on linking wikipedia articles with geonames
> places. See also this thread in October :
> http://lists.w3.org/Archives/Public/semantic-web/2006Oct/0148.html
>
> Now that you are asking for it, I have released today a first
> version, which includes the following wikipedia information about
> Embrun :
>
> <wikipediaArticle>http://fr.wikipedia.org/wiki/Embrun_%28Hautes-Alpes%29</wikipediaArticle>
> <wikipediaArticle>http://pl.wikipedia.org/wiki/Embrun</wikipediaArticle>
> <wikipediaArticle>http://de.wikipedia.org/wiki/Embrun</wikipediaArticle>
> <wikipediaArticle>http://en.wikipedia.org/wiki/Embrun%2C_Hautes-Alpes</wikipediaArticle>
> <wikipediaArticle>http://it.wikipedia.org/wiki/Embrun</wikipediaArticle>
> <wikipediaArticle>http://nl.wikipedia.org/wiki/Embrun</wikipediaArticle>
>
> Around 100,000 geonames place names now have wikipedia links.

Very cool. I wonder how you link the articles? Can't be simple word matching, no?

One detail: You should use the URI syntax, not the literal syntax. That is, instead of this:

<wikipediaArticle>http://nl.wikipedia.org/wiki/Embrun</wikipediaArticle>

it should look like this:

<wikipediaArticle rdf:resource="http://nl.wikipedia.org/wiki/Embrun"/>

Just like the other links that are already in the RDF data, e.g. nearbyFeatures.

Best,
Richard

>> I once read about some pretty sophisticated screen-scraping
>> frameworks
>
> As far as I know crawling and screen scraping wikipedia is not
> considered fair use. Wikimedia software is rather resource
> intensive and the preferred way is to use the xml download files :
> http://download.wikipedia.org/
>
> Cheers
>
> Marc
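The literal-versus-URI point above can be illustrated with a small sketch. It uses only Python's standard `xml.etree.ElementTree` to rewrite `wikipediaArticle` elements whose URL is given as text content into the `rdf:resource` form. The namespace `GN_NS`, the `gn:Feature` wrapper element, the `rdf:about` value, and the helper name `literal_links_to_resources` are assumptions made for illustration; they are not taken from the actual geonames RDF output.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
# Hypothetical namespace for the example; the real geonames namespace may differ.
GN_NS = "http://example.org/geonames-ontology#"


def literal_links_to_resources(xml_text, tag="wikipediaArticle"):
    """Turn <tag>URL</tag> (a literal) into <tag rdf:resource="URL"/> (a URI reference)."""
    root = ET.fromstring(xml_text)
    resource_attr = f"{{{RDF_NS}}}resource"
    for elem in root.iter(f"{{{GN_NS}}}{tag}"):
        uri = (elem.text or "").strip()
        # Only touch elements that carry the URL as text and have no rdf:resource yet.
        if uri and resource_attr not in elem.attrib:
            elem.text = None
            elem.set(resource_attr, uri)
    return root


# Assumed sample document: one place with one literal-style link.
sample = f"""<rdf:RDF xmlns:rdf="{RDF_NS}" xmlns:gn="{GN_NS}">
  <gn:Feature rdf:about="http://example.org/place/embrun">
    <gn:wikipediaArticle>http://nl.wikipedia.org/wiki/Embrun</gn:wikipediaArticle>
  </gn:Feature>
</rdf:RDF>"""

root = literal_links_to_resources(sample)
links = [
    e.get(f"{{{RDF_NS}}}resource")
    for e in root.iter(f"{{{GN_NS}}}wikipediaArticle")
]
print(links)

# Serialize with readable prefixes instead of auto-generated ones.
ET.register_namespace("rdf", RDF_NS)
ET.register_namespace("gn", GN_NS)
print(ET.tostring(root, encoding="unicode"))
```

The distinction matters because an RDF parser reads the text-content form as a plain string, whereas the `rdf:resource` form makes the object a resource that other data can link to.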
Received on Sunday, 3 December 2006 23:23:20 UTC