- From: Gregg Kellogg <gregg@kellogg-assoc.com>
- Date: Mon, 28 Feb 2011 12:14:43 -0500
- To: Alex Kremer <alex@entitylab.com>
- CC: "public-rdf-ruby@w3.org" <public-rdf-ruby@w3.org>
Looking at the source, the page is returned as text/html, so it is parsed with the RDF::RDFa parser. The source is, indeed, in RDFa 1.0 format. In this format, a literal's language comes from the xml:lang or lang attribute on the element containing the literal, or on any element in its ancestry. In this case, the html element contains xml:lang="en". That's why the literal has a language tag of :en.

It seems that DBPedia, in this case anyway, isn't properly setting xml:lang attributes on nodes when publishing the RDFa content, so every literal on the page inherits the page-level language. If you get the RDF/XML version of the page (http://dbpedia.org/data/Vienna), they do set language tags properly, so you will get the proper language tag assigned to the literal.

It would certainly be a good idea to file this as a bug at DBPedia. In the meantime, best make use of the RDF/XML feed.

Gregg

On Feb 27, 2011, at 8:11 AM, Alex Kremer wrote:

> Hi,
>
> Apologies if the following seems very elementary, but here goes:
>
> I'm trying to retrieve an abstract from a DBPedia page in English. The problem is it seems like rdf.rb thinks every result it gets back is English, even results in foreign languages:
>
> graph = RDF::Graph.load("http://dbpedia.org/page/Vienna")
> dbp = RDF::Vocabulary.new("http://dbpedia.org/ontology/")
>
> query = RDF::Query.new(:article => {dbp.abstract => :abstract})
> => #<RDF::Query:0x1094812c8 @solutions=[], @options={}, @variables={}, @patterns=[#<RDF::Query::Pattern:0x84a40770(?article <http://dbpedia.org/ontology/abstract> ?abstract .)>]>
> a = query.execute(graph)
>
> a.first
> <RDF::Query::Solution:0x84ba74d8({:abstract=>#<RDF::Literal:0x812bc7b8("Wien ist die Bundeshauptstadt der Republik \u00D6sterreich und zugleich eines der neun \u00F6sterreichischen (...shortened for brevity...) gefolgt von Z\u00FCrich und Genf an zweiter und dritter Stelle."@en)>, :article=>#<RDF::URI:0x81724044(http://dbpedia.org/resource/Vienna)>})>
>
> As you can see, rdf.rb seems to think the language for the first abstract is English, when in fact it's German. If I query DBPedia via their SPARQL endpoint I do get correct results, so I am sure their data isn't the problem here. I tried to filter the solutions by language per http://rdf.rubyforge.org/RDF/Query/Solutions.html but since they're all tagged with @en, they all come back when I ask for English.
>
> Does anyone have any idea what could be causing this or how to solve it? Am I querying wrong? If so, how would I structure the query to get the proper language result?
>
> Thanks in advance!
>
> -Alex
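
The language inheritance rule described in the reply can be sketched in plain Ruby. This is only a toy model under stated assumptions, not the RDF::RDFa parser's actual code: `Node`, its `lang` field, and `effective_lang` are hypothetical names made up for illustration. It shows why a literal with no xml:lang of its own ends up tagged with the language declared on the html element.

```ruby
# Toy model of an element tree; NOT how RDF::RDFa is implemented.
# Node, lang, and effective_lang are hypothetical names for illustration.
Node = Struct.new(:name, :lang, :parent) do
  # Return the nearest language declared on this node or on an ancestor,
  # or nil if none is declared anywhere up the tree.
  def effective_lang
    node = self
    while node
      return node.lang if node.lang
      node = node.parent
    end
    nil
  end
end

html   = Node.new("html", "en", nil)   # like <html xml:lang="en">
body   = Node.new("body", nil, html)
plain  = Node.new("span", nil, body)   # no language of its own
tagged = Node.new("span", "de", body)  # like <span xml:lang="de">

puts plain.effective_lang   # inherited from the html element
puts tagged.effective_lang  # the element's own declaration wins
```

In this model, the German abstract on the DBPedia page corresponds to `plain`: with no xml:lang on its own element, it falls back to the page-level "en".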
Received on Monday, 28 February 2011 17:16:24 UTC