Re: Language options in rdf.rb

I filed this bug: https://sourceforge.net/tracker/?func=detail&aid=3195356&group_id=190976&atid=935520
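
For reference, the language inheritance rule described in the quoted message below can be sketched in plain Ruby. This is only an illustration, not the RDF::RDFa parser's actual code: the `Element` struct and its `language` method are hypothetical stand-ins for a parsed element tree, and the property names are made up for the example. It shows why a literal with no xml:lang of its own picks up "en" from the html element.

```ruby
# Hypothetical stand-in for a parsed (X)HTML element; not part of RDF.rb.
Element = Struct.new(:name, :attributes, :parent) do
  # Effective language of this element: the nearest xml:lang (or lang)
  # attribute found on the element itself or on one of its ancestors.
  def language
    node = self
    while node
      %w[xml:lang lang].each do |attr|
        next unless node.attributes.key?(attr)
        value = node.attributes[attr]
        # An empty value clears the inherited language (XML's xml:lang rule).
        return value.empty? ? nil : value.to_sym
      end
      node = node.parent
    end
    nil # no language declared anywhere in the ancestry
  end
end

# The page declares xml:lang="en" once, on the root element.
html = Element.new("html", { "xml:lang" => "en" }, nil)
body = Element.new("body", {}, html)

# A literal element without its own xml:lang inherits "en" from <html>,
# even if its text is actually German.
untagged = Element.new("span", { "property" => "ex:abstract" }, body)

# What the publisher would need to emit for the literal to be tagged :de.
tagged = Element.new("span", { "property" => "ex:abstract", "xml:lang" => "de" }, body)

p untagged.language # prints :en
p tagged.language   # prints :de
```

With the language set only on the root element, every literal on the page resolves to :en, which matches the behaviour Alex saw.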

Gregg

On Feb 28, 2011, at 9:23 AM, Alex Kremer wrote:

> Gregg,
> 
> That would explain it. I didn't even think to look at the RDFa source since the properties were being returned properly elsewhere. I will chase the issue up with them. Thanks for helping out!
> 
> -Alex
> 
> On Feb 28, 2011, at 6:14 PM, Gregg Kellogg wrote:
> 
>> Looking at the source, it's returned as text/html, meaning that it is parsed with the RDF::RDFa parser. The source is, indeed, in RDFa 1.0 format. This format takes a literal's language from the xml:lang or lang attribute of the element containing the literal, or of any element in its ancestry. In this case, the html element contains xml:lang="en". That's why the literal has a language tag of :en. It seems that DBPedia, in this case anyway, isn't properly attributing the language to the page.
>> 
>> If you get the RDF/XML version of the page (http://dbpedia.org/data/Vienna), they do properly set language tags, so you will get the proper language tag assigned to the literal.
>> 
>> It seems that DBPedia isn't properly setting xml:lang attributes on nodes when publishing the RDFa content. It would certainly be a good idea to file this as a bug at DBPedia. In the meantime, it is best to use the RDF/XML feed.
>> 
>> Gregg
>> 
>> On Feb 27, 2011, at 8:11 AM, Alex Kremer wrote:
>> 
>>> Hi,
>>> 
>>> Apologies if the following seems very elementary, but here goes:
>>> 
>>> I'm trying to retrieve an abstract from a DBPedia page in English. The problem is that rdf.rb seems to think every result it gets back is English, even results in other languages:
>>> 
>>> graph = RDF::Graph.load("http://dbpedia.org/page/Vienna")
>>> dbp = RDF::Vocabulary.new("http://dbpedia.org/ontology/")
>>> 
>>> query = RDF::Query.new(:article => {dbp.abstract => :abstract})
>>> => #<RDF::Query:0x1094812c8 @solutions=[], @options={}, @variables={}, @patterns=[#<RDF::Query::Pattern:0x84a40770(?article <http://dbpedia.org/ontology/abstract> ?abstract .)>]>
>>> a = query.execute(graph)
>>> 
>>> a.first
>>> <RDF::Query::Solution:0x84ba74d8({:abstract=>#<RDF::Literal:0x812bc7b8("Wien ist die Bundeshauptstadt der Republik \u00D6sterreich und zugleich eines der neun \u00F6sterreichischen (...shortened for brevity...) gefolgt von Z\u00FCrich und Genf an zweiter und dritter Stelle."@en)>, :article=>#<RDF::URI:0x81724044(http://dbpedia.org/resource/Vienna)>})>
>>> 
>>> As you can see, rdf.rb seems to think the language of the first abstract is English, when in fact it's German. If I query DBPedia via their SPARQL endpoint I do get correct results, so I am sure their data isn't the problem here. I tried to filter the solutions by language per http://rdf.rubyforge.org/RDF/Query/Solutions.html, but since they're all tagged with @en, they all come back when I ask for English.
>>> 
>>> Does anyone have any idea what could be causing this or how to solve it? Am I querying wrong? If so, how would I structure the query to get the proper language result? 
>>> 
>>> Thanks in advance!
>>> 
>>> -Alex
>>> 
>> 
> 

Received on Monday, 28 February 2011 19:04:58 UTC