Re: @lang and @datatype interaction

On Apr 17, 2013, at 5:13 AM, Reece Dunn <msclrhd@googlemail.com> wrote:

> Hi,
> 
> I am adding RDFa-based metadata to my website and have observed different behaviour in RDFa processors with @lang and @datatype interaction.
> 
> I have created two test cases:
> 
> cat > testcase1.html << EOF
> <!DOCTYPE html>
> <html xmlns="http://www.w3.org/1999/xhtml" prefix="dct: http://purl.org/dc/terms/ xsd: http://www.w3.org/2001/XMLSchema# s: http://schema.org/">
>  <head>
>   <title>Test Case</title>
>  </head>
>  <body about="http://example.org/foo">
>   <p property="dct:date" datatype="xsd:string" content="2010-11-12">2010</p>
>  </body>
> </html>
> EOF
> 
> and:
> 
> cat > testcase2.html << EOF
> <!DOCTYPE html>
> <html xmlns="http://www.w3.org/1999/xhtml" lang="en" prefix="dct: http://purl.org/dc/terms/ xsd: http://www.w3.org/2001/XMLSchema# s: http://schema.org/">
>  <head>
>   <title>Test Case</title>
>  </head>
>  <body about="http://example.org/foo">
>   <p property="dct:date" datatype="xsd:string" content="2010-11-12">2010</p>
>  </body>
> </html>
> EOF
> 
> The difference is that testcase2.html specifies an @lang attribute in the root html node.
> 
> NOTE: The xmlns="http://www.w3.org/1999/xhtml" is to work around a segfault in the version of rapper I am using (fixed in the latest release).
> 
> Running `rapper -i rdfa -o turtle testcase1.html` and `rapper -i rdfa -o turtle testcase2.html` I get:
> -----
> @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
> @prefix : <http://www.w3.org/1999/xhtml> .
> 
> <http://example.org/foo>
>     <http://purl.org/dc/terms/date> "2010-11-12"^^<http://www.w3.org/2001/XMLSchema#string> .
> -----
> 
> Using RDF::RDFa:
> -----
> require 'rdf/rdfxml'
> require 'rdf/rdfa'
> require 'rdf/turtle'
> 
> prefixes = {}
> graph1 = RDF::Graph.load("testcase1.html", :prefixes => prefixes)
> graph2 = RDF::Graph.load("testcase2.html", :prefixes => prefixes)
> 
> puts graph1.dump(:turtle, :prefixes => prefixes)
> puts '==='
> puts graph1.dump(:rdfxml, :prefixes => prefixes)
> puts '==='
> puts graph2.dump(:turtle, :prefixes => prefixes)
> puts '==='
> puts graph2.dump(:rdfxml, :prefixes => prefixes)
> -----
> 
> I get:
> -----
> @prefix dct: <http://purl.org/dc/terms/> .
> @prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
> 
> <http://example.org/foo> dct:date "2010-11-12"^^xsd:string .
> ===
> <?xml version="1.0" encoding="UTF-8"?>
> <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:dct="http://purl.org/dc/terms/">
>   <rdf:Description rdf:about="http://example.org/foo">
>     <dct:date rdf:datatype="http://www.w3.org/2001/XMLSchema#string">2010-11-12</dct:date>
>   </rdf:Description>
> </rdf:RDF>
> ===
> @prefix dct: <http://purl.org/dc/terms/> .
> @prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
> 
> <http://example.org/foo> dct:date "2010-11-12"@en^^xsd:string .
> ===
> <?xml version="1.0" encoding="UTF-8"?>
> <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:dct="http://purl.org/dc/terms/">
>   <rdf:Description rdf:about="http://example.org/foo">
>     <dct:date xml:lang="en">2010-11-12</dct:date>
>   </rdf:Description>
> </rdf:RDF>
> -----
> 
> Here, the turtle serialisation is wrong (a literal cannot have both a language and a datatype specifier) and the RDF/XML serialisation is using the language instead of the datatype. It looks like the RDFa processor is incorrectly including both a language and datatype.

This may be a recent regression in RDF::RDFa, which I'll look into. There has been some recent refactoring of the core RDF support, as xsd:string has a different meaning in RDF 1.1, but the lexical form is certainly wrong.

Likely, both the current datatype and the current language are passed in when the literal is created, and it's up to the literal class to get this right, which it isn't doing. Depending on which Turtle serializer you're using, this may be an interface issue.

> Is it possible to add these test cases to the EARL tests?

I would think we'd already have a similar test, but there may be something special about this particular form that's tickling the bug.
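
In the meantime, a quick way to spot this form in serializer output is a textual check. The following is only a rough sketch in plain Ruby, not part of RDF.rb or the RDFa test suite; the names LANG_AND_DATATYPE and suspicious_literals are made up for illustration, and the pattern only handles single-line, double-quoted literals:

```ruby
# Heuristic pattern: a double-quoted literal followed by a language tag
# AND a ^^datatype suffix, which Turtle and N-Triples do not allow.
# Assumption: short-form "..." literals only (no """long""" strings).
LANG_AND_DATATYPE = /"(?:[^"\\]|\\.)*"@[A-Za-z]+(?:-[A-Za-z0-9]+)*\^\^\S+/

# Hypothetical helper: returns every offending literal found in the text.
def suspicious_literals(serialized)
  serialized.scan(LANG_AND_DATATYPE)
end

bad  = '<http://example.org/foo> dct:date "2010-11-12"@en^^xsd:string .'
good = '<http://example.org/foo> dct:date "2010-11-12"^^xsd:string .'

p suspicious_literals(bad)
p suspicious_literals(good)
```

The first call should report the literal from your testcase2 output, and the second should report nothing.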

> What is the correct behaviour in this case? I am expecting @datatype to override @lang as the rapper tool is doing.

That is the correct behavior. It is invalid to specify both a language and a datatype. In RDF 1.1 a language-tagged literal does have the datatype rdf:langString, so that one combination would be consistent in the abstract syntax, but it still cannot be written with both a language tag and a datatype in lexical Turtle or N-Triples. This will need to be addressed for RDFa 1.1 in the future.
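
To make the expected precedence concrete, here is a minimal sketch in plain Ruby. It is not the RDF::RDFa or RDF.rb implementation; Literal and build_literal are hypothetical names, and it assumes my reading of the RDFa processing rules that a non-empty @datatype yields a typed literal with no language, while an empty or absent @datatype falls back to a plain literal that picks up the current language. Escaping of the lexical value and the rdf:XMLLiteral case are left out:

```ruby
# Hypothetical value object: holds either a datatype or a language, never both.
Literal = Struct.new(:value, :datatype, :language) do
  # N-Triples-like rendering (no escaping of the value, for brevity).
  def to_s
    if datatype
      %("#{value}"^^<#{datatype}>)
    elsif language
      %("#{value}"@#{language})
    else
      %("#{value}")
    end
  end
end

# Hypothetical constructor applying the precedence rule:
# a non-empty datatype wins and the inherited language is dropped.
def build_literal(value, datatype: nil, language: nil)
  if datatype && !datatype.empty?
    Literal.new(value, datatype, nil)
  else
    Literal.new(value, nil, language)
  end
end

XSD_STRING = "http://www.w3.org/2001/XMLSchema#string"

# testcase2.html: @lang="en" is in scope, but @datatype is given.
puts build_literal("2010-11-12", datatype: XSD_STRING, language: "en")
# No @datatype: the language in scope is kept.
puts build_literal("2010-11-12", language: "en")
```

With this rule, both of your test cases produce the same typed literal, which matches what rapper emits.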

Gregg

> Thanks,
> - Reece

Received on Wednesday, 17 April 2013 13:42:24 UTC