Re: @lang and @datatype interaction

On 17 April 2013 14:01, Gregg Kellogg <gregg@greggkellogg.net> wrote:

> On Apr 17, 2013, at 5:13 AM, Reece Dunn <msclrhd@googlemail.com> wrote:
>
> Hi,
>
> I am adding RDFa-based metadata to my website and have observed different
> behaviour in RDFa processors with @lang and @datatype interaction.
>
> I have created two test cases:
>
> cat > testcase1.html << EOF
> <!DOCTYPE html>
> <html xmlns="http://www.w3.org/1999/xhtml" prefix="dct:
> http://purl.org/dc/terms/ xsd: http://www.w3.org/2001/XMLSchema# s:
> http://schema.org/">
>  <head>
>   <title>Test Case</title>
>  </head>
>  <body about="http://example.org/foo">
>   <p property="dct:date" datatype="xsd:string"
> content="2010-11-12">2010</p>
>  </body>
> </html>
> EOF
>
> and:
>
> cat > testcase2.html << EOF
> <!DOCTYPE html>
> <html xmlns="http://www.w3.org/1999/xhtml" lang="en" prefix="dct:
> http://purl.org/dc/terms/ xsd: http://www.w3.org/2001/XMLSchema# s:
> http://schema.org/">
>  <head>
>   <title>Test Case</title>
>  </head>
>  <body about="http://example.org/foo">
>   <p property="dct:date" datatype="xsd:string"
> content="2010-11-12">2010</p>
>  </body>
> </html>
> EOF
>
> The difference is that testcase2.html specifies an @lang attribute in the
> root html node.
>
> NOTE: The xmlns="http://www.w3.org/1999/xhtml" is to work around a
> segfault in the version of rapper I am using (fixed in the latest release).
>
> Running `rapper -i rdfa -o turtle testcase1.html` and `rapper -i rdfa -o
> turtle testcase2.html` I get:
> -----
> @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
> @prefix : <http://www.w3.org/1999/xhtml> .
>
> <http://example.org/foo>
>     <http://purl.org/dc/terms/date> "2010-11-12"^^<http://www.w3.org/2001/XMLSchema#string> .
> -----
>
> Using RDF::RDFa:
> -----
> require 'rdf/rdfxml'
> require 'rdf/rdfa'
> require 'rdf/turtle'
>
> prefixes = {}
> graph1 = RDF::Graph.load("testcase1.html", :prefixes => prefixes)
> graph2 = RDF::Graph.load("testcase2.html", :prefixes => prefixes)
>
> puts graph1.dump(:turtle, :prefixes => prefixes)
> puts '==='
> puts graph1.dump(:rdfxml, :prefixes => prefixes)
> puts '==='
> puts graph2.dump(:turtle, :prefixes => prefixes)
> puts '==='
> puts graph2.dump(:rdfxml, :prefixes => prefixes)
> -----
>
> I get:
> -----
> @prefix dct: <http://purl.org/dc/terms/> .
> @prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
>
> <http://example.org/foo> dct:date "2010-11-12"^^xsd:string .
> ===
> <?xml version="1.0" encoding="UTF-8"?>
> <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
> xmlns:dct="http://purl.org/dc/terms/">
>   <rdf:Description rdf:about="http://example.org/foo">
>     <dct:date rdf:datatype="http://www.w3.org/2001/XMLSchema#string">2010-11-12</dct:date>
>   </rdf:Description>
> </rdf:RDF>
> ===
> @prefix dct: <http://purl.org/dc/terms/> .
> @prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
>
> <http://example.org/foo> dct:date "2010-11-12"@en^^xsd:string .
> ===
> <?xml version="1.0" encoding="UTF-8"?>
> <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
> xmlns:dct="http://purl.org/dc/terms/">
>   <rdf:Description rdf:about="http://example.org/foo">
>     <dct:date xml:lang="en">2010-11-12</dct:date>
>   </rdf:Description>
> </rdf:RDF>
> -----
>
> Here, the turtle serialisation is wrong (a literal cannot have both a
> language and a datatype specifier) and the RDF/XML serialisation is using
> the language instead of the datatype. It looks like the RDFa processor is
> incorrectly including both a language and datatype.
>
>
> This may be a recent regression in RDF::RDFa, which I'll look into. There
> has been some recent refactoring of the core RDF support, as xsd:string has
> a different meaning in RDF 1.1, but the lexical form is certainly wrong.
>

I am also seeing this with other datatypes specified. The case where I hit
it is <span property="s:countriesSupported" datatype="dct:RFC5646"
content="af">Afrikaans</span>.

Also, I mistyped the example (copy-paste error): the datatype should have
been xsd:date.


> Likely, both current datatype and language are provided when the literal
> is created, and it's up to the literal class to get this right, but it's
> not. Depending on which Turtle serializer you're using, this may be an
> interface issue.
>

$ gem list | grep rdf
rdf (1.0.5, 0.3.11.1)
rdf-rdfa (1.0.0)
rdf-rdfxml (1.0.1)
rdf-turtle (1.0.4)
rdf-xsd (1.0.0)
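
To make that concrete, here is a minimal sketch of what I would expect a
literal class to do when it is handed both a datatype and a language.
SketchLiteral and XSD_DATE are names made up for illustration; this is plain
Ruby, not the RDF::Literal API, and it does not escape the literal value.

```ruby
# Hypothetical stand-in for a literal class; this is NOT RDF::Literal.
XSD_DATE = "http://www.w3.org/2001/XMLSchema#date"

class SketchLiteral
  attr_reader :value, :datatype, :language

  # The caller (e.g. an RDFa reader tracking the current language) may pass
  # both options. A non-empty datatype takes precedence over the language.
  def initialize(value, datatype: nil, language: nil)
    @value = value.to_s
    if datatype && !datatype.empty?
      @datatype = datatype
      @language = nil        # typed literal: drop the inherited language
    else
      @datatype = nil
      @language = language   # plain literal: keep the language, if any
    end
  end

  # Turtle-like form using the full datatype IRI (no escaping of the value).
  def to_s
    if datatype
      %("#{value}"^^<#{datatype}>)
    elsif language
      %("#{value}"@#{language})
    else
      %("#{value}")
    end
  end
end

lit = SketchLiteral.new("2010-11-12", datatype: XSD_DATE, language: "en")
puts lit   # "2010-11-12"^^<http://www.w3.org/2001/XMLSchema#date>
```

With an empty datatype (as in test 0172) the same sketch keeps the language
and yields a plain language-tagged literal.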

> Is it possible to add these test cases to the EARL tests?
>
>
> I would think we'd already have a similar test, but there may be something
> special about this particular form that's tickling the bug.
>

I am seeing this when lang is specified on the html element and a datatype
on a child node. Test 0172 is closest, but uses datatype="". I don't see a
test that has a specific lang attribute and a non-empty datatype attribute.
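
The check such a test would need to make on the serialised output could be
sketched roughly as follows. The helper name and the regular expression are
my own simplification, not part of any RDF.rb gem or of the test suite, and
the pattern is far from a full Turtle grammar.

```ruby
# Simplified pattern: a double-quoted literal followed directly by a
# language tag and then a datatype marker. Not a full Turtle grammar.
LANG_AND_DATATYPE = /"(?:[^"\\]|\\.)*"@[A-Za-z]+(?:-[A-Za-z0-9]+)*\^\^/

# Returns the lines of +turtle+ that contain a literal carrying both a
# language tag and a datatype.
def lang_and_datatype_lines(turtle)
  turtle.each_line.map(&:chomp).select { |line| line =~ LANG_AND_DATATYPE }
end

bad  = %(<http://example.org/foo> dct:date "2010-11-12"@en^^xsd:string .)
good = %(<http://example.org/foo> dct:date "2010-11-12"^^xsd:string .)

puts lang_and_datatype_lines(bad).length   # 1
puts lang_and_datatype_lines(good).length  # 0
```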

> What is the correct behaviour in this case? I am expecting @datatype to
> override @lang as the rapper tool is doing.
>
> Thanks,
- Reece

Received on Wednesday, 17 April 2013 13:56:02 UTC