Re: Question about Microdata to RDF Note and lang attribute from Gregg Kellogg on 2012-09-11 (public-html-data-tf@w3.org from September 2012)

From: Gregg Kellogg <gregg@greggkellogg.net>
Date: Tue, 11 Sep 2012 01:54:40 -0400
To: KANZAKI Masahide <mkanzaki@gmail.com>
CC: "public-html-data-tf@w3.org" <public-html-data-tf@w3.org>, "public-vocabs@w3.org" <public-vocabs@w3.org>
Message-ID: <A35646F2-C3F9-4BAE-9295-5715B8B5DFA1@kellogg-assoc.com>

On Sep 10, 2012, at 9:52 PM, "KANZAKI Masahide" <mkanzaki@gmail.com> wrote:

Hello,

I have a question about examples in Appendix B in "Microdata to RDF"
Note [1], regarding @lang handling.

Given the first microdata example in Appendix B

[[
<dl itemscope
   itemtype="http://purl.org/vocab/frbr/core#Work"
   itemid="http://books.example.com/works/45U8QJGZSQKDH8N"
   lang="en">
<dt>Title</dt>
<dd><cite itemprop="http://purl.org/dc/terms/title">Just a Geek</cite></dd>
<dt>By</dt>
<dd><span itemprop="http://purl.org/dc/terms/creator">Wil Wheaton</span></dd>
...
]]

the second Turtle example shows the resulting RDF like

[[
<http://books.example.com/works/45U8QJGZSQKDH8N> a frbr:Work ;
 dc:creator "Wil Wheaton"@en ;
 dc:title "Just a Geek"@en ;
...
]]

However, according to the Algorithm, these literal nodes should not
have lang tag @en.


In section 1.1, the Note says

[[
although element names and HTML @lang attributes could be used to
provide datatype and language information for RDF data, this would be
contrary to the microdata specification.
]]

This is a description of the limitations of base microdata, when used to create the JSON serialization, not of the RDF mapping. In 1.1, we describe these limitations as violations of the base microdata spec.

and in 4.1, property value is defined as (after the @href and <time> treatment)

[[
Otherwise
The value is a plain literal created from element.itemValue with
language information set from the lang IDL attribute of the property
element.
]]

Therefore, lang tag of the resulting RDF node can be set only if the
element itself has @lang attribute (i.e. has lang IDL attribute
value), not its ancestors.

The HTML IDL attribute for .lang includes the @lang context of the element, including its ancestors. From [2]:

[[[
To determine the language of a node, user agents must look at the nearest ancestor element (including the element itself if the node is an element) that has a lang attribute in the XML namespace set or is an HTML element and has a lang in no namespace attribute set. That attribute specifies the language of the node (regardless of its value).
]]]
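
The rule quoted above can be illustrated with a minimal Python sketch. It is not the Note's algorithm or any processor's actual implementation: the class `LangTracker` and the function `extract` are hypothetical names, only the no-namespace lang attribute is considered (xml:lang is ignored), and the markup is assumed to use explicit end tags with no void elements.

```python
from html.parser import HTMLParser


class LangTracker(HTMLParser):
    """Collect (itemprop, text, language) tuples, inheriting lang from ancestors."""

    def __init__(self):
        super().__init__()
        self.lang_stack = [None]  # language in effect for each open element
        self.open_props = []      # [depth, itemprop name, language, text parts]
        self.results = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        # Nearest-ancestor rule: use the element's own lang if present,
        # otherwise whatever language the enclosing element had.
        lang = attrs.get("lang", self.lang_stack[-1])
        self.lang_stack.append(lang)
        if "itemprop" in attrs:
            depth = len(self.lang_stack)
            self.open_props.append([depth, attrs["itemprop"], lang, []])

    def handle_endtag(self, tag):
        # Close the innermost property if this end tag matches its depth.
        if self.open_props and self.open_props[-1][0] == len(self.lang_stack):
            _, name, lang, parts = self.open_props.pop()
            self.results.append((name, "".join(parts), lang))
        if len(self.lang_stack) > 1:
            self.lang_stack.pop()

    def handle_data(self, data):
        # Text belongs to every property element that is currently open.
        for prop in self.open_props:
            prop[3].append(data)


def extract(html):
    """Return a list of (itemprop, text, language) tuples for the given markup."""
    parser = LangTracker()
    parser.feed(html)
    parser.close()
    return parser.results


SAMPLE = """
<dl itemscope itemtype="http://purl.org/vocab/frbr/core#Work" lang="en">
<dt>Title</dt>
<dd><cite itemprop="http://purl.org/dc/terms/title">Just a Geek</cite></dd>
<dt>By</dt>
<dd><span itemprop="http://purl.org/dc/terms/creator">Wil Wheaton</span></dd>
</dl>
"""

if __name__ == "__main__":
    for name, text, lang in extract(SAMPLE):
        print(name, repr(text), lang)
```

Under these assumptions, both property values in the sample pick up "en" from the enclosing dl element, even though neither cite nor span carries its own lang attribute.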

I wonder whether the above examples contradict the Algorithm and should
be noted in the errata. Or am I missing some points, previous
discussions, etc.?

I believe the examples are consistent with the algorithm.

Gregg

cheers,

[1] http://www.w3.org/TR/microdata-rdf/

[2] http://www.w3.org/TR/2011/WD-html5-20110525/elements.html#the-lang-and-xml:lang-attributes


--
@prefix : <http://www.kanzaki.com/ns/sig#> . <> :from [:name
"KANZAKI Masahide"; :nick "masaka"; :email "mkanzaki@gmail.com"].

Received on Tuesday, 11 September 2012 05:55:42 UTC