Re: @lang and @xml:lang in XHTML+RDFa 1.1

> <html lang="ja" xml:lang="ja">
> ...
> <p xmlns:dc="http://purl.org/dc/terms/">
> Updated: <span property="dc:modified">2010-01-13</span>...
> </p>
>
> which generates weird triple:
>
> <> dc:modified "2010-01-13"@ja .

It is a weird triple, but the XHTML itself is at fault -- the RDF
generated suffers as a consequence.

Note that the word "Updated", which is an English word and not, as far
as I know, a Japanese one, is tagged as being in Japanese too, and will
be interpreted as such by every implementation of the xml:lang attribute
-- not just RDFa processors.

This is an annoyance of language tagging in XHTML generally, and I don't
think it's RDFa's job to fix it. RDFa should simply use XHTML's built-in
mechanism for declaring languages (no matter how annoying it may be to use
correctly) rather than trying to invent its own rules.

One minor tweak that *might* alleviate some of the pain in authoring
documents that include multiple languages and/or a mixture of linguistic
and non-linguistic content would be to ask RDFa processors to implement
special handling for a few ISO 639-2 codes. Here are my suggestions:

1. "mul" is the code for "multiple languages". An element with
xml:lang="mul" would generate a literal tagged with language @mul, as
you'd expect, but it would be treated the same as xml:lang="" when it
comes to descendant elements inheriting the language. Example:

  <div xml:lang="mul" property="ex:test1" content="Foo">
    <span property="ex:test2">Bar</span>
    <span property="ex:test3" xml:lang="en">Baz</span>
  </div>

would generate:

  <> ex:test1 "Foo"@mul .
  <> ex:test2 "Bar" .
  <> ex:test3 "Baz"@en .

This would allow authors to mark up the fact that an area of the page
contains multiple languages, and that RDFa processors should not try to
interpret the language of descendant elements without further prompts.

2. "zxx" is the code for non-linguistic content. Processors could
recognise xml:lang="zxx" as being equivalent to xml:lang="".
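
To make the two rules concrete, here is a rough Python sketch of how a
processor might resolve the language of a literal. It is only an
illustration, not taken from any existing RDFa implementation: the
function name effective_language and the "list of xml:lang values from
the root down" representation are made up for this example, and I'm
assuming the codes are matched case-insensitively and only as bare codes
(nothing like "mul-Latn").

```python
from typing import Optional, Sequence


def effective_language(path: Sequence[Optional[str]]) -> Optional[str]:
    """Return the language tag for a literal on the last element of `path`.

    `path` holds the xml:lang value of each element from the root down to
    the element in question; None means the attribute is absent.
    A return value of None means "plain literal, no language tag".
    """
    inherited: Optional[str] = None  # language handed down to children
    own: Optional[str] = None        # language of the current element
    for lang in path:
        if lang is None:
            # No xml:lang here: fall back to whatever was handed down.
            own = inherited
            continue
        code = lang.lower()
        if code in ("", "zxx"):
            # Suggestion 2: "zxx" behaves exactly like xml:lang="".
            own = inherited = None
        elif code == "mul":
            # Suggestion 1: tag this element @mul, hand nothing down.
            own = "mul"
            inherited = None
        else:
            # Ordinary language code: applies here and to descendants.
            own = inherited = lang
    return own


# The three elements from the example in suggestion 1:
print(effective_language(["mul"]))        # div,  ex:test1
print(effective_language(["mul", None]))  # span, ex:test2
print(effective_language(["mul", "en"]))  # span, ex:test3
```

With these rules the "Updated" example would still come out as @ja,
since nothing between the html element and the span overrides it; the
author would need xml:lang="zxx" (or xml:lang="") on the span carrying
the date to get a plain literal.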

-Toby

Received on Wednesday, 13 January 2010 14:19:52 UTC