RE: Multi-Lingual Pages

Doing a bit more investigating this morning, I found the multilingual-examples page [1] on the microformats wiki. It appears that some microformats examples already use the lang attribute in practice, and I would assume that microformats 2 would follow this convention as well.

It seems that all three syntaxes could align on using the lang attribute, which fits with the proposed guidelines. I haven't received any direct feedback from the microformats folks yet; I will update the list when I do.
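
To make the "align on lang" idea concrete, here is a rough sketch of how a consumer might resolve the language in effect for each marked-up value by inheriting lang from ancestor elements. It is only an illustration: the class name, the extract() helper and the choice of the itemprop attribute are made up for this example, and it assumes well-formed markup with explicit closing tags.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag; they are not tracked on the stack.
VOID = {"area", "base", "br", "col", "embed", "hr", "img", "input",
        "link", "meta", "source", "track", "wbr"}


class LangAwareExtractor(HTMLParser):
    """Collect (name, text, language) for every element carrying `attr`."""

    def __init__(self, attr="itemprop"):
        super().__init__()
        self.attr = attr
        self.stack = []       # one (tag, effective_lang) entry per open element
        self.open_props = []  # [name, lang, text_parts, stack_depth]
        self.results = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID:
            return
        attrs = dict(attrs)
        # Inherit the language of the parent unless this element sets its own.
        lang = self.stack[-1][1] if self.stack else None
        if "lang" in attrs:
            lang = attrs["lang"] or None  # lang="" is treated as "unknown"
        self.stack.append((tag, lang))
        if self.attr in attrs:
            self.open_props.append([attrs[self.attr], lang, [], len(self.stack)])

    def handle_data(self, data):
        # Text belongs to every property element that is currently open.
        for prop in self.open_props:
            prop[2].append(data)

    def handle_endtag(self, tag):
        if tag in VOID or not self.stack:
            return
        depth = len(self.stack)
        self.stack.pop()
        # Close the innermost property if it was opened at this depth.
        if self.open_props and self.open_props[-1][3] == depth:
            name, lang, parts, _ = self.open_props.pop()
            self.results.append((name, "".join(parts).strip(), lang))


def extract(html, attr="itemprop"):
    """Hypothetical helper: parse `html` and return the collected triples."""
    parser = LangAwareExtractor(attr)
    parser.feed(html)
    parser.close()
    return parser.results
```

Calling extract() on a fragment such as `<div lang="en"><h1 itemprop="title">English title</h1><h1 itemprop="title" lang="cy">Teitl Cymraeg</h1></div>` should tag the first title as "en" (inherited from the div) and the second as "cy". Passing attr="property" would apply the same walk to RDFa-style markup.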

J


[1] http://microformats.org/wiki/multilingual-examples 


-----Original Message-----
From: Jeni Tennison [mailto:jeni@jenitennison.com]
Sent: Mon 10/3/2011 2:36 AM
To: Myers, Jay
Cc: public-html-data-tf@w3.org
Subject: Re: Multi-Lingual Pages
 
Thank you, Jay :)

On 2 Oct 2011, at 22:40, Myers, Jay wrote:

> I will volunteer to investigate Microformat language support.
> 
> Thanks,
> 
> Jay
> 
> 
> On Oct 2, 2011, at 5:22 AM, "Jeni Tennison" <jeni@jenitennison.com> wrote:
> 
>> USE CASES
>> 
>> The microformats community has already collected a bunch of use cases for multi-lingual HTML documents [1].
>> 
>> My own use case is that on legislation.gov.uk we have items of legislation that are published in Welsh and English, and we want to be able to distinguish between the Welsh and English titles and descriptions when we list them.
>> 
>> On the consumer side, I imagine that those consumers who gather data across the web need to be able to distinguish between information about the same entity provided in different languages.
>> 
>> DISCUSSION
>> 
>> HTML has the lang attribute to indicate the language of a particular part of a document, which is reflected in the lang property within the DOM.
>> 
>> # Microformats #
>> 
>> Microformat processors could theoretically pick up on the language of a value when mapping into other formats. For example, hCalendar [2] processors could use the HTML language to provide a value for the LANGUAGE parameter in iCalendar [3]; hCard [4] processors could do similarly when mapping to vCard [5]. However, I can't see anything in the wiki specifying this. There's also nothing about language support in microformats-2 [6].
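
A minimal sketch of the mapping described above, assuming a processor has already worked out the HTML language of a value. Nothing here is specified by the microformats wiki; the function names are hypothetical, and line folding and other iCalendar details are left out.

```python
def escape_text(value):
    """Escape a string for use as an iCalendar TEXT value."""
    return (value.replace("\\", "\\\\")
                 .replace(";", "\\;")
                 .replace(",", "\\,")
                 .replace("\n", "\\n"))


def ical_property(name, value, lang=None):
    """Format one content line, adding LANGUAGE only when a language is known."""
    param = f";LANGUAGE={lang}" if lang else ""
    return f"{name}{param}:{escape_text(value)}"
```

For example, ical_property("SUMMARY", "Cyfarfod", "cy") gives `SUMMARY;LANGUAGE=cy:Cyfarfod`, while omitting the language gives a plain `SUMMARY:` line.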
>> 
>> Does anyone here know anything more about microformat language support? Would someone volunteer to investigate?
>> 
>> # RDFa #
>> 
>> RDFa processors use the lang attribute when generating RDF [7], which supports language-tagged plain literals [8].
>> 
>> # Microdata #
>> 
>> The microdata data model [9] doesn't support language-tagged values and nor does microdata+json [10], but the lang DOM property is accessible through the API.
>> 
>> I think it's probably worth raising this as a bug report on microdata. Could someone with experience of raising bug reports on HTML5 put together some wording?
>> 
>> Assuming nothing changes, I can see a couple of possible workarounds here:
>> 
>>   * using different properties for values in different languages
>>   * using values that are items with 'value' and 'lang' properties
>> 
>> but neither of these reuses the HTML lang attribute, which is the natural way for users to indicate language.
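
The second workaround could look roughly like this on the consumer side. This is a sketch only: the nested "properties" layout loosely imitates a microdata JSON serialisation and is an assumption, the 'value' and 'lang' property names come from the workaround above, and the helper names and sample titles are made up.

```python
def tagged(value, lang):
    """Build a nested item standing in for a language-tagged value."""
    return {"properties": {"value": [value], "lang": [lang]}}


def values_in(item, prop, lang):
    """Return the strings of `prop` whose nested 'lang' equals `lang`."""
    found = []
    for entry in item["properties"].get(prop, []):
        if not isinstance(entry, dict):
            continue  # plain string: no language information available
        props = entry.get("properties", {})
        if lang in props.get("lang", []):
            found.extend(props.get("value", []))
    return found


# Illustrative item with one title per language.
act = {"properties": {"title": [tagged("English title", "en"),
                                tagged("Teitl Cymraeg", "cy")]}}
```

Here values_in(act, "title", "cy") picks out only the Welsh title. The cost is that every consumer has to know about the wrapper item, which is why a shared, standard type would matter.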
>> 
>> Is it worth us pushing these as best practices for people using microdata? If we were to, I think we should propose the creation of a standard language-tagged-value type in a common namespace, and push consumers to recognise it. Probably we should see what emerges from a bug report on microdata before spending time on this.
>> 
>> PROPOSED GUIDELINES
>> 
>> 1. Use the HTML lang attribute to indicate the language of different parts of the page
>> 
>> 2. If you are publishing pages that contain multiple languages, use RDFa [or microformats; pending input on microformats support] to mark up your data
>> 
>> 3. If you are consuming information from a set of pages that use different languages, ensure your data model includes language tagging and that your processor uses the lang DOM property when interpreting values
>> 
>> 
>> [1]:  http://microformats.org/wiki/multilingual-brainstorming
>> [2]:  http://microformats.org/wiki/hcalendar
>> [3]:  http://tools.ietf.org/html/rfc5545#section-3.2.10
>> [4]:  http://microformats.org/wiki/hcard
>> [5]:  http://tools.ietf.org/html/rfc6350#section-5.1
>> [6]:  http://microformats.org/wiki/microformats-2
>> [7]:  http://www.w3.org/TR/rdfa-core/#T-current-language
>> [8]:  http://www.w3.org/TR/rdf-concepts/#dfn-plain-literal
>> [9]:  http://dev.w3.org/html5/md/#the-microdata-model
>> [10]: http://dev.w3.org/html5/md/#json
>> --
>> Jeni Tennison
>> http://www.jenitennison.com
>> 
>> 
>> 

-- 
Jeni Tennison
http://www.jenitennison.com

Received on Tuesday, 11 October 2011 07:26:07 UTC