Re: Unicode language character

"Karl Ove Hufthammer" <huftis@bigfoot.com> wrote:

> I remember having read somewhere that there is a proposed Unicode
> character for specifying the language of a block of text. Does anybody
> know anything about this? (I may be misinformed ...)

I think that's Unicode Technical Report #7, "Plane 14 Characters for
Language Tags".  It's available from:

    http://www.unicode.org/unicode/reports/tr7/

Note that although this TR has already been approved by the Unicode
Technical Committee (UTC), those "language tags" are NOT yet part of
the Unicode Standard Version 3.0 or of ISO/IEC 10646, so they cannot
be used in HTML or XML yet.
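
As a rough illustration of what TR7 describes (my own sketch, not code
from the report): a tag starts with U+E0001 LANGUAGE TAG, followed by
tag characters in the range U+E0020..U+E007E that mirror printable
ASCII.  The function names below are assumptions made up for this
example.

```python
# Plane 14 tag characters: U+E0001 LANGUAGE TAG introduces a tag, and
# U+E0020..U+E007E mirror the printable ASCII range U+0020..U+007E.
LANGUAGE_TAG = 0xE0001
TAG_OFFSET = 0xE0000


def encode_language_tag(lang: str) -> str:
    """Return the tag-character sequence for an ASCII language code.

    Hypothetical helper for illustration only.
    """
    if not all(0x20 <= ord(ch) <= 0x7E for ch in lang):
        raise ValueError("language code must be printable ASCII")
    return chr(LANGUAGE_TAG) + "".join(chr(TAG_OFFSET + ord(ch)) for ch in lang)


def strip_tag_characters(text: str) -> str:
    """Remove every Plane 14 tag character (U+E0000..U+E007F) from text."""
    return "".join(ch for ch in text if not 0xE0000 <= ord(ch) <= 0xE007F)


tagged = encode_language_tag("fr") + "bonjour"
print([f"U+{ord(ch):04X}" for ch in tagged[:3]])
# prints ['U+E0001', 'U+E0066', 'U+E0072']
print(strip_tag_characters(tagged))
# prints bonjour
```

The stripping function shows what a processor that ignores these
characters would roughly do with them.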

> If it *is* true, then I think it's interesting from an accessibility
> perspective. We'll finally be able to change language in the middle
> of attributes (e.g. in the 'alt' attribute). Any thoughts on this?

There is a joint W3C-Unicode Technical Report #20, "Unicode in XML
and other Markup Languages", at:

    http://www.unicode.org/unicode/reports/tr20/

Section 3.8, "Language Tag Characters", deals with those characters,
and the TR recommends against using them in markup languages.
I do understand your motivation, but whenever possible it is better
to avoid putting important information in attribute values and to use
elements with the proper attributes instead (e.g. "lang" in HTML or
"xml:lang" in XML).

As for the "alt" attribute, the HTML Working Group is planning to
improve the syntax of the "img" element in the next generation of
XHTML so that alt text can be specified as the element's content
rather than as an attribute value.  In that case you will be able to
change language in the middle of alt text by putting the "xml:lang"
attribute on a "span" or whatever element is appropriate.
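
To make the idea concrete, here is a minimal sketch.  The markup in
it is purely hypothetical, since the future syntax has not been
decided; it only assumes that alt text becomes element content.  The
helper name text_runs is likewise made up for this example.  It walks
the content and reports which language applies to each run of text.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the predefined "xml:" prefix under this namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def text_runs(elem, inherited=None):
    """Yield (language, text) pairs for elem's content, in document order.

    xml:lang is inherited by descendants unless they override it.
    """
    lang = elem.get(XML_LANG, inherited)
    if elem.text:
        yield lang, elem.text
    for child in elem:
        yield from text_runs(child, lang)
        if child.tail:
            # Text after a child belongs to the parent, so it gets the
            # parent's language again.
            yield lang, child.tail


# Hypothetical markup: alt text as element content, not an attribute.
markup = (
    '<img src="fjord.png" xml:lang="en">A photo of '
    '<span xml:lang="no">Hardangerfjorden</span> in summer</img>'
)

runs = list(text_runs(ET.fromstring(markup)))
for lang, text in runs:
    print(lang, repr(text))
```

The same text in an attribute value could carry only one language,
because an attribute value cannot contain child elements.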

Regards,
-- 
Masayasu Ishikawa / mimasa@w3.org
W3C - World Wide Web Consortium

Received on Tuesday, 6 June 2000 23:20:19 UTC