Re: Unicode language character

From: Karl Ove Hufthammer <huftis@bigfoot.com>
Date: Thu, 8 Jun 2000 15:11:10 +0200
Message-ID: <000101bfd163$c70f4f20$6a369fc3@huftis>
To: "Masayasu Ishikawa" <mimasa@w3.org>
Cc: <w3c-wai-ig@w3.org>

"Masayasu Ishikawa" <mimasa@w3.org> wrote:
| "Karl Ove Hufthammer" <huftis@bigfoot.com> wrote:
| > I remember having read somewhere that there is a proposed Unicode
| > character for specifying the language of a block of text. Does
| > anyone know anything about this? (I may be misinformed ...)
| I think that's the Unicode Technical Report #7 "Plane 14 Characters
| for Language Tags".  It's available from:
|     http://www.unicode.org/unicode/reports/tr7/
| Note that although this TR has already been approved by the Unicode
| Technical Committee (UTC), those "language tags" are NOT part of
| the Unicode Standard Version 3.0 nor ISO/IEC 10646 yet, so they
| cannot be used in HTML nor XML yet.

Thank you for the information.
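
For reference, here is a minimal sketch of how I understand the TR #7
scheme to work: U+E0001 (LANGUAGE TAG) introduces a tag, and each
printable ASCII character of the language code is then mirrored into
Plane 14 by adding 0xE0000. The function names and the sample alt text
are my own illustrations, not anything defined by the TR.

```python
# Sketch of the Plane 14 language tag scheme described in Unicode TR #7.
LANGUAGE_TAG = 0xE0001  # U+E0001 LANGUAGE TAG, introduces a language tag
CANCEL_TAG = 0xE007F    # U+E007F CANCEL TAG
TAG_OFFSET = 0xE0000    # tag characters mirror ASCII 0x20..0x7E at U+E0020..U+E007E


def encode_language_tag(tag: str) -> str:
    """Return the invisible Plane 14 sequence for a tag such as 'en' or 'nn'."""
    for ch in tag:
        if not 0x20 <= ord(ch) <= 0x7E:
            raise ValueError(f"not a printable ASCII character: {ch!r}")
    return chr(LANGUAGE_TAG) + "".join(chr(TAG_OFFSET + ord(ch)) for ch in tag)


def cancel_language_tag() -> str:
    """Return the sequence that cancels the language tag currently in effect."""
    return chr(LANGUAGE_TAG) + chr(CANCEL_TAG)


# Hypothetical attribute value that switches language mid-string.
alt_text = (
    encode_language_tag("en") + "Photo of "
    + encode_language_tag("nn") + "Hurtigruta"
)

print([f"U+{ord(c):04X}" for c in encode_language_tag("en")])
# ['U+E0001', 'U+E0065', 'U+E006E']
```

The tag characters have no visible rendering, so the string above would
display as plain "Photo of Hurtigruta" while still carrying the language
switch inline.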

| > If it *is* true, then I think it's interesting from an accessibility
| > perspective. We'll finally be able to change language in the middle
| > of attributes (e.g. in the 'alt' attribute). Any thoughts on this?
| There is a joint W3C-Unicode Technical Report #20, "Unicode in XML
| and other Markup Languages", at:
|     http://www.unicode.org/unicode/reports/tr20/
| "3.8 Language Tag Characters" deals with those characters, and this TR
| recommends against using those language tags in markup languages.
| I do understand your motivation, but whenever possible, you'd better
| avoid using attributes to include important information and use
| elements with proper attributes (e.g. "lang" in HTML or "xml:lang"
| in XML).

Well, I think making it *impossible* to specify the language of parts of
an attribute value is a very bad idea, especially in a «new» technology
like XML.

This is of course not just a problem with (X)HTML; it makes it
impossible to specify language information for parts of attribute values
in *all* XML applications. This is very unfortunate.

| As for the "alt" attribute, the HTML Working Group is planning to
| improve the syntax of the "img" element so that alt text can be
| specified as an element's content rather than an attribute value
| in the next generation of XHTML.  In this case you may change
| language in the middle of alt text using the "xml:lang" attribute
| via "span" or whatever appropriate element.

Well, I can do this already by using the 'object' element, but there are
still problems. You can't use 'lang' or 'xml:lang' inside the 'title' or
'summary' attribute. It's often unavoidable to use words from several
languages in attribute values (e.g. proper names).
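
To make the limitation concrete, here is a small sketch using Python's
standard xml.etree.ElementTree parser. The fragment and the helper name
are hypothetical; the point is only that element content can carry a
nested 'xml:lang', while an attribute value is a flat string in which
markup is not even well-formed.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the predefined XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical fragment: the same words appear as element content and as a title.
doc = ET.fromstring(
    '<p xml:lang="en" title="Photo of Hurtigruta">'
    'Photo of <span xml:lang="nn">Hurtigruta</span>'
    '</p>'
)

# Element content can switch language through a child element.
span = doc.find("span")
print(doc.get(XML_LANG), span.get(XML_LANG))

# The attribute value is just a string; it has no children to hang xml:lang on.
title = doc.get("title")
print(type(title).__name__, title)


def attribute_markup_is_rejected() -> bool:
    """Return True if markup inside an attribute value fails to parse."""
    try:
        ET.fromstring('<p title="Photo of <span>Hurtigruta</span>"/>')
    except ET.ParseError:
        return True
    return False


print(attribute_markup_is_rejected())
```

So the only language information that applies to 'title' here is the
"en" inherited from the element itself, which is wrong for the proper
name.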

Karl Ove Hufthammer
Received on Thursday, 8 June 2000 12:08:57 UTC

