Re: Comments on Authoring Techs for XHTML & HTML I18N from Martin Duerst on 2005-09-21 (www-international@w3.org from July to September 2005)

From: Martin Duerst <duerst@it.aoyama.ac.jp>
Date: Wed, 21 Sep 2005 18:19:15 +0900
To: Bjoern Hoehrmann <derhoermi@gmx.net>
Cc: Christophe Strobbe <christophe.strobbe@esat.kuleuven.be>, www-international@w3.org
Message-Id: <6.0.0.20.2.20050921180445.094357d0@localhost>

At 13:07 05/09/21, Bjoern Hoehrmann wrote:
 >
 >* Martin Duerst wrote:
 >> >Technique 1 or 8: What would you recommend for content that has no
 >>natural language, e.g. type samples that include Latin, Greek and Cyrillic
 >>characters? (Joe Clark brought this issue to the attention of the WCAG WG:
 >>http://lists.w3.org/Archives/Public/w3c-wai-gl/2005AprJun/0144.html.)
 >>
 >>Use lang='' / xml:lang='', i.e. the empty string.
 >
 >Note that this is not allowed in HTML or XHTML. Using "und" is allowed.

Besides the point brought up by Tex: the last time I looked,
HTML4 was still referencing RFC 1766, which excludes a lot of
tags for perfectly fine languages, and will exclude even more
once RFC 3066bis is approved. Holding HTML4 to that restriction
clearly doesn't make sense. The DTD, of course, allows an empty
lang attribute, so anything with lang='' is perfectly valid HTML4.

As for XHTML, there are a lot of versions, and I have only
checked 1.0, but for that case, Section C.7 (at
http://www.w3.org/TR/2002/REC-xhtml1-20020801/#C_7) says:

     Use both the lang and xml:lang attributes when specifying the
     language of an element. The value of the xml:lang attribute
     takes precedence.

As xml:lang allows an empty string (by virtue of the XML REC
allowing it explicitly, just for the purpose described by
Christophe), and XHTML doesn't disallow it, we are
perfectly fine for XHTML!
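
As a rough illustration of the point above (a sketch only, not taken from any spec or from the Techniques document), the following Python snippet parses a small XHTML fragment and resolves the language in effect for an element, using the XML rule that xml:lang applies to an element and its content unless overridden, and that the empty string means no language information is available. The sample markup, the id values, and the helper name effective_lang are all made up for this example.

```python
import xml.etree.ElementTree as ET

# The xml: prefix is bound to this namespace by the XML Namespaces REC.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical XHTML fragment: an English page containing a type sample
# (Latin, Greek, Cyrillic) that is marked as having no natural language.
SAMPLE = """<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en">
  <body>
    <p id="intro">Type sample:
      <span id="sample" lang="" xml:lang="">ABC &#x391;&#x392;&#x393; &#x410;&#x411;&#x412;</span>
    </p>
  </body>
</html>"""


def effective_lang(element, parents):
    """Return the language in effect for element, or None if there is none.

    Walks from the element up through its ancestors and stops at the
    nearest xml:lang attribute. An empty value overrides the inherited
    language and is reported as None.
    """
    node = element
    while node is not None:
        value = node.get(XML_LANG)
        if value is not None:
            return value or None
        node = parents.get(node)
    return None


root = ET.fromstring(SAMPLE)
# ElementTree has no parent pointers, so build a child -> parent map.
parents = {child: parent for parent in root.iter() for child in parent}
by_id = {el.get("id"): el for el in root.iter() if el.get("id")}

print(effective_lang(by_id["intro"], parents))
print(effective_lang(by_id["sample"], parents))
```

The paragraph inherits the language from the root element, while the span with xml:lang="" cancels that inheritance. An XML parser accepts the empty value without complaint, which is all the argument above relies on.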

Regards,    Martin.

Received on Wednesday, 21 September 2005 09:26:16 UTC