RE: How do I say 'this is not in any language' in XHTML/HTML

I have attempted to summarise the comments on this thread at 
http://esw.w3.org/topic/geoNoLanguageTag

I'm still not clear about the distinction between xml:lang="" and xml:lang="und".  Any suggestions?
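
To make the question concrete, here is a minimal sketch in Python (standard library only) of how a consumer might work out the language value in scope for each element, and what each candidate value would then tell it. The sample markup, the part number, the element ids and the helper names effective_langs and describe are all made up for illustration. The glosses are just the wordings quoted in the summary below, not an authoritative reading of the XML spec or of the subtag registry.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the predefined XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical sample: English content with three ways of marking a part number.
SAMPLE = """\
<div xml:lang="en">
  <p id="inherit">Plain English text.</p>
  <p id="empty" xml:lang="">XJ-4471-B</p>
  <p id="und" xml:lang="und">XJ-4471-B</p>
  <p id="zxx" xml:lang="zxx">XJ-4471-B</p>
</div>
"""


def effective_langs(root):
    """Map each element id to the xml:lang value in scope for that element."""
    result = {}

    def walk(elem, inherited):
        # An explicit xml:lang, even an empty one, overrides the inherited value.
        lang = elem.attrib.get(XML_LANG, inherited)
        if "id" in elem.attrib:
            result[elem.attrib["id"]] = lang
        for child in elem:
            walk(child, lang)

    walk(root, None)
    return result


def describe(lang):
    """Gloss a value using the wordings quoted in this thread (assumed, not normative)."""
    if lang is None:
        return "no declaration in scope"
    if lang == "":
        return "no language information available"
    if lang == "und":
        return "undetermined"
    if lang == "zxx":
        return "no linguistic content"
    return "language tag " + repr(lang)


for elem_id, lang in effective_langs(ET.fromstring(SAMPLE)).items():
    print(elem_id, repr(lang), describe(lang))
```

As far as this sketch goes, the only mechanical difference is that xml:lang="" cancels the inherited "en" without asserting anything in its place, whereas "und" is passed along as a tag in its own right. Whether that difference matters in practice is the part I'm unsure about.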

RI

PS: Please don't edit the wiki. Respond via www-international.

============
Richard Ishida
Internationalization Lead
W3C (World Wide Web Consortium)
 
http://www.w3.org/People/Ishida/
http://www.w3.org/International/
http://people.w3.org/rishida/blog/
http://www.flickr.com/photos/ishida/
 
 

> -----Original Message-----
> From: www-international-request@w3.org 
> [mailto:www-international-request@w3.org] On Behalf Of Richard Ishida
> Sent: 13 March 2007 19:12
> To: www-international@w3.org
> Subject: How do I say ‘this is not in any language’ in XHTML/HTML
> 
> 
> This is an attempt to summarise and move forward some ideas 
> in a thread on www-international@w3.org by Christophe 
> Strobbe, Martin Duerst, Bjoern Hoermann and Tex Texin.
> http://lists.w3.org/Archives/Public/www-international/2005JulSep/0163.html
> 
> 
> 
> You should always use the lang and/or xml:lang attributes in 
> HTML or XHTML to identify the human language of the content 
> so that applications such as voice browsers, style sheets, 
> and the like can process that text. (See Declaring Language 
> in XHTML and HTML[1] for the details.)
> 
> You can override that language setting for a part of the 
> document that is in a different language, e.g. some French 
> quotation in an English document, by using the same 
> attribute(s) around the relevant bit of text.
> 
> Suppose you have some text that is not in any language, such 
> as type samples, part numbers, perhaps program code. How 
> would you say that this was no language in particular?
> 
> There are a number of possible approaches:
> 
>    1. A few years ago we introduced into the XML spec the 
> idea that xml:lang="" conveys that ‘there is no language 
> information available’. (See 2.12 Language Identification[2])
> 
>    2. An alternative is to use the value ‘und’, for ‘undetermined’.
> 
>    3. In the IANA Subtag Registry[3] there is another tag, 
> ‘zxx’, that means ‘No linguistic content’. Perhaps this is a 
> better choice. It has my vote at the moment.
> 
> 
> 
> [xml:lang=""]
> Is ‘no language information available’ suitable to express 
> ‘this is not a language’? My feeling is not.
> 
> If it were appropriate, there are some other questions to be 
> answered here. With HTML an empty string value for the lang 
> or xml:lang attribute produces a validation error.
> 
> It seems to me that the validator should not produce an error 
> for xml:lang="". It needs to be fixed.
> 
> I’m not clear whether the HTML DTD supports an empty string 
> value for lang. If so, then presumably the validator needs to 
> be fixed. If not, then this is not a viable option, since 
> you’d really want both lang and xml:lang to have the same values.
> 
> [und]
> Would the description ‘undetermined’ fit this case, given 
> that it is not a language at all? Again, it doesn’t seem 
> right to me, since ‘undetermined’ seems to suggest that it is 
> a language of some sort, but we’re not sure which.
> 
> [zxx]
> This seems to be the right choice for me. It would produce no 
> validation issues. The only issue is perhaps that it’s not 
> terribly memorable.
> 
> Thoughts?
> 
> RI
> 
> 
> [1] http://www.w3.org/International/tutorials/language-decl/
> 
> [2] http://www.w3.org/TR/REC-xml/#sec-lang-tag
> 
> [3] http://www.iana.org/assignments/language-subtag-registry
> 
> 


Received on Thursday, 22 March 2007 14:17:37 UTC