Re: Language labelling

Robin Cover wrote (to the meta2 list):
>
>It's my impression that the HTML DTD (Wilbur and Cougar 3.2) intends that
>language-specific processing via the attribute LANG="" will honor the
>containment hierarchy of the document, as though the declaration
>were something like:
>
><!ATTLIST (%contentElements;) lang NAME #INHERITED >
>
>although, of course, SGML has no keyword "#INHERITED" that has the
>effect of telling the parser/application to apply the attribute
>value to subelements.  TEI has this keyword, and some other apps
>do as well.  It would simply mean that any element which has a
>language attribute "inherits" the value from the parent if it's
>not set on that element type in the instance.  Works great in
>principle.  (TEI's #INHERITED is actually %INHERITED, and resolves
>to #IMPLIED, since to have created a new keyword would have taken
>TEI out of the conformance boundaries within which it wanted to
>play.  It has to be an application convention as of now -- but feel
>free to write to your ISO rep and ask for this to get fixed!)
>
>It's amazing that SGML does not have this mechanism, given the
>rigorous hierarchical nature of element structure.
>
>If HTML (I18N) is intended to work in some way *other than this*,
>could someone please let me know, and provide a reference URL?

It is, if I am not mistaken, as you describe.  I haven't (yet) read Cougar, 
and am relying on RFC 2070.  <... LANG=xx> is inherited from outer level 
components and is overridden by a <... LANG=xx> on nested components.  
Additionally, <SPAN LANG=xx> may be used to associate a language with, say, 
a few words.  This is terminated by </SPAN>.
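
To make the inheritance rule concrete, here is a minimal sketch in Python.  It 
is only an illustration, not anything taken from RFC 2070: the function name 
effective_langs and the sample markup are my own assumptions, and it uses 
Python's XML parser, so the sample has to be well-formed with quoted attribute 
values, which real HTML need not be.

```python
import xml.etree.ElementTree as ET


def effective_langs(elem, inherited=None, out=None):
    """Collect (tag, leading text, language) for elem and its descendants.

    An element's own LANG attribute wins; otherwise it takes the value
    that was in effect on its parent.
    """
    if out is None:
        out = []
    # Own LANG overrides the inherited value; a missing LANG inherits it.
    lang = elem.get("LANG", inherited)
    out.append((elem.tag, (elem.text or "").strip(), lang))
    for child in elem:
        effective_langs(child, lang, out)
    return out


# Hypothetical sample document, invented for this illustration.
SAMPLE = (
    '<BODY LANG="en">'
    '<P>Hello <SPAN LANG="fr">bonjour</SPAN> again</P>'
    '<P LANG="de">Guten Tag</P>'
    '</BODY>'
)

resolved = effective_langs(ET.fromstring(SAMPLE))
for tag, text, lang in resolved:
    print(tag, lang, repr(text))
```

The first P carries no LANG of its own and so resolves to "en" from BODY, 
while the SPAN and the second P override it with "fr" and "de".  Once the SPAN 
is closed, the text that follows it belongs to the P again and is back to "en".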

>Thanks.
>
>Thanks, too, Misha, for keeping the I18N concerns in front of the
>METADATA community.

It's a pleasure :-)

>There's a strong move toward Unicode, but as
>we all know, Unicode does not of itself get the job done.  The
>LANG="" attribute of HTML 3.2 helps a lot.

Indeed, as you imply, Unicode encodes characters, not languages.  The use of 
Unicode does not remove the need for language tagging, any more than does the 
use of ISO 8859-1.

>Robin
>-------------------------------------------------------------------------
>Robin Cover                    Email: robin@acadcomp.sil.org
>6634 Sarah Drive           
>Dallas, TX  75236  USA            >>> The SGML Web Page <<<
>Tel: +1 (972) 296-1783 (h)     http://www.sil.org/sgml/sgml.html
>Tel: +1 (972) 708-7346 (w)
>FAX: +1 (972) 708-7380
>=========================================================================

Received on Sunday, 23 February 1997 07:32:41 UTC