- From: Misha Wolf <misha.wolf@reuters.com>
- Date: Sun, 23 Feb 1997 12:33:56 +0000 (GMT)
- To: meta2 <meta2@mrrl.lut.ac.uk>, www-international <www-international@w3.org>, Unicode <unicode@unicode.org>
Robin Cover wrote (to the meta2 list):
>
>It's my impression that the HTML DTD (Wilbur and Cougar 3.2) intends that
>language-specific processing via the attribute LANG="" will honor the
>containment hierarchy of the document, as though the declaration
>were something like:
>
><ENTITY (%contentElements;) lang NAME #INHERITED >
>
>although, of course, SGML has no keyword "#INHERITED" that has the
>effect of telling the parser/application to apply the attribute
>value to subelements. TEI has this keyword, and some other apps
>do as well. It would simply mean that any element which has a
>language attribute "inherits" the value from the parent if it's
>not set on that element type in the instance. Works great in
>principle. (TEI's #INHERITED is actually %INHERITED, and resolves
>to #IMPLIED, since to have created a new keyword would have taken
>TEI out of the conformance boundaries within which it wanted to
>play. It has to be an application convention as of now -- but feel
>free to write to your ISO rep and ask for this to get fixed!)
>
>It's amazing that SGML does not have this mechanism, given the
>rigorous hierarchical nature of element structure.
>
>If HTML (I18N) is intended to work in some way *other than this*,
>could someone please let me know, and provide a reference URL?

It is, if I am not mistaken, as you describe. I haven't (yet) read
Cougar, and am relying on RFC 2070. <... LANG=xx> is inherited from
outer level components and is overridden by a <... LANG=xx> on nested
components. Additionally, <SPAN LANG=xx> may be used to associate a
language with, say, a few words. This is terminated by </SPAN>.

>Thanks.
>
>Thanks, too, Misha, for keeping the I18N concerns in front of the
>METADATA community.

It's a pleasure :-)

>There's a strong move toward Unicode, but as
>we all know, Unicode does not of itself get the job done. The
>LANG="" attribute of HTML 3.2 helps a lot.

Indeed, as you imply, Unicode encodes characters, not languages. The
use of Unicode does not remove the need for language tagging, any more
than does the use of ISO 8859-1.

>Robin
>-------------------------------------------------------------------------
>Robin Cover Email: robin@acadcomp.sil.org
>6634 Sarah Drive
>Dallas, TX 75236 USA >>> The SGML Web Page <<<
>Tel: +1 (972) 296-1783 (h) http://www.sil.org/sgml/sgml.html
>Tel: +1 (972) 708-7346 (w)
>FAX: +1 (972) 708-7380
>=========================================================================
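
To make the inheritance rule above concrete (LANG is taken from the
nearest enclosing element that sets it, and a nested LANG overrides the
outer one), here is a minimal sketch in Python. It is only an
illustration of the application convention, not anything defined by
RFC 2070 or the HTML DTD. The names LangResolver, resolve_langs and
VOID_TAGS are made up for this example, and the list of empty elements
is deliberately incomplete.

```python
from html.parser import HTMLParser

# Hypothetical, incomplete list of empty elements that never get an end tag.
VOID_TAGS = {"br", "hr", "img", "meta", "link", "input"}


class LangResolver(HTMLParser):
    """Records each text run with the language it inherits (illustrative only)."""

    def __init__(self):
        super().__init__()
        self.stack = []  # open elements as (tag, effective_lang)
        self.runs = []   # collected (text, effective_lang) pairs

    def current_lang(self):
        # Language in force: the innermost open element's effective value.
        return self.stack[-1][1] if self.stack else None

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        # An explicit LANG overrides; otherwise inherit from the parent.
        lang = dict(attrs).get("lang") or self.current_lang()
        self.stack.append((tag, lang))

    def handle_endtag(self, tag):
        # Pop back to the matching open element, if there is one.
        for i in range(len(self.stack) - 1, -1, -1):
            if self.stack[i][0] == tag:
                del self.stack[i:]
                break

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.runs.append((text, self.current_lang()))


def resolve_langs(html):
    parser = LangResolver()
    parser.feed(html)
    parser.close()
    return parser.runs


if __name__ == "__main__":
    sample = (
        "<BODY LANG=en><P>Hello <SPAN LANG=fr>bonjour <EM>le monde</EM>"
        "</SPAN> again</P><P LANG=de>Guten Tag</P></BODY>"
    )
    for text, lang in resolve_langs(sample):
        print(lang, text)
```

Note that the EM inside the SPAN carries no LANG of its own and so
picks up "fr" from the SPAN, while the text after </SPAN> falls back to
the "en" set on BODY.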
Received on Sunday, 23 February 1997 07:32:41 UTC