Re: Testing RFC 4646 values in markup languages

Hi Karl,

Karl Dubost wrote:
>
> On 17 March 2008 at 17:52, Felix Sasaki wrote:
>>> In XML, it is a bit tricky, it seems. By XML spec
>>
>> why is it tricky?
>
> mwarf. sent it too quick.
>
> In Extensible Markup Language (XML) 1.0 (Fourth Edition), I read about 
> [language identification][2],
>
>     "A special attribute named xml:lang may be
>     inserted in documents to specify the language
>     used in the contents and attribute values of
>     any element in an XML document. In valid
>     documents, this attribute, like any other,
>     MUST be declared if it is used."
>
> Initially I thought that any vocabulary could have xml:lang anywhere 
> but the rule seems to be a bit more refined.
>
> 1. It can be on any elements.
> 2. For documents meant to be validated, it MUST be declared (schemas, 
> dtd?)

"MUST be declared" depends on the schema language; see
http://www.w3.org/International/questions/qa-when-xmllang :
      XML DTDs require that any element that uses xml:lang as an
      attribute must declare it in the DTD
      XML Schema requires that the xml namespace be declared and
      imported before using xml:lang (and other xml namespace values)
      RELAX NG predeclares the xml namespace, as in XML, so no
      additional declaration is needed.
Since HTML 5 refers to XML itself, not XML Schema / RELAX NG, I would
say only the XML DTD case applies here.

>
> I see later on in Attribute-list declaration:
>
>     "At user option, an XML processor MAY issue
>     a warning if attributes are declared for an
>     element type not itself declared, but this
>     is not an error. The Name in the AttDef rule
>     is the name of the attribute."
>
> In the HTML 5 spec, xml:lang is [defined][4] in 3.4.3, "The lang (HTML
> only) and xml:lang (XML only) attributes":
>
>     "The xml:lang attribute is defined in XML. [XML]"
>
> It is part of the global attributes definition
>
>     "The following attributes are common to and may
>     be specified on all HTML elements (even those
>     not defined in this specification):"
>
>
> which does not follow what HTML 4.01 did before:
>
>     lang     on All elements but APPLET, BASE,
>     BASEFONT, BR, FRAME, FRAMESET, IFRAME, PARAM, SCRIPT

HTML 4.01 is talking about the lang attribute, not xml:lang. So I
think the above statement "The following attributes are common ..." does
not contradict HTML 4.01. However, it may make sense to align the
behavior of lang and xml:lang: that is, either to change the requirements
for lang and allow it on any element, or to change the requirements for
xml:lang so that it may not appear on APPLET, BASE etc. That looks like
a question for the HTML and i18n core WGs; once it is decided, it
becomes a question for the conformance checker. What do you think?


>
>
> html5 served as application/xhtml+xml
>
> <html xmlns="http://www.w3.org/1999/xhtml">
>  <head> <title>example of xhtml</title> </head>
>  <body>
>     <p>foo <br xml:lang="en"/> bar</p>
>  </body>
> </html>
>
>
> What should a conformance checker do here?

It should implement the currently different requirements for lang and
xml:lang and not complain, or, if the HTML WG / i18n folks agree on
changing the xml:lang behavior as described above, it should complain.
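
To make the two options concrete, here is a rough sketch of both checker
behaviors in Python. It is only an illustration: the function name
check_xml_lang, the strict flag and the message text are my own
assumptions, not anything taken from the HTML 5 draft or from an existing
conformance checker. The excluded element list is the HTML 4.01 one quoted
above.

```python
import xml.etree.ElementTree as ET

# Expanded names as ElementTree reports them.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"
XHTML_NS = "{http://www.w3.org/1999/xhtml}"

# Elements on which HTML 4.01 does not allow the lang attribute.
HTML401_NO_LANG = {"applet", "base", "basefont", "br", "frame",
                   "frameset", "iframe", "param", "script"}


def check_xml_lang(source, strict=False):
    """Return a list of complaints about xml:lang placement.

    strict=False: current rule, xml:lang may appear on any element.
    strict=True:  hypothetical aligned rule, xml:lang is rejected on
                  the elements where HTML 4.01 rejects lang.
    """
    root = ET.fromstring(source)  # raises ParseError if not well-formed
    if not strict:
        return []
    complaints = []
    for elem in root.iter():
        if not elem.tag.startswith(XHTML_NS):
            continue  # only look at XHTML elements
        local = elem.tag[len(XHTML_NS):]
        if local in HTML401_NO_LANG and XML_LANG in elem.attrib:
            complaints.append("xml:lang not allowed on <%s>" % local)
    return complaints


# Karl's example document from above.
SAMPLE = """<html xmlns="http://www.w3.org/1999/xhtml">
 <head> <title>example of xhtml</title> </head>
 <body>
    <p>foo <br xml:lang="en"/> bar</p>
 </body>
</html>"""

print(check_xml_lang(SAMPLE))               # current rule: no complaint
print(check_xml_lang(SAMPLE, strict=True))  # aligned rule: complains about <br>
```

With the current requirements the example document passes; under the
hypothetical aligned rule the <br xml:lang="en"/> would be reported.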

Felix

Received on Tuesday, 18 March 2008 07:48:35 UTC