[Bug 10152] [polyglot] i18n comment 5 : Mention lang and xml:lang from bugzilla@jessica.w3.org on 2011-01-21 (public-html-bugzilla@w3.org from January 2011)

From: <bugzilla@jessica.w3.org>
Date: Fri, 21 Jan 2011 10:52:01 +0000
To: public-html-bugzilla@w3.org
Message-Id: <E1PgEb3-0001CF-I9@jessica.w3.org>
http://www.w3.org/Bugs/Public/show_bug.cgi?id=10152

Leif Halvard Silli <xn--mlform-iua@xn--mlform-iua.no> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |REOPENED
         Resolution|FIXED                       |

--- Comment #4 from Leif Halvard Silli <xn--mlform-iua@xn--mlform-iua.no> 2011-01-21 10:52:00 UTC ---
(In reply to comment #2)

> No XML parser will pick up on the http-equiv, regardless of what
> browsers do, 

SUMMARY: If it can be argued that it is incorrect for 'application/xhtml+xml'
browsers/parsers to pick up the language from http-equiv, then I insist that we
require authors to use xml:lang/lang every time http-equiv actually has an
effect. (See below.) But if it works the same in both XHTML and HTML, then the
polyglot specification can remain as it is.

PROPOSAL: If it is correct that there is a definite difference between XML's
and HTML's handling of the HTTP-EQUIV content-language element, then the spec
should regulate the use of meta@content-language in polyglot documents.
Otherwise we would risk that HTML5 parsers would see language(s) that
'application/xhtml+xml' parsers would not see. We need to establish to what
extent you are correct.

The simplest thing would be to say that polyglot documents are REQUIRED to have
the lang/xml:lang attributes _on the root element_ whenever the document has an
http-equiv content-language element that actually affects the language of the
document - unless there is also an HTTP header that specifies the same language
as HTTP-EQUIV specifies.
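
To make the proposed requirement concrete, here is a rough sketch in Python.
It is only an illustration of the rule as worded above: the function and
parameter names are hypothetical, pragma_lang stands for the language that
http-equiv actually sets (None when it has no effect), and I assume that lang
and xml:lang on the root element must both be present and agree.

```python
from typing import Optional


def same_language(a: Optional[str], b: Optional[str]) -> bool:
    """Compare two language tags case-insensitively; None never matches."""
    return a is not None and b is not None and a.lower() == b.lower()


def satisfies_proposal(
    root_lang: Optional[str],         # lang attribute on the root element
    root_xml_lang: Optional[str],     # xml:lang attribute on the root element
    pragma_lang: Optional[str],       # language set by http-equiv, None if no effect
    http_header_lang: Optional[str],  # language from the HTTP header, if any
) -> bool:
    """Hypothetical check of the proposed polyglot rule (not spec text)."""
    # http-equiv has no effect on the document language: nothing is required.
    if pragma_lang is None:
        return True
    # An HTTP header states the same language as http-equiv: exempt.
    if same_language(pragma_lang, http_header_lang):
        return True
    # Otherwise both attributes must be on the root element (assumed to agree).
    return same_language(root_lang, root_xml_lang)
```

For example, satisfies_proposal(None, None, "en", None) is False, because
http-equiv sets a language that an XML parser might not see and the root
element does not declare one.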

(This is what HTML5 says: the http-equiv content-language element affects the
document language every time it contains a single language tag. Thus if it
contains zero or more than one language tag, then there is no document language
effect. The issue of what syntax is legal in HTML5 has not been decided
yet, see http://lists.w3.org/Archives/Public/public-html/2010Jun/0569.html .
However, the rules for how the language gets decided are not disputed.)
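
The single-language-tag rule described above can be sketched in Python as
follows. This is a simplification based on the description in this comment,
not the exact HTML5 algorithm: the function name is hypothetical, and I assume
that tags in the content value are separated by commas and/or whitespace.

```python
import re
from typing import Optional


def pragma_language(content: str) -> Optional[str]:
    """Return the language that http-equiv content-language would set, or None.

    Sketch only: exactly one language tag has an effect; zero tags or more
    than one tag have no effect on the document language.
    """
    # Assumption: tags are separated by commas and/or whitespace.
    tags = [tag for tag in re.split(r"[,\s]+", content) if tag]
    return tags[0] if len(tags) == 1 else None
```

With this sketch, a content value of "en" yields "en", whereas an empty value
or "en, nb" yields None, i.e. no effect on the document language.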

DISCUSSION:

(1) As Eliot cited in comment #1, what you say is not true. Because, when
a Web browser is parsing a document in "application/xhtml+xml" mode, then that
Web browser *is* an XML parser, or what?

(2) It is true that XML 1.0 does not mention the http-equiv content-language
element:
     But can one conclude from this that it is incorrect for an XHTML+XML
parser to pick up the language from the meta Content-Language tag? XML 1.0 says
that the XML parser may pick up the language also from HTTP and from MIME. I
therefore do not understand why it can't pick up the language from HTTP-EQUIV
as well. In particular, a compliant 'application/xhtml+xml' parser must be
expected to know what http-equiv means, no?
     I hope someone who is an expert on both XML and HTTP can answer.

(3) Perhaps you are comparing content-language with Content-Type and HTML5's
meta@charset? I agree that that is a useful comparison. However:
      The reason why META@charset and META@http-equiv content-type literally
don't make any sense in XML documents is that, at the point when the
META@charset element is being processed, the encoding has already been decided,
either via external protocol or by reading the <?xml ?> declaration. Whereas
for language, that's a per-element issue.
      Thus, the reason why meta@charset doesn't work in XML is, to my mind,
different from why Content-Language eventually shouldn't work.

Received on Friday, 21 January 2011 10:52:03 UTC