test cases Re: xml:lang="" and lang=""

Hi,

While reading the Editor's Draft of 21 August 2008 on xml:lang="" and
lang="", I tried to work out a set of test cases.
http://dev.w3.org/html5/spec/

The most relevant part for these tests (in text/html) is:

     If both the xml:lang attribute and the lang attribute
     are set on an element, user agents must use the xml:lang
     attribute, and the lang attribute must be ignored for
     the purposes of determining the element's language.
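
As a rough sketch of how I read that rule (Python, my own naming; the
function and the "unknown" label are assumptions for illustration, not
spec text), for a single element and ignoring inheritance and any
transport-level information:

```python
from typing import Optional

UNKNOWN = "unknown"  # label used in this message, not a spec term


def element_language(xml_lang: Optional[str], lang: Optional[str]) -> str:
    """Language of a single element under the rule quoted above.

    None means the attribute is absent; "" means it is present but empty.
    Inheritance from ancestors and HTTP/meta information are not modelled.
    """
    # xml:lang wins whenever it is set, even when its value is empty.
    value = xml_lang if xml_lang is not None else lang
    # An absent or empty value is reported as unknown in this sketch.
    return value if value else UNKNOWN
```

For example, element_language("", "ja") gives "unknown" because the empty
xml:lang still takes precedence over lang.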

This is an improvement on the previous version, which gave different
results depending on the MIME type. For comparison, I have included the
results for the previous WD in brackets [10 June 2008 result].

Comment: Ian, there are three mentions of RFC 3066 in the Editor's Draft:
for the meta "Content Language State", hreflang and lang. Could you
update them to RFC 4646?

XML processing is defined by
http://www.w3.org/TR/2006/REC-xml-20060816/#sec-lang-tag
which refers to RFC 3066.

This would need to be tested in browsers.


Case 1 - <html xml:lang="" and lang="">
     1.a text/html
         -> language unknown [10 June 2008 unknown]
     1.b application/xhtml+xml
         -> language unknown (see note #2)

Case 2 - <html xml:lang="fr" and lang="ja">
     2.a text/html
         -> language fr  [10 June 2008 ja]
     2.b application/xhtml+xml
         -> language fr (see note #1)

Case 3 - <html xml:lang="fr" and lang="">
     3.a text/html
         -> language fr  [10 June 2008 unknown]
     3.b application/xhtml+xml
         -> language fr (see note #1)

Case 4 - <html xml:lang="" and lang="ja">
     4.a text/html
         -> language unknown [10 June 2008 ja]
     4.b application/xhtml+xml
         -> language unknown (see note #2)

Case 5 - <html xml:lang="fr" and lang="fr-CA">
     5.a text/html
         -> language fr [10 June 2008 fr-CA]
     5.b application/xhtml+xml
         -> language fr (see note #1)

Case 6 - <html xml:lang="fr,de" and lang="fr,de">
     6.a text/html
         -> language fr (see note #3) [10 June 2008 fr]
     6.b application/xhtml+xml
         -> undefined. I don't know what the
            implementations should do (see note #4)
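
To actually run these in browsers, the cases could be turned into test
pages. A minimal sketch in Python follows; the helper name and the page
skeleton are my own assumptions, and the expected values are the
text/html results listed above for the 21 August draft:

```python
from html import escape

# (case id, xml:lang value, lang value, expected text/html result)
CASES = [
    ("1", "", "", "unknown"),
    ("2", "fr", "ja", "fr"),
    ("3", "fr", "", "fr"),
    ("4", "", "ja", "unknown"),
    ("5", "fr", "fr-CA", "fr"),
    ("6", "fr,de", "fr,de", "fr"),
]


def build_test_document(case_id: str, xml_lang: str, lang: str) -> str:
    """Build a minimal page whose root element carries both attributes."""
    root = (
        f'<html xml:lang="{escape(xml_lang, quote=True)}" '
        f'lang="{escape(lang, quote=True)}">'
    )
    return (
        f"{root}\n"
        f"<head><title>Case {case_id}</title></head>\n"
        "<body><p>Test content</p></body>\n"
        "</html>\n"
    )


if __name__ == "__main__":
    for case_id, xml_lang, lang, expected in CASES:
        print(f"--- case {case_id}, expected: {expected}")
        print(build_test_document(case_id, xml_lang, lang))
```

The same pages served as application/xhtml+xml would also need the XHTML
namespace declaration on the root element, which this sketch leaves out.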


notes:
#1 The local value overrides transport protocol information.
Interesting: that seems to violate HTTP precedence principles. There
are interesting tests to run with the HTTP value declared in meta or
configured directly on the server.

#2 Unless the value is specified through an external transport
protocol (e.g. HTTP or MIME).

#3 collect a sequence of characters
    http://dev.w3.org/html5/spec/#collect (Editor's draft)
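
A small sketch of that kind of collection step, in Python. The function
names are mine, and the choice of stopping at a comma or whitespace is
an assumption made only to reproduce the "fr" result of case 6.a; check
the draft for the exact character set:

```python
from typing import Callable, Tuple


def collect_sequence(
    text: str, position: int, accept: Callable[[str], bool]
) -> Tuple[str, int]:
    """Collect consecutive characters accepted by `accept` from `position`.

    Returns the collected string and the position just past it.
    """
    start = position
    while position < len(text) and accept(text[position]):
        position += 1
    return text[start:position], position


def first_tag(value: str) -> str:
    # Assumption: collection stops at the first comma or whitespace.
    token, _ = collect_sequence(
        value, 0, lambda ch: ch != "," and not ch.isspace()
    )
    return token
```

Under that assumption, first_tag("fr,de") yields "fr".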

#4 The latest RFC for identifying languages is RFC 4646
    http://www.ietf.org/rfc/rfc4646.txt
    There are interesting requirements on
    * buffer sizes for the values of language tags
      and what should be done in case of errors.
    * language tag case -> tags are case-insensitive:
      fr-CA, fr-ca and fR-Ca are strictly equivalent
      (could be another test case)
    * Classes of Conformance for processors of the values:
      well-formed and validating.
      BUT unfortunately it doesn't say what a processor should do
      when the data is non-conformant: reject it
      with an unknown value? warn? stop processing? etc.
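
For the case-insensitivity test case, a comparison helper could look
like the following Python sketch (the helper name is hypothetical; it
only folds ASCII letters and does not check well-formedness or
validity):

```python
import string

# Map ASCII uppercase letters to lowercase; other characters are untouched.
_ASCII_LOWER = str.maketrans(string.ascii_uppercase, string.ascii_lowercase)


def same_language_tag(a: str, b: str) -> bool:
    """Compare two language tags, ignoring ASCII case only."""
    return a.translate(_ASCII_LOWER) == b.translate(_ASCII_LOWER)
```

With this, same_language_tag("fr-CA", "fR-Ca") is true, while "fr" and
"fr-CA" remain different tags.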



-- 
Karl Dubost - W3C
http://www.w3.org/QA/
Be Strict To Be Cool

Received on Thursday, 21 August 2008 03:27:28 UTC