[LT-Web] Test Suite output

Forwarded on behalf of Mārcis (there is a subscription issue for the 
tests list) and putting Mārcis and his colleagues in CC. Implementers, 
please have a look at Mārcis' issue.

Thanks,

Felix

-------- Original Message --------
Subject:  [Moderator Action] [LT-Web] Test Suite output
Date:  Thu, 17 Jan 2013 14:47:39 +0000
From:  Mārcis Pinnis <marcis.pinnis@Tilde.lv>
To:  Multilingual Web LT-TESTS Public LT-TESTS Public 
<public-multilingualweb-lt-tests@w3.org>
CC:  Felix Sasaki <fsasaki@w3.org>, Pēteris Ņikiforovs 
<peteris.nikiforovs@Tilde.lv>, Andis Lagzdiņš <andis.lagzdins@Tilde.lv>



Hi everyone,

We at Tilde are working through the Test Suite, and for Language 
Information we find that the expected results do not match what is stated 
in http://www.w3.org/TR/its20/#language-information and 
http://www.w3.org/TR/its20/#datacategories-defaults-etc.

The first question:

The input in HTML example 1 is as follows:

<!DOCTYPE html>
<html lang="en">
   <head>
    <meta charset=utf-8>
    <link href="languageinfo1htmlrules.xml" rel="its-rules"/>
    <title>EXAMPLE</title>
   </head>
   <body>
    <p>The motto of Québec is:
     <q>Je me souviens</q>
   .</p>
    <p>La devise du Québec est :
     <q lang="fr-CA">Je me souviens</q>
   .</p>
   </body>
</html>

The accompanying rules file defines the following:

<its:rules xmlns:its="http://www.w3.org/2005/11/its" 
xmlns:h="http://www.w3.org/1999/xhtml" version="2.0">
<its:langRule selector="/h:*" langPointer="@lang"/>
<its:langRule selector="//h:*" langPointer="@lang"/>
</its:rules>
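
As a rough illustration of what these two rules select when read literally, the following Python sketch evaluates the `@lang` pointer on every element without applying any inheritance. It assumes a simplified, well-formed XHTML version of the example (the `DOC` string and the variable names are made up for this sketch), and it is not meant to describe how any actual ITS 2.0 processor or the Test Suite tooling works.

```python
import xml.etree.ElementTree as ET

XHTML = "http://www.w3.org/1999/xhtml"

# Assumption: an XHTML-ised copy of the example, so that the standard-library
# XML parser can read it and the h: namespace of the rules applies.
DOC = f"""<html xmlns="{XHTML}" lang="en">
  <head><title>EXAMPLE</title></head>
  <body>
    <p>The motto of Québec is: <q>Je me souviens</q>.</p>
    <p>La devise du Québec est : <q lang="fr-CA">Je me souviens</q>.</p>
  </body>
</html>"""

root = ET.fromstring(DOC)

# selector="//h:*": every element in the XHTML namespace, in document order.
# The root element is included, so this also covers selector="/h:*".
selected = [e for e in root.iter() if e.tag.startswith("{" + XHTML + "}")]

# langPointer="@lang": read the element's own lang attribute, if it has one.
pointed = [(e.tag.split("}")[1], e.get("lang")) for e in selected]

for name, lang in pointed:
    print(name, lang)
```

Under this literal reading, only the `<html>` element and the second `<q>` element receive a value, because the pointer finds nothing on elements that carry no `lang` attribute of their own.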

The expected output is:

/html lang="en"
/html/@lang lang="en"
/html/head[1]
/html/head[1]/meta[1]
/html/head[1]/meta[1]/@charset
/html/head[1]/link[1]
/html/head[1]/link[1]/@href
/html/head[1]/link[1]/@rel
/html/head[1]/title[1]
/html/body[1]
/html/body[1]/p[1]
/html/body[1]/p[1]/q[1]
/html/body[1]/p[2]
/html/body[1]/p[2]/q[1] lang="fr-CA"
/html/body[1]/p[2]/q[1]/@lang lang="fr-CA"

However, our parser produces the following (and we tend to believe that 
this is correct!):

/html lang="en"
/html/@lang lang="en"
/html/head[1] lang="en"
/html/head[1]/meta[1] lang="en"
/html/head[1]/meta[1]/@charset lang="en"
/html/head[1]/link[1] lang="en"
/html/head[1]/link[1]/@href lang="en"
/html/head[1]/link[1]/@rel lang="en"
/html/head[1]/title[1] lang="en"
/html/body[1] lang="en"
/html/body[1]/p[1] lang="en"
/html/body[1]/p[1]/q[1] lang="en"
/html/body[1]/p[2] lang="en"
/html/body[1]/p[2]/q[1] lang="fr-CA"
/html/body[1]/p[2]/q[1]/@lang lang="fr-CA"

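I marked the differences in red: every node from /html/head[1] through 
/html/body[1]/p[2], which our output annotates with lang="en" and the 
expected output leaves without a language value.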

The language information data category specifies the following 
inheritance rules:

"Textual content of element, /including/ attributes and child elements"

This (as I understand it) means that everything within <html> is in 
English except the second <q> element.

This issue appears in all three HTML language information examples in the 
Test Suite.
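
The inheritance reading described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the document is a simplified, well-formed XML copy of the example (the real input is HTML5), and the function name `inherited_lang` and the variable names are made up for this sketch rather than taken from any ITS 2.0 implementation.

```python
import xml.etree.ElementTree as ET

# Assumption: a well-formed copy of HTML example 1 (attribute values quoted,
# empty elements closed) so that the standard-library XML parser accepts it.
DOC = """<html lang="en">
  <head>
    <meta charset="utf-8"/>
    <link href="languageinfo1htmlrules.xml" rel="its-rules"/>
    <title>EXAMPLE</title>
  </head>
  <body>
    <p>The motto of Québec is: <q>Je me souviens</q>.</p>
    <p>La devise du Québec est : <q lang="fr-CA">Je me souviens</q>.</p>
  </body>
</html>"""


def inherited_lang(elem, path, parent_lang=None, out=None):
    """Collect (path, lang) pairs, passing the language down the tree."""
    if out is None:
        out = []
    # An element's own lang attribute wins; otherwise the parent's value applies.
    lang = elem.get("lang", parent_lang)
    out.append((path, lang))
    # Attributes take the language of the element they belong to.
    for name in elem.attrib:
        out.append((f"{path}/@{name}", lang))
    # Number children per element name, as in the Test Suite output format.
    counts = {}
    for child in elem:
        counts[child.tag] = counts.get(child.tag, 0) + 1
        child_path = f"{path}/{child.tag}[{counts[child.tag]}]"
        inherited_lang(child, child_path, lang, out)
    return out


result = inherited_lang(ET.fromstring(DOC), "/html")
for path, lang in result:
    print(path, f'lang="{lang}"')
```

With inheritance applied this way, every node carries a language value, and only the second `<q>` element and its `lang` attribute differ from "en".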

A second question:

I find the definition and parsing of the other two language information 
examples confusing.

I understand that language information is added at a global level. 
However, the ITS 2.0 reference says:

Locally users are able to use |xml:lang| (which is defined by XML), or 
|lang| in HTML, or an attribute specific to the format in question (as in 
Example 51 <http://www.w3.org/TR/its20/#EX-lang-definition-1>).

After reading this sentence, my understanding is that the "lang" 
attribute is as important for language information parsing as the 
global rules (regardless of where they point).

That being said, is the expected output correct in the second and third 
examples (the first lang="en" fragment was ignored in the expected output)?

Best regards,

Mārcis ;o)

Received on Thursday, 17 January 2013 18:49:06 UTC