RE: Feedback on Authoring Techniques for XHTML & HTML Internationalization

Hello Manuel,

Thank you for taking the time to provide this feedback.  Here are my initial reactions, which I have not yet discussed with the group:

Preferred proposal:

[1] I wonder if perhaps your assumptions, maybe based on your previous work on 'primary language', have interfered a little with our explanations.  We try to draw attention to the idea that there are two distinct things going on when declaring language:
a. standing back and describing the document as a whole - this is document metadata, and is perhaps more like describing the intended users of the document (for this we appropriated the term 'primary language', and defined it to be such - whereas previous usage did not have a clear definition, and was not necessarily used for this purpose alone.  Please treat this as a new definition of the term 'primary language', and not an extension of previous usage.)
b. describing the language of a specific *range* of text, for which there needs to be a declaration at the very outset of the document to handle the text that appears there (and note that in this context, multiple language values are completely inappropriate - a voice browser or spell checker needs to know exactly which language it is currently dealing with).  For this type of declaration we invented the term 'text processing language'.  Note also that using language attributes on the html element is entirely consistent with the way one labels a part of a document where the language changes.

Proposing a single declaration point mixes these concepts in a way that will perpetuate the very confusion we were trying to get around in the first place.
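
To make the distinction in [1] concrete, here is a minimal Python sketch that reads the two declarations separately from a page.  The sample markup, the class name LanguageDeclarations and the function read_declarations are all hypothetical, invented for illustration; the sketch only assumes that the meta element carries the document-level 'primary language' metadata and that the lang attribute on the html element carries the 'text processing language'.  It is not a description of how any user agent actually behaves.

```python
from html.parser import HTMLParser

# Hypothetical sample page: the meta element describes the document as a whole
# (and may list several languages); the lang attribute on html labels the text.
SAMPLE = """<html lang="fr">
<head>
<meta http-equiv="Content-Language" content="fr, en">
<title>Bienvenue</title>
</head>
<body><p>Bonjour. <span lang="en">Hello.</span></p></body>
</html>"""


class LanguageDeclarations(HTMLParser):
    """Collects the two kinds of declaration into separate fields."""

    def __init__(self):
        super().__init__()
        self.primary_languages = []           # document metadata, may hold several values
        self.text_processing_language = None  # exactly one value for the text itself

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "html":
            self.text_processing_language = attrs.get("lang")
        elif tag == "meta" and (attrs.get("http-equiv") or "").lower() == "content-language":
            content = attrs.get("content") or ""
            self.primary_languages = [v.strip() for v in content.split(",") if v.strip()]


def read_declarations(markup):
    """Return (primary_languages, text_processing_language) for a page."""
    parser = LanguageDeclarations()
    parser.feed(markup)
    parser.close()
    return parser.primary_languages, parser.text_processing_language


primary, text_language = read_declarations(SAMPLE)
print(primary, text_language)
```

Keeping the two results in separate fields mirrors the point above: the metadata may reasonably list several languages, whereas a spell checker or voice browser needs a single value for the text it is currently processing.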


[2] "The text in the title must be language neutral."  I'm not sure why, if there's only a single language.


[3] "meta element with the attribute http-equiv is proposed because it is the only mechanism".  Although one could say that, theoretically, declaring in the meta element is equivalent to declaring in the HTTP header Content-Language, that is not the case in practice.  I find this statement, coupled with the one that follows, that "servers should include the primary language(s) in the Content-Language field", confusing.  Those are two separate mechanisms.  The meta element is not created automatically.

Note also that in practice none of the user agents we tested actually used the information in the meta element to establish language - all of them used the declaration in the html element, though.  A rule like this requires all user agents to change their behaviour if it is to be successful.


[4] Why should text processors consider the primary language the default text processing language? If it becomes undefined when several are declared, this seems a poor strategy.


[5] Your example of multiple language text marked up in <title> cannot be done currently because HTML will not allow markup in that element.  I do not see that happening until we get to XHTML 2.0.  So this is not workable for existing HTML/XHTML documents.  That's a really big problem. (Note, by the way, that the candidate for 'foo' is 'span'. That's standard practice.)


Secondary proposal:

[6] Again, this seems to operate on the premise that there should be only one language declaration. I do not see any justifications for this in your proposal.


[6] "It is not proposed to use the xml:lang attribute."  There are good reasons for using both lang and xml:lang in hybrid XHTML 1.0 documents - so that they can be read by user agents as HTML, but processed as XML.  I do not want to debate the merits and demerits of serving XHTML as text/html, but it is widely done, and so I do not see dropping xml:lang as a practical requirement.  The question is irrelevant for HTML and for XHTML 1.1+ and XML.
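
As an illustration of what using both attributes involves when a hybrid document is processed as XML, here is a small Python sketch that flags elements whose lang and xml:lang values disagree.  The sample document and the function name mismatched_language_attributes are hypothetical, made up for this message; the only thing taken as given is that xml:lang lives in the standard XML namespace.

```python
import xml.etree.ElementTree as ET

# The xml: prefix is always bound to this namespace.
XML_NS = "http://www.w3.org/XML/1998/namespace"

# Hypothetical hybrid XHTML 1.0 document carrying both attributes.
SAMPLE_XHTML = """<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en">
<head><title>Example</title></head>
<body><p>Hello. <span lang="fr" xml:lang="fr">Bonjour.</span></p></body>
</html>"""


def mismatched_language_attributes(markup):
    """Return (tag, lang, xml:lang) for every element where the two differ."""
    root = ET.fromstring(markup)
    mismatches = []
    for element in root.iter():
        html_lang = element.get("lang")
        xml_lang = element.get("{%s}lang" % XML_NS)
        if html_lang != xml_lang:
            mismatches.append((element.tag, html_lang, xml_lang))
    return mismatches


print(mismatched_language_attributes(SAMPLE_XHTML))
```

An element that carries only one of the two attributes is reported as a mismatch, which is the case that matters here: an XML processor would not see a language that was declared with lang alone.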


Proposal for XML

[7] Note that your proposal for multiple values for the xml:lang attribute is currently not supported by XML, and is unlikely to be supported in the near future.  It is therefore ruled out for a large amount of existing data.  (It's not clear from your proposal whether you are proposing usage or changes to the XML standard with this document.  If the latter, I don't see any convincing arguments for such a change in your document.)

Hope that helps,
RI

============
Richard Ishida
W3C

contact info:
http://www.w3.org/People/Ishida/ 

W3C Internationalization:
http://www.w3.org/International/ 

Publication blog:
http://people.w3.org/rishida/blog/
 
 

> -----Original Message-----
> From: mtcarrascob@yahoo.com 
> [mailto:mtcarrascob@yahoo.com
> Sent: 28 October 2004 15:32
> To: ishida@w3.org; duerst@w3.org
> Subject: Feedback on Authoring Techniques for XHTML & HTML 
> Internationalization
> 
> Richard, Martin,
> 
> The feedback is in
> http://europa.eu.int/comm/translation/engineering/primary_language_en.pdf
> 
> I posted it to the list www-i18n-comments from a yahoo 
> account (trying to minimize spam). Please send it to the list 
> in case of failure. Please do not include this email address 
> in the posting; anyhow, it is in the document.
> 
> Regards
> Tomas
> 

Received on Thursday, 28 October 2004 16:21:53 UTC