- From: Karl Dubost <karl@w3.org>
- Date: Fri, 24 Oct 2003 17:57:13 -0600
- To: www-i18n-comments@w3.org
Hi,

Here are a few comments on your first WD.

ATeXHI 1.0 or Babel Scribe 1.0
Authoring Techniques for XHTML & HTML Internationalization 1.0

First of all, thank you very much for this work; it was much needed. I hope you will have success and good reviews for each of your versions.

* QA Spec Guidelines - http://www.w3.org/TR/qaframe-spec

The QA Spec Guidelines are entering the CR phase, which is an implementation phase for the QA WG. It seems that it will be a wonderful opportunity for both WGs, GEO and QA, to implement these guidelines, and for the QA WG to help and create tools where needed.

The following review is not a review against the QA Spec Guidelines.

I discussed the draft with Richard Ishida on IRC, and he told me that some of the verbiage repeats the same principles throughout the document. The document is meant to be read by outline. I would encourage the editors to write one atomic statement for each feature and, rather than repeating the same verbiage, to point to these atomic statements from the different outlines. It would be like having modules that address specific problems, and profiles that collect a set of modules or features applied to specific problems or readers. This would also be easier for the editors to maintain, and less confusing for the reader in certain circumstances.

* Abstract

You limit your scope to XHTML 1.0/HTML 4.01. XHTML 1.1 is already a specification and includes Ruby, which is an interesting technology for the Web and I18N. XHTML 2.0 is in development; this may be an opportunity to get more I18N input into XHTML and, where XHTML 2.0 does not address certain I18N issues, to cover them in this document.

* Status

"""These are techniques that need to be addressed from the start of content development if unnecessary costs and resource issues are to be avoided later on."""

It's never too late to improve a Web site or a document.
It might be beneficial to point out that even if a site does not respect simple I18N principles, its overall quality can still be improved step by step. See http://www.w3.org/QA/2003/03/web-kit where we mentioned I18N.

* 1.3 Standards addressed

"""Note that XHTML source can be served as XML (using MIME types application/xhtml+xml, application/xml or text/xml) or HTML (using the MIME type text/html)."""

It might happen in the future that text/xml is deprecated. There is a lot of discussion around that; it's at risk.

* 1.4 User agents addressed

Netscape 7 is a frozen/dead product and will not be developed anymore. I would encourage you to focus on Mozilla rather than Netscape. If you want your document to stay fresh and evolve with the tools, you may want to move the compatibility charts outside of your main document.

* 2.1 Internationalizing the page header

The recommendation is good, but your example is not very good. If you serve your document as text/html, you do not need the XML declaration

    <?xml version="1.0" encoding="UTF-8"?>

And if you serve it as application/xhtml+xml, there is no need for the XML declaration if your document is UTF-8 or UTF-16. It is even not recommended, because IE 6 on Windows has a problem with the XML declaration and switches to quirks mode when it is present.

It's good to encourage UTF-8, and there is an incentive to do so: if you use UTF-8, you don't need the XML declaration, and therefore IE 6 will be friendly with you.

"""In case of conflict, the Content-Type charset declaration and the XML declaration have precedence over the meta charset statement, according to the HTML 4.01 and XHTML 1.0 specifications. [Ed. note: Is this true in practise? esp wrt IE?]"""

See CUAP - http://www.w3.org/TR/cuap. The precedence order is given there.
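
To make the precedence in the quoted passage concrete, here is a minimal Python sketch. It follows only the order stated in that quote (HTTP Content-Type charset, then XML declaration, then meta charset); it is not taken from CUAP, and real user agents do more (byte order marks, limits on how far they look ahead). The function name pick_encoding, the regular expressions and the UTF-8 fallback are assumptions made for this illustration.

```python
import re


def pick_encoding(content_type_header, body_start, default="utf-8"):
    """Return (encoding, source) using the order from the quoted passage.

    content_type_header: value of the HTTP Content-Type header, or None.
    body_start: the first bytes of the document.
    The default value is an assumption of this sketch, not a rule.
    """
    # 1. charset parameter of the HTTP Content-Type header
    if content_type_header:
        m = re.search(r'charset\s*=\s*"?([^\s;"]+)', content_type_header, re.I)
        if m:
            return m.group(1).lower(), "http"

    # 2. encoding pseudo-attribute of an XML declaration at the very start
    m = re.match(
        rb'<\?xml[^>]*\bencoding\s*=\s*["\']([A-Za-z0-9._-]+)["\']',
        body_start,
    )
    if m:
        return m.group(1).decode("ascii").lower(), "xml"

    # 3. charset given in a meta element
    m = re.search(
        rb'<meta[^>]+charset\s*=\s*["\']?([A-Za-z0-9._-]+)',
        body_start,
        re.I,
    )
    if m:
        return m.group(1).decode("ascii").lower(), "meta"

    # Nothing declared: fall back to the assumed default
    return default, "default"
```

For instance, calling pick_encoding with the header "text/html; charset=ISO-8859-1" and a body that starts with a UTF-8 XML declaration reports the HTTP value, because the header is consulted first.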
"""Use meta charset declarations as early as possible in the head element.""" When the browser does not in the http headers the encoding, it will be necessary to parse the begining of the document to get the encoding information. As such, it's indeed preferable to have it at the start so the user agent will be able to display with the correct encoding. Though it might be useful to test or ask to vendors when do they stop parsing the header to find this information. """For HTML use the lang attribute, and for XHTML use the lang and xml:lang attributes in the html tag. """ There's an incentive to use XHTML over HTML by the fact of being able to smoothen your transition to XHTML 1.1 or XHTML 2.0. In XHTML 1.0, you can use only xml:lang if you wish and you will have no problems to switch to XHTML 1.1 or XHTML 2.0 where xml:lang is the only possible attribute. One of the reason of using xml:lang or lang attributes in a document, is the behaviour of CSS rules. For example, in IE5 Macintosh if you put a "q" element for citation, the quotes will be different depending on the wrapping language. « blabla » in french, “ blabla ” in english, etc. You have also rules of selection in CSS 2 depending on the language too. Another good point to make, if the document is read by a translating agent (automatic translation), it will not have to guess the main language of the language by an heuristic, therefore performance improvement for processing it. The meta statement must be compatible with the html element, though it's not mandatory. I guess the html element should have precedence on the meta element. * 2.2 International Layout considerations right, left and before, after An interesting issue which appeared when we designed a QA stylesheet for right/left direction languages. We have small red arrows in the menu and for languages left to right the arrow points to the right. 
Luckily enough, the arrow was specified with a CSS :before rule, in the stylesheet, and not in the HTML with an img element, so we were able to create another stylesheet for right-to-left languages. It was less painful than having thousands of pages to modify. Still, it is interesting to realize that a simple arrow may have internationalization problems.

* 3.1 Choose a page encoding

Choose UTF-8 or another Unicode encoding for all content.
- Give the list of Unicode encodings, or a reference to such a list.

"""* Unicode (UTF-8) forms will be easier to migrate to XForms."""

For the reasons I gave before, you can add:
* Unicode (UTF-8) forms will be easier to migrate to XHTML 1.1/XHTML 2.0

"""If you don't use a Unicode encoding, select an encoding that best supports the languages / characters to be included in the page text."""

This is not testable per se. You might recommend: Use an encoding that supports the languages/characters included in the page text.

"""Check that user agents (all agents that must render the page) adequately support the page encoding that you have selected. If not, you might need to use a more widely supported encoding to achieve an adequate degree of user agent support."""

In a sense, this contradicts a principle of accessibility and of the Web, which says that whatever your user agent, you should be able to access the content. That said, the principle alone doesn't solve the problem. I would not encourage people to do browser sniffing either, because it challenges

It's the same for the next technique.

"""Use character sets and encodings that will be accessible and common to your users."""

When you recommend such techniques, you have to moderate them by explaining the constraints/difficulties they might create for other users.

* 3.2 Specifying a page encoding

"""Where practical, declare the page's character encoding by setting the charset parameter in the HTTP Content-Type header."""

Not where practical; do that all the time.
Each time you have the opportunity to serve your document with the right encoding in the HTTP header, just do it. The benefit is that the user agent does not have to guess, or parse the beginning of the HTML document, to know how to display it. It's not incompatible with also specifying the encoding inside the document, for the reasons you gave (saving locally, etc.).

You may give an example for httpd.conf and/or .htaccess for Apache, and an example for Jigsaw.

Apache httpd.conf and .htaccess

    AddCharset utf-8 .html

In httpd.conf you can also do things like

    # override the charset for one part of the site
    <Directory "/somewhere/europe">
        AddCharset iso-8859-1 .html
    </Directory>

Ask Yves Lafon about the method for Jigsaw.

"""For XHTML served as text/html, where practical use an XML declaration with an encoding attribute."""

No. When XHTML is served as text/html, the XML declaration becomes completely irrelevant and, as I said, causes problems for IE6. And you explain it just after.

The visual checking is not a good recommendation. :) even if it's done often.

* 4.1 Choosing & specifying fonts

"""Do not use <font> tags - use CSS styles instead."""

I see in the Ed. note """[Ed. note: Describe the evils of using <font> to cheat on the charset and represent other scripts.]""". It would be good to give techniques and examples showing how the Webmaster can switch from using font to using other techniques.

"""Always use the serif and sans-serif fallbacks"""

Add "in the font property in CSS".

* 5.3 Specifying the language of a link destination

"""Use the hreflang attribute on the a element."""

It is supported by CSS :)))) You should read my entry about it: http://www.la-grange.net/2002/09/03#hreflang (French)

CSS rule for it

    /* display of the language you linked to */
    a[hreflang]:after {
        content: " [" attr(hreflang) "] ";
    }

What are the benefits of that?

1) Strong usability benefits: a user browsing your Web site will know the language of the resource you are linking to.
Imagine you are in a document written in French and you link to a reference in English, but some of your readers do not know English at all. They will not have to follow the link only to discover afterwards that they can't read it. They save time and bandwidth.

2) It would be good if the I18N activity reviewed the CUAP note and added comments or new checkpoints to it. Why? Because you might encourage or recommend behaviours of user agents. For example, you might recommend that a user agent which is an automatic translator respect the "lang" and "xml:lang" attributes in a document, so that it doesn't translate things it should not (like trying to translate French to French sometimes... silly), and that it use the hreflang attribute intelligently. In a context where this attribute is present, the automatic translator will know beforehand the main language used at the link destination, and can offer to translate it adequately if the user follows the link.

Also, for indexing search engines like Google, the benefit is that they know the language before indexing the page, and so can index it more effectively.

* Do not add dir="rtl" to the body tag.

"""According to the Microsoft article Authoring HTML for Middle Eastern Content, the following behaviors can only be expected in Internet Explorer 5 if the dir attribute is on the html element, rather than the body element."""

Specify which version of IE: Mac or Windows?

This is it for a first review ;)

--
Karl Dubost - http://www.w3.org/People/karl/
W3C Conformance Manager
*** Be Strict To Be Cool ***
Received on Friday, 24 October 2003 19:57:09 UTC