RE: New Working Group Note: Best Practices for XML Internationalization

From: Tex Texin <tex@yahoo-inc.com>
Date: Fri, 15 Feb 2008 11:50:03 -0800
Message-ID: <012AB2B223CB3F4BB846962876F47217D1FC94@SNV-EXVS08.ds.corp.yahoo.com>
To: "Richard Ishida" <ishida@w3.org>
Cc: "WWW International" <www-international@w3.org>

Hi Richard, Yves, Felix, Jony, Jeremy,

It's a bit of a chicken-and-egg problem: if the W3C recommends markup, then the editor vendors have little incentive to provide better control code management.

The issues you raise with managing the scope across the document perhaps reflect the inadequacy of markup design for RTL languages.
But most of the scope management issues are easily resolved with CSS anyway.

The representation of text should be standardized. I shouldn't have a string in an attribute that has to be written one way and a string in a table cell that is written another. I should be able to extract a string from a database and have it work equally well regardless of the context. I should be able to search for the string with a standardized or normalized representation. 
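As a rough illustration of the search point, here is a minimal Python sketch that compares strings after removing bidi control characters. The function names and the particular set of characters are assumptions made for this example, not something taken from the recommendation.

```python
# Illustrative set of Unicode bidi formatting characters (an assumption for
# this sketch; a real application may need a different or larger set).
BIDI_CONTROLS = {
    "\u200e",  # LEFT-TO-RIGHT MARK (LRM)
    "\u200f",  # RIGHT-TO-LEFT MARK (RLM)
    "\u202a",  # LEFT-TO-RIGHT EMBEDDING (LRE)
    "\u202b",  # RIGHT-TO-LEFT EMBEDDING (RLE)
    "\u202c",  # POP DIRECTIONAL FORMATTING (PDF)
    "\u202d",  # LEFT-TO-RIGHT OVERRIDE (LRO)
    "\u202e",  # RIGHT-TO-LEFT OVERRIDE (RLO)
}


def strip_bidi_controls(text: str) -> str:
    """Return text with the bidi control characters listed above removed."""
    return "".join(ch for ch in text if ch not in BIDI_CONTROLS)


def same_text(a: str, b: str) -> bool:
    """Compare two strings while ignoring bidi control characters."""
    return strip_bidi_controls(a) == strip_bidi_controls(b)


# The same Hebrew word, once wrapped in RLE ... PDF and once bare.
wrapped = "\u202b\u05e9\u05dc\u05d5\u05dd\u202c"
bare = "\u05e9\u05dc\u05d5\u05dd"
print(same_text(wrapped, bare))
```

With one representation for the string itself, a comparison like this works the same whether the string came from an attribute, a table cell, or a database field.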

For higher levels of page layout and flow control, markup is quite appropriate.
But at the level of simply representing a text string, the fact that we have to say "follow the recommendation, except in places where you can't, such as attributes" highlights that the recommendation is misguided.

Using control codes would provide a standardized representation (at the string level) that works in both markup and plain text, would simplify search, and would offer a WYSIWYG view for anyone using an RTL-capable editor.

It would also simplify implementation, since there would no longer be a need for code that swaps control codes for markup depending on the context.
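To make that concrete, here is a minimal Python sketch of the kind of context-dependent code I mean. The function names and the markup_allowed flag are hypothetical, and the span element with a dir attribute is only one possible form of bidi markup.

```python
from html import escape

# Unicode embedding controls used for the plain-text form.
LRE = "\u202a"  # LEFT-TO-RIGHT EMBEDDING
RLE = "\u202b"  # RIGHT-TO-LEFT EMBEDDING
PDF = "\u202c"  # POP DIRECTIONAL FORMATTING


def embed_with_controls(text: str, direction: str) -> str:
    """Wrap text in an embedding control and a closing PDF."""
    opener = RLE if direction == "rtl" else LRE
    return f"{opener}{text}{PDF}"


def embed_with_markup(text: str, direction: str) -> str:
    """Wrap text in a span carrying a dir attribute (illustrative markup)."""
    return f'<span dir="{direction}">{escape(text)}</span>'


def embed(text: str, direction: str, markup_allowed: bool) -> str:
    """Pick a representation based on where the string will be placed.

    markup_allowed is a hypothetical flag: True for element content,
    False for attribute values and plain-text-only elements.
    """
    if direction not in ("ltr", "rtl"):
        raise ValueError("direction must be 'ltr' or 'rtl'")
    if markup_allowed:
        return embed_with_markup(text, direction)
    return embed_with_controls(text, direction)
```

If control codes were used everywhere, only embed_with_controls would be needed and the branch on context would go away.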

I understand that the authors were following the W3C/Unicode recommendation and I respect both that and the need for due process to change things.

If we are going to include the reference in a best practices document, perhaps the issues with following the recommendation should also be cited.

A separate effort to have the recommendation reconsidered should be initiated.
As for the recommendation being recently updated, that is true, but it was revised to mention new Unicode characters, not as a review of the entire content.

We should set the bar for a best practices document to be something that implementors can trust. While our recommended support for RTL is workable, it isn't optimal, and I think best practice is in fact to deviate from the recommendation.


-----Original Message-----
From: Richard Ishida [mailto:ishida@w3.org] 
Sent: Thursday, February 14, 2008 5:08 AM
To: Tex Texin
Cc: 'WWW International'
Subject: RE: New Working Group Note: Best Practices for XML Internationalization

Hi Tex,

I have reflected on a number of occasions whether it would make sense to use control codes rather than markup, and I'm not convinced that it would.

There are, of course, places where you have no choice (i.e., attribute values and elements that support only plain text - though the best practices document also tries to convince schema authors to avoid defining those). But have you ever tried creating a page based solely on control codes for an RTL language? You can't rely on putting dir="rtl" in the <html> tag and then just tweaking where needed - you need to put (invisible) control codes in every block element. Managing the scope across the document becomes very difficult. You also have problems getting table columns and numbered lists to switch around based on the directional context. You also have to set right alignment for the blocks with a default context of right-to-left - which is actually likely to involve you having to mark them up much as you would with bidi markup anyway. Basically, I don't think you are any better off.

The problem you seem to have encountered relates, it seems to me, to the editor support for bidi text, rather than the browser support.

I think it would be better to complain to the editor implementers that they should start to recognize bidi tags and control code escapes and be more intelligent about interpreting them in order to solve that problem, rather than make people use control codes. (Note btw that if you try to make the control codes visible using NCRs or entities you run into the same issues with editors.) I think markup is still problematic because the technology has *not* caught up with users' needs - not the browser technology, but the editing technology.
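For reference, making the control codes visible as NCRs could look something like the following Python sketch. The function name and the set of characters are assumptions made for this example.

```python
# Illustrative set of bidi control characters to escape (an assumption for
# this sketch, not an exhaustive list).
BIDI_CONTROLS = {
    "\u200e", "\u200f",            # LRM, RLM
    "\u202a", "\u202b", "\u202c",  # LRE, RLE, PDF
    "\u202d", "\u202e",            # LRO, RLO
}


def bidi_controls_to_ncr(text: str) -> str:
    """Replace each bidi control with a hexadecimal numeric character reference."""
    return "".join(
        f"&#x{ord(ch):04X};" if ch in BIDI_CONTROLS else ch
        for ch in text
    )


# RLE + "abc" + PDF becomes visible escapes around the text.
print(bidi_controls_to_ncr("\u202babc\u202c"))
```

The escapes are visible in the source, but an editor that does not interpret them still shows the surrounding text in the wrong order, which is the same issue as with markup.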


[1] http://www.w3.org/International/questions/qa-bidi-controls

Richard Ishida
Internationalization Lead
W3C (World Wide Web Consortium)

> -----Original Message-----
> From: www-international-request@w3.org [mailto:www-international-request@w3.org] On Behalf Of Tex Texin
> Sent: 14 February 2008 07:32
> To: Felix Sasaki; WWW International
> Subject: RE: New Working Group Note: Best Practices for XML 
> Internationalization
> Thanks Felix.
> I appreciate the hard work that goes into a document like this and the 
> difficulty of the task.
> At the same time, I am skeptical about the term "best practices".
> For starters, we are long overdue to get rid of the idea we need 
> markup for bidi control when Unicode controls can do the same thing, 
> and work in contexts where bidi markup cannot go.
> When editors did not support bidi, it made sense to make it visible 
> and the results would be seen in a browser.
> As editors today support bidi, it makes little sense to not be 
> WYSIWYG, and see the results as you edit, as opposed to inserting bidi 
> markup and having to switch from edit to browser and back to 
> understand how the text is impacted.
> The joint Unicode-W3C standard should be revised to reflect a more 
> sensible practice.
> We should make sure best practices documents are updated to reflect 
> modern tools and capabilities and not just repeating folklore.
> How do we achieve giving these recommendations greater scrutiny and 
> considering alternatives and validating the practices?
> I know the team is not staffed sufficiently to do this alone. We need 
> to find a way to address this though.
> tex
> -----Original Message-----
> From: www-international-request@w3.org [mailto:www-international-request@w3.org] On Behalf Of Felix Sasaki
> Sent: Wednesday, February 13, 2008 10:51 PM
> To: 'WWW International'
> Subject: New Working Group Note: Best Practices for XML 
> Internationalization
> The Internationalization Tag Set Working Group has published the Group 
> Note "Best Practices for XML Internationalization".
> http://www.w3.org/TR/2008/NOTE-xml-i18n-bp-20080213/
> This document provides a set of guidelines for developing XML 
> documents and schemas that are internationalized properly. Following 
> the best practices described here allows both the developer of XML 
> applications, as well as the author of XML content to create material 
> in different languages.
> Many thanks to the people from this list who contributed to the 
> document with their comments.
> Felix
Received on Friday, 15 February 2008 19:53:06 UTC
