- From: Yves Savourel <ysavourel@translate.com>
- Date: Fri, 08 Mar 2002 11:32:22 +0900
- To: www-i18n-workshop@w3.org
Hello everyone,

To follow up on Martin's call for more input, here are a few thoughts on Localizability. I also hope the six people who indicated at the Workshop that they would be interested in being involved with this effort will take the time to post their ideas, comments, and any other useful input.

1. Why Localizability is Relevant to the W3C i18n Group

* Localizability is one of the aspects of internationalization. This work is done at development time, before the material is localized. It is intimately related to localization practices and tools, but it needs to be done upstream. In short: ensuring that your Web material can be localized efficiently is an internationalization issue, not a localization issue.
* Like guidelines, like language/locale identification, like education and outreach, localizability is a horizontal aspect of the Web domain: the issues are the same across all Web applications and can usually be addressed with very similar solutions.
* Indicating localization properties falls in the same category as identifying languages or encodings: it pertains to the source material, not just the localized one.
* In general, localization information is entered by one category of users (authors and developers) and used by another (localizers and translators). There is a need for a connection between these two user domains. Most of the time both look to the W3C recommendations as the best source of implementation requirements for their tools and processes. For example, an author will use xml:lang for language identification in XML, a localizer will understand it as well, and both will use this piece of W3C recommendation to communicate seamlessly. It seems reasonable to think the same could happen for localization properties.
* The W3C is driving the specifications of Web material such as HTML and XML.
  It seems reasonable to expect these specifications to include a standard mechanism for carrying localization information, the same way XLink, XInclude, XBase, XForms, lang or xml:lang provide common mechanisms for other types of information.
* Internationalization of data that are not "normal" content (scripts, grammars, queries, etc.) is often weak. Having a fallback mechanism that uses localization directives in such types of data would help a lot in the localization process.

2. Key Players

There are several categories of players that should be involved in localizability efforts:

* Developers of authoring tools for Web materials (e.g. the developers of DreamWeaver, FrontPage, XMetal, XMLSpy, etc.).
* Authors and developers of Web content.
* Developers of localization tools (e.g. the developers of DejaVu, SDLX, Transit, TagEditor, Wordfast, etc.).
* Localizers, translators, and terminologists.

There are also a few organizations that could contribute:

* OASIS - the Organization for the Advancement of Structured Information Standards.
  OASIS is the home of XLIFF (XML Localisation Interchange File Format), the emerging standard for carrying localizable data in many cases during the localization process. The information an XLIFF document contains should match very closely the localization information provided in the original material. Whatever localization information is defined in the original material should be able to be carried within an XLIFF document as well.
* LISA - the Localization Industry Standards Association.
  LISA develops standards related to localization, especially exchange formats such as TMX and TBX. There are many people in LISA groups such as OSCAR with a great deal of experience who could help in working on localizability issues.

3. Work Items

A first step would be to define what localization information could be needed with Web material. For example: translatable vs. non-translatable, term for glossary, note for translators, etc.
Some preliminary work, also related to guidelines, has been done in the informal ITS group and can be seen at:
http://groups.yahoo.com/group/lisa-its/files/ITS-Requirements/ITS-Requirements.html

A second step would be to define how these localization properties can be specified:

* At the format level (i.e. as a property of the given type of document). For example: in a special rule file, or directly in schemas, etc.
* At the document level (i.e. as localization directives embedded in each document when needed). For example: an element <para> may be specified as translatable at the format level, but in some occurrences a given sentence in a given <para> element might need to remain in the source language. This would be done, presumably, through a standard namespace usable in any XML document, similar, for instance, to XLink.

A third step would be to define a mechanism for marking up non-XML material in an identical way. This could help, for example, in the localization of scripts, CSS style sheets, etc. For example: a simple prefix convention in comments, with the same properties and syntax as the XML namespace, could bring localization directives to any format supporting comments.

4. Some Examples

One example is better than a hundred words, so here are some illustrations of how the items described above could be addressed. These are obviously just examples of *possible* ways, among others, to provide localization information.

[These examples are based on ideas raised during discussions in the ITS group, coming, among others, from Richard Ishida and Shigemichi Yazawa.]

4.1. Definition of Localization Properties

One way to define localization properties for a given vocabulary is to use a specialized rule file (itself an XML application, hopefully) that describes the default properties for both attributes and elements, and then specifies the exceptions, using XPath to identify the nodes to which they apply.

The following is an example of source material. Only the caption text ("Cancel") is translatable:

    <?xml version="1.0" ?>
    <dialogue xml:lang="en-gb">
     <rsrc id="123">
      <component id="456" type="image">
       <data type="text">images/cancel.gif</data>
       <data type="coordinates">12,20,50,14</data>
      </component>
      <component id="789" type="caption">
       <data type="text">Cancel</data>
       <data type="coordinates">12,34,50,14</data>
      </component>
     </rsrc>
    </dialogue>

The following is an example of a rule file that specifies localization properties for files like the example above:

    <?xml version="1.0" ?>
    <locprop version="0.1">
     <rules name="Example1" root="dialogue">
      <element-defaults localize="no"/>
      <attribute-defaults localize="no"/>
      <rule item="//component[@type='caption']/data[@type='text']" localize="yes"/>
     </rules>
    </locprop>

4.2. Localization Directives in XML

The following is an example of how localization directives (the loc: attributes and elements) could be used in an XHTML document:

    <?xml version="1.0" encoding="UTF-8"?>
    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "DTD/xhtml1-strict.dtd">
    <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en"
          xmlns:loc="urn:the-localization-directives-standard">
     <head>
      <title loc:id="100">Title</title>
     </head>
     <body>
      <h1 id="101">Introduction to <loc:span term="yes">Document Management</loc:span></h1>
      <p id="102">Our company, <loc:span localize="no">Infinite Wisdom Inc.</loc:span>, provides quality courses on how to manage your documentation.</p>
     </body>
    </html>

Obviously, marking up a document should be done only for exceptions.
For example, if a specific element of a vocabulary is always a term, is never to be localized, or always has a specific length restriction, such information should be defined at the vocabulary level rather than in each document.

4.3. Localization Directives Using Comments

The following is an example of a localization directive carried in a comment of a CSS style sheet:

    @charset "iso-8859-1";
    *:lang(en) { font-family: Arial; }
    /*loc:note Text automatically placed in front of a note element. */
    note:before { content: "Note:"; color: red; }
    note { font-weight: bold; }

--end--

Cheers,
-yves
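
P.S. To make 4.1 a bit more concrete, here is a rough sketch, in Python, of how a tool might apply the rule file to the sample document. Everything about it is an assumption for illustration only: the locprop format is just the hypothetical one shown in 4.1, the function name translatable_elements is made up, later rules are assumed to override earlier ones, attribute defaults are ignored, and only the XPath subset understood by Python's xml.etree.ElementTree is handled.

```python
import xml.etree.ElementTree as ET

# Sample document and rule file, copied from section 4.1.
SOURCE = """<?xml version="1.0" ?>
<dialogue xml:lang="en-gb">
 <rsrc id="123">
  <component id="456" type="image">
   <data type="text">images/cancel.gif</data>
   <data type="coordinates">12,20,50,14</data>
  </component>
  <component id="789" type="caption">
   <data type="text">Cancel</data>
   <data type="coordinates">12,34,50,14</data>
  </component>
 </rsrc>
</dialogue>"""

RULES = """<?xml version="1.0" ?>
<locprop version="0.1">
 <rules name="Example1" root="dialogue">
  <element-defaults localize="no"/>
  <attribute-defaults localize="no"/>
  <rule item="//component[@type='caption']/data[@type='text']" localize="yes"/>
 </rules>
</locprop>"""


def translatable_elements(source_root, rules_root):
    """Return the elements of source_root that end up with localize="yes"."""
    rules = rules_root.find("rules")
    if rules.get("root") != source_root.tag:
        raise ValueError("rule set does not apply to this vocabulary")
    # Start every element at the vocabulary-wide default.
    default = rules.find("element-defaults").get("localize") == "yes"
    flags = {el: default for el in source_root.iter()}
    # Apply the exceptions in rule-file order (assumption: last match wins).
    for rule in rules.findall("rule"):
        path = rule.get("item")
        # ElementTree only accepts relative paths, so "//x" is run as ".//x".
        if path.startswith("//"):
            path = "." + path
        for el in source_root.findall(path):
            flags[el] = rule.get("localize") == "yes"
    return [el for el in source_root.iter() if flags[el]]


if __name__ == "__main__":
    hits = translatable_elements(ET.fromstring(SOURCE), ET.fromstring(RULES))
    for el in hits:
        print(el.text)
```

Run on the two samples, this prints only "Cancel": the image path and the coordinates keep the default localize="no". A real tool would need a full XPath engine to handle rules that target attributes.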
Received on Thursday, 7 March 2002 23:37:16 UTC