Localizability - Notes

Hello everyone,

To follow up on Martin's call for more input, here are a few thoughts on 
Localizability.
I also hope the six people who indicated at the Workshop they would have an 
interest in being involved with this effort will take the time to post 
their ideas, comments, and any other useful input.



1. Why Localizability is Relevant to the W3C i18n Group

    * Localizability is one of the aspects of internationalization. This 
work is done at development time, before the material is localized. It is 
intimately related to localization practices and tools, but it needs to be 
done upstream. In short: ensuring that your Web material can be localized 
efficiently is an internationalization issue, not a localization issue.
    * Like guidelines, like language/locale identification, like education 
and outreach, localizability is a horizontal aspect of the Web domain: the 
issues are the same across all Web applications and can usually be 
addressed with very similar solutions.
    * Indicating localization properties falls in the same category as 
identifying languages or encoding: it pertains to the source material, not 
just the localized one.
    * In general, localization information is entered by one category of 
users (authors and developers) and used by another (localizers and 
translators). There is a need for a connection between these two user 
domains. Most of the time both look to the W3C recommendations as the best 
source of implementation requirements for their tools and processes. For 
example, an author will use xml:lang for language identification in XML, a 
localizer will understand it as well, and both will use this piece of a W3C 
recommendation to communicate seamlessly. It seems reasonable to think the 
same could happen for localization properties.
    * The W3C is driving the specifications of Web material such as HTML 
and XML. It seems reasonable to think these specifications should include a 
standard mechanism to carry localization information, the same way XLink, 
XInclude, XBase, XForms, lang or xml:lang provide common mechanisms for 
other types of information.
    * Internationalization of data that are not "normal" content (scripts, 
grammars, queries, etc.) is often weak. Having a fallback mechanism that 
uses localization directives in such types of data would help a lot in the 
localization process.



2. Key Players

There are several categories of players that should be involved in 
localizability efforts:
    * Developers of authoring tools for Web materials (e.g. the developers 
of DreamWeaver, FrontPage, XMetal, XMLSpy, etc.)
    * Authors and developers of Web content.
    * Developers of localization tools (e.g. the developers of DejaVu, 
SDLX, Transit, TagEditor, Wordfast, etc.)
    * Localizers, translators, and terminologists.
There are also a few organizations that could contribute:
    * OASIS - the Organization for the Advancement of Structured 
Information Standards. OASIS is the home of XLIFF (XML Localisation 
Interchange File Format), the emerging standard for carrying localizable 
data in many cases during the localization process. The information an 
XLIFF document contains should match very closely the localization 
information provided in the original material: whatever localization 
information is defined in the original material should be able to be 
carried within an XLIFF document as well.
    * LISA - the Localization Industry Standards Association. LISA 
develops standards related to localization, especially exchange formats 
such as TMX and TBX. There are many people in LISA groups such as OSCAR 
with a great deal of experience who could help in working on localizability 
issues.



3. Work Items

A first step would be to define what localization information could be 
needed with Web material. For example: translatable vs. non-translatable, 
term for glossary, note for translators, etc. Some preliminary work, also 
related to guidelines, has been done in the informal ITS group and can be 
seen at: 
<http://groups.yahoo.com/group/lisa-its/files/ITS-Requirements/ITS-Requirements.html>.

A second step would be to define how these localization properties can be 
specified:
    * At the format level (i.e. as a property of the given type of 
document). For example: in a special rule file, or directly in schemas, etc.
    * At the document level (i.e. as localization directives embedded in 
each document when needed). For example: an element <para> may be specified 
as translatable at the format level, but in some occurrences, a given 
sentence in a given <para> element might need to remain in the source 
language. This would be done, presumably, through a standard namespace 
usable in any XML document, similar, for instance, to XLink.

A third step would be to define a mechanism that provides an identical way 
of marking up non-XML material. This could help, for example, in the 
localization of scripts, CSS style sheets, etc. For example: a simple prefix 
convention in comments, with the same properties and syntax as the XML 
namespace, could bring localization directives to any format supporting 
comments.
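
As a rough illustration of the comment-prefix idea, here is a minimal Python 
sketch of how a tool might pull directives out of any format that supports 
/* ... */ comments. The "loc:" prefix and the "note" property are borrowed 
from the CSS example in section 4.3; the function name and the regular 
expression are assumptions made for this sketch, not part of any agreed 
syntax.

```python
import re

# Sample input: a CSS style sheet carrying one "loc:note" directive.
CSS = '''@charset "iso-8859-1";
*:lang(en)    { font-family: Arial; }
/*loc:note Text automatically placed in front of a note element. */
note:before   { content: "Note:"; color: red; }
note          { font-weight: bold; }
'''

# Matches /* ... */ comments whose body starts with the assumed "loc:" prefix.
# Group 1 is the property name, group 2 is the rest of the comment.
DIRECTIVE = re.compile(r"/\*\s*loc:(\w+)\s+(.*?)\s*\*/", re.DOTALL)


def find_directives(text):
    """Return (property, value) pairs for every loc: comment in text."""
    return [(m.group(1), m.group(2)) for m in DIRECTIVE.finditer(text)]


directives = find_directives(CSS)
print(directives)
```

Only the comment delimiters are format-specific here, so the same approach 
could presumably be reused for scripts or other formats by swapping the 
comment syntax in the pattern.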


4. Some Examples

One example is better than a hundred words, so here are some illustrations 
of how the items described above could be addressed. These are obviously 
just examples of *possible* ways, among others, to provide localization 
information. [These examples are based on ideas raised during discussions 
in the ITS group, coming from, among others, Richard Ishida and 
Shigemichi Yazawa.]


4.1. Definition of Localization Properties

One way to define localization properties for a given vocabulary is to use 
a specialized rule file (itself an XML application, hopefully) that 
describes the default properties for both attributes and elements, and then 
specifies the exceptions, using XPath to identify the type of nodes where 
they should be applied.

The following is an example of source material. Only the caption text 
("Cancel") is translatable:

<?xml version="1.0" ?>
<dialogue xml:lang="en-gb">
  <rsrc id="123">
   <component id="456" type="image">
    <data type="text">images/cancel.gif</data>
    <data type="coordinates">12,20,50,14</data>
   </component>
   <component id="789" type="caption">
    <data type="text">Cancel</data>
    <data type="coordinates">12,34,50,14</data>
   </component>
  </rsrc>
</dialogue>

The following is an example of a rule file that specifies localization 
properties for files like the example above:

<?xml version="1.0" ?>
<locprop version="0.1">
  <rules name="Example1" root="dialogue">
   <element-defaults localize="no"/>
   <attribute-defaults localize="no"/>
   <rule item="//component[@type='caption']/data[@type='text']" 
localize="yes"/>
  </rules>
</locprop>
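
To make the intent of the rule file concrete, here is a minimal Python 
sketch of how a tool might apply it to the source document above. It is only 
an illustration under several assumptions: the function name is made up, 
the "root" attribute and <attribute-defaults> are ignored, and the standard 
library's ElementTree supports only a subset of XPath (relative paths, 
element nodes), so a real implementation would need a full XPath engine.

```python
import xml.etree.ElementTree as ET

SOURCE = """<dialogue xml:lang="en-gb">
  <rsrc id="123">
   <component id="456" type="image">
    <data type="text">images/cancel.gif</data>
    <data type="coordinates">12,20,50,14</data>
   </component>
   <component id="789" type="caption">
    <data type="text">Cancel</data>
    <data type="coordinates">12,34,50,14</data>
   </component>
  </rsrc>
</dialogue>"""

RULES = """<locprop version="0.1">
  <rules name="Example1" root="dialogue">
   <element-defaults localize="no"/>
   <attribute-defaults localize="no"/>
   <rule item="//component[@type='caption']/data[@type='text']" localize="yes"/>
  </rules>
</locprop>"""


def translatable_text(source_xml, rules_xml):
    """Return the text of every element the rules mark as localizable."""
    doc = ET.fromstring(source_xml)
    rules = ET.fromstring(rules_xml).find("rules")
    # The element default applies everywhere unless a rule overrides it.
    default = rules.find("element-defaults").get("localize") == "yes"
    flags = {el: default for el in doc.iter()}
    for rule in rules.findall("rule"):
        # ElementTree only accepts relative paths, so "//x" becomes ".//x".
        for el in doc.findall("." + rule.get("item")):
            flags[el] = rule.get("localize") == "yes"
    return [el.text for el in doc.iter()
            if flags[el] and el.text and el.text.strip()]


print(translatable_text(SOURCE, RULES))  # ['Cancel']
```

The image path and the coordinates fall under the "no" default, so only the 
caption text is extracted.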


4.2. Localization Directives in XML

The following is an example of how localization directives (the attributes 
and elements in the loc namespace) could be used in an XHTML document:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" 
"DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml"
       xml:lang="en" lang="en"
       xmlns:loc="urn:the-localization-directives-standard">
  <head>
   <title loc:id="100">Title</title>
  </head>
  <body>
   <h1 id="101">Introduction to <loc:span term="yes">Document 
Management</loc:span></h1>
   <p id="102">Our company, <loc:span localize="no">Infinite Wisdom 
Inc.</loc:span>,
provides quality courses on how to manage your documentation.</p>
  </body>
</html>

Obviously, marking up a document should be done only for exceptions. For 
example, if a specific element of a vocabulary is always a term, is never 
to be localized, or always has a specific length restriction, such 
information should be defined at the vocabulary level rather than in each 
document.
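
As an illustration of how a localization tool might consume such inline 
directives, here is a minimal Python sketch that collects the terms and the 
do-not-localize spans from a simplified copy of the XHTML example (XML 
declaration and DOCTYPE omitted). The namespace URN and the "term" and 
"localize" attributes come from the example above; the function name is an 
assumption made for this sketch.

```python
import xml.etree.ElementTree as ET

LOC_NS = "urn:the-localization-directives-standard"

DOC = """<html xmlns="http://www.w3.org/1999/xhtml"
      xml:lang="en" lang="en"
      xmlns:loc="urn:the-localization-directives-standard">
 <head>
  <title loc:id="100">Title</title>
 </head>
 <body>
  <h1 id="101">Introduction to <loc:span term="yes">Document Management</loc:span></h1>
  <p id="102">Our company, <loc:span localize="no">Infinite Wisdom Inc.</loc:span>,
provides quality courses on how to manage your documentation.</p>
 </body>
</html>"""


def collect_directives(xml_text):
    """Return (terms, protected) lists taken from loc:span elements."""
    root = ET.fromstring(xml_text)
    terms, protected = [], []
    # ElementTree spells qualified names as "{namespace-uri}local-name".
    for span in root.iter("{%s}span" % LOC_NS):
        text = "".join(span.itertext())
        if span.get("term") == "yes":
            terms.append(text)        # candidate glossary entry
        if span.get("localize") == "no":
            protected.append(text)    # must stay in the source language
    return terms, protected


terms, protected = collect_directives(DOC)
print(terms)      # ['Document Management']
print(protected)  # ['Infinite Wisdom Inc.']
```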

4.3. Localization Directives Using Comments

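The following is an example of how localization directives could be carried 
in the comments of a CSS style sheet: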

@charset "iso-8859-1";
*:lang(en)    { font-family: Arial; }
/*loc:note Text automatically placed in front of a note element. */
note:before   { content: "Note:"; color: red; }
note          { font-weight: bold; }
  --end--

Cheers,
-yves

Received on Thursday, 7 March 2002 23:37:16 UTC