RE: Proposed text for ITS spec introductory material from Lieske, Christian on 2006-03-24 (public-i18n-its@w3.org from January to March 2006)

From: Lieske, Christian <christian.lieske@sap.com>
Date: Fri, 24 Mar 2006 17:23:14 +0100
To: "Richard Ishida" <ishida@w3.org>, <public-i18n-its@w3.org>
Message-ID: <0F568FE519230641B5F84502E0979DD104C646CB@dewdfe12.wdf.sap.corp>
Hi Richard,

We discussed this during today's editors' call. There was consensus that
the spec would benefit from the additional explanations you are
suggesting. We have come up with the following ideas for how to
accommodate your suggestions:

A. We will integrate the additional paragraph you are suggesting for
section 1.

B. We will take your proposed sections 2.1 and 2.2 and place them at the
beginning of our "Introduction".
   This would give us a new outline for the introduction along the
following lines:

	1 Introduction
		1.1 Users and Usages of ITS
		1.2 Motivation for ITS
		1.3 Out of Scope
		1.4 Important Design Principles
		1.5 Development of this Specification

C. We will take your proposed sections 2.3 - 2.7 and add them to our
"Basic Concepts".

Of course, it will not be possible to take over your material "as is",
since, for example, recent discussions in the group have made some of it
outdated (like the bit about "schemaRule").

It would be great if you could let us know if this proposal looks
alright to you. I could then start working
on the actual integration.

Best regards,
Christian
-----Original Message-----
From: public-i18n-its-request@w3.org
[mailto:public-i18n-its-request@w3.org] On Behalf Of Richard Ishida
Sent: Friday, 24 February 2006 19:37
To: public-i18n-its@w3.org
Subject: Proposed text for ITS spec introductory material


Chaps,

I began reading the latest version of the ITS spec in preparation for
next week.  I have scribbled a number of editorial comments on my paper
copy, but I felt like a couple of the introductory sections needed more
than that.  I also figured that it would be almost as easy for me to
write alternative text as to clarify what I'm thinking so that someone
else could evaluate and/or implement it.  So I wrote what follows.  It
is only a first pass, so there may be things that can be improved.


SECTION 1.1


Section 1.1 still talks only about localization and still ignores
international use of schemas.  I propose the following replacement text
for the first two paragraphs:

[[
Content or software that is authored in one language (i.e. source
language) is often made available in additional languages. This is done
through a process called localization, where the original material is
translated and adapted to the target audience.

In addition, document formats expressed by schemas may be used by people
in different parts of the world, and these people may need special
markup to support the local language or script.  For example, people
authoring in languages such as Arabic, Hebrew, Persian or Urdu need
special markup to demarcate directionality in mixed direction text.

From the viewpoints of feasibility, cost, and efficiency, it is
important that the original material should be suitable for localization
and international use. This is achieved by appropriate design and
development, and the corresponding process is referred to as
internationalization. For a detailed explanation of the terms
"localization" and "internationalization", see [l10n i18n].
]]




SECTION 2


Section 2, Basic Concepts, still strikes me as a description of ITS from
the engineer's point of view, rather than describing how it
intersects with the potential user's interests.  (It's like having
documentation for Powerpoint that just went through the pull-down menus
in order, rather than having sections such as 'How to create a new
presentation', 'How to work with the master', etc.)  After reading this
section I find I still have to work hard at reassembling the information
in my brain in terms of what I knew when I started and where I am now,
and in terms of where we're going with this.

Here is a proposal for an alternative approach.


[[

2 Basic Concepts

2.1 Potential users of ITS

The ITS specification aims to provide schema developers with information
about what markup should be supported to enable worldwide use of their
schemas and effective localization of the content developed using that
schema. This information is provided in an abstract way in the data
category descriptions, but specific proposals for implementation are
also made in the specification.

One group of people who will use this information will be developing new
schemas from the ground up.  In the specification they will find
proposals for attribute and element names to be included in their new
schema.  Using the same names as proposed here may be helpful because it
makes the concepts they represent easier to recognize for both authors
and localization tool developers.  It is perfectly possible, however,
for the schema developer to develop their own set of attribute and
element names.  The specification sets out, first and foremost, to ensure the
required markup is available, and that the behaviour of that markup
meets established needs.

Another group of users of this specification will be working with
existing schemas, such as DocBook, DITA, or perhaps an in-house schema.

The ITS Working Group has sought input from people developing widely
used formats such as DocBook and DITA, and the specification provides
examples of how we feel those specific formats could be adapted to
support ITS.

Developers working on existing schemas should check whether their
schemas support the markup proposed in this specification, and, where
appropriate, add the markup proposed here to their schema. 

In some cases, the schema may already contain markup equivalent to that
recommended in ITS.  In this case it is not necessary to add duplicate
markup. The developer should, however, check that the behaviour
associated with the markup in their own schema is fully compatible with
the expectations described in this specification.

Other users of the ITS specification will be translation tool
developers.  When content is sent for translation, it is important to
ensure that such tools recognize what to do with the various bits of
content described by the markup. For example, translation tools should
prevent content marked up as not for translation from being changed.  It
is hoped that the ITS specification will make the job of these
developers easier by standardising the expected behaviour of certain
relevant markup items, and allowing them to more effectively identify
how content should be handled.

The markup proposed in this specification may also be used by content
authors to mark up specific bits of content.  However, we will describe
below how the burden of inserting markup can sometimes be removed from
content authors and the data categories can be related to relevant bits
of content in a more global manner.  This work may fall to information
architects, rather than the content authors themselves.



2.2 Ways to implement ITS

The ITS specification provides a set of element and attribute names that
can be included in a schema, but it also goes beyond that to specify a
mechanism for describing various aspects of a schema in terms of
translatability and internationalization.

We will explore the possible approaches below. For the purpose of
illustration, we will use examples of ways to indicate that certain
parts of content should or should not be translated.  There are three
ways of indicating this information:

1. a content author uses an attribute on a particular element in the
content to say that the text should not be translated

2. a document developer uses markup at the top of the document to
identify a particular type of element or context in which the content
should not be translated

3. a schema developer uses constructs in the schema itself to indicate
that specific parts of the content should not be translated.

The first two approaches above can be likened to the use of CSS in
XHTML.  Using a style attribute, an XHTML content author may assign a
colour to a particular paragraph. That author could also have used the
style element at the top of the page to say that all paragraphs of a
particular class or in a particular context would be coloured red.



2.3 Using local markup

Example 4 shows how a content author may use an ITS attribute to
indicate what text should be translated and what text should be
protected from translation.  Translation tools that are aware of the
meaning of this attribute can then screen the relevant content from the
translation process.

Example 4 goes here [Note the relevant parts of these examples should be
bolded for easy identification using a tag such as strong for WAI
accessibility]

For this to work, the schema developer will need to add the
its:translate attribute to the schema as a common attribute or on all
the relevant element definitions.  

Note how there is an expectation in this case that inheritance play a
part in identifying which content is to be translated and which not.
Tools that process this content for translation will need to manage the
scoping.
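
As a rough illustration of the scoping a tool would have to manage (a
sketch only, not part of the proposed spec text), the following Python
walks a small document and lets each element inherit the translate value
of its parent unless it carries its own its:translate attribute.  The
namespace URI, the sample document, the default of "yes" and the helper
name translatable_flags are all assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URI for the its: prefix; the message does not state one.
ITS_NS = "http://www.w3.org/2005/11/its"
TRANSLATE = f"{{{ITS_NS}}}translate"

# Hypothetical sample content with local its:translate markup.
SAMPLE = f"""
<doc xmlns:its="{ITS_NS}">
  <p>Press the <code its:translate="no">Enter</code> key.</p>
  <section its:translate="no">
    <p>Internal note <em its:translate="yes">with one translatable phrase</em>.</p>
  </section>
</doc>
"""


def translatable_flags(root, default="yes"):
    """Return (tag, effective translate value) for every element, in document order."""
    result = []

    def walk(elem, inherited):
        # A local attribute wins; otherwise the parent's value is inherited.
        flag = elem.get(TRANSLATE, inherited)
        result.append((elem.tag, flag))
        for child in elem:
            walk(child, flag)

    walk(root, default)
    return result


flags = translatable_flags(ET.fromstring(SAMPLE))
for tag, flag in flags:
    print(tag, flag)
```

The p inside the section picks up "no" from its parent without any markup
of its own, while the em inside it switches translation back on.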



2.4 Using documentRule directives

Example 5 shows a different approach to identifying non-translatable
content, similar to that used with a style element in XHTML, but using
an ITS-defined element called its:documentRules.

Example 5 goes here

The head of a document can contain an its:documentRules element, which
contains one or more documentRule elements. In addition to one or more
ITS data category attributes, the documentRule element contains a
corresponding set of ITS selector attributes (in the example
translateSelector). As their name suggests, they select (or designate)
the XML node or nodes to which a corresponding ITS data category
attribute pertains. The values of ITS selector attributes are XPath
absolute location paths. Information for the handling of namespaces in
these path expressions is contained in the ITS element ns which is a
child of documentRules.

This approach has the following benefits:

-      Content authors do not have to concern themselves with creating
additional markup or verifying that the markup was applied correctly.
ITS data categories are associated with sets of XML nodes (for example
all p elements in an XML instance)

-	Changing the rules can be done in a single location, rather than
by searching and modifying the markup throughout a document (or
documents, if the documentRules element is stored as an external entity)

-	ITS data categories can designate attribute values as well as
elements.
    
-	It is possible to map ITS markup to existing markup (for example
the term element in DITA)  [Ed. not a clear example unless we show the
ITS equivalent - and I think there is none, so what about citing the
DITA translate attribute?]

For this to work, the schema developer needs to add the documentRules
and associated markup to the schema.  In some cases this may allow the
schema developer to avoid adding other ITS markup (such as an
its:translate attribute) to the elements in the schema. However, it is
likely that authors will want to use attributes on markup from time to
time to override the general rule.

For specification of the translate flag, the contents of the
documentRules element would normally be designed by an information
architect familiar with the document format and familiar with, or
working with someone familiar with, the needs of the localization group.
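
To make the selector mechanism described in this section a little more
concrete, here is a minimal Python sketch of how a tool might read the
rules and find the nodes they designate.  It is an illustration under
assumptions, not the defined processing model: the namespace URI, the
sample document and the function name apply_document_rules are invented
for the example, the documentRule element is assumed to live in the ITS
namespace, and the ns element for namespace handling is left out.  The
standard-library ElementTree supports only a subset of XPath and rejects
absolute paths on elements, so a selector such as //code is evaluated
here as .//code relative to the root.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URI for the its: prefix; the message does not state one.
ITS_NS = "http://www.w3.org/2005/11/its"

# Hypothetical document whose head carries two rules.
SAMPLE = f"""
<doc xmlns:its="{ITS_NS}">
  <head>
    <its:documentRules>
      <its:documentRule translate="no" translateSelector="//code"/>
      <its:documentRule translate="no" translateSelector="//head"/>
    </its:documentRules>
  </head>
  <body>
    <p>Run <code>make install</code> to finish.</p>
    <p>Then restart.</p>
  </body>
</doc>
"""


def apply_document_rules(root):
    """Map each element selected by a documentRule to that rule's translate value."""
    selected = {}
    for rule in root.iter(f"{{{ITS_NS}}}documentRule"):
        selector = rule.get("translateSelector")
        value = rule.get("translate")
        # Simplification: prefix "." so that "//x" becomes the relative ".//x".
        for node in root.findall("." + selector):
            selected[node] = value
    return selected


root = ET.fromstring(SAMPLE)
rules = apply_document_rules(root)
untranslatable = [node.tag for node in root.iter() if rules.get(node) == "no"]
print(untranslatable)
```

Note that the content itself carries no ITS attributes; changing a
selector in the head changes which nodes are affected.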



2.5 Using schemaRule

Example 6 shows an alternative approach to designating the XML node or
nodes to which a corresponding ITS data category attribute pertains.
This time the information is put directly into the schema itself, using
its:schemaRule.  Note that this is only possible for schemas developed
using W3C XML Schema ...

Example 6 goes here

This example defines all term elements to be non-translatable by
default.

[More text about how this works and who would do it]



2.6 Overwriting/precedence and inheritance

The power of ITS selector attributes comes at a price: rules related to
overwriting/precedence, and inheritance, have to be established.

Example 7 goes here

In this example, the ITS data category attribute translate appears
twice: in a documentRule, and on a specific p element. Since the ITS
selector attribute in the documentRule selects all p elements, the
question arises as to what the value of the translate data category is
for the p element that has local markup. ITS provides precedence and
inheritance rules which answer questions like this. In the example, the
value is "no" (that is, the content of the p element should not be
translated).
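
One way such a resolution could work is sketched below in Python.  It
assumes an ordering in which local markup beats a documentRule, which in
turn beats an inherited value and finally a default of "yes"; this text
does not spell that ordering out, so treat it as an assumption.  The
namespace URI, the sample document and the function name resolve are
likewise invented for the illustration, and selectors are evaluated
relative to the root because ElementTree does not accept absolute paths.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URI for the its: prefix; the message does not state one.
ITS_NS = "http://www.w3.org/2005/11/its"
LOCAL = f"{{{ITS_NS}}}translate"

# Hypothetical document: one rule for all p elements, one local override.
SAMPLE = f"""
<doc xmlns:its="{ITS_NS}">
  <its:documentRules>
    <its:documentRule translate="yes" translateSelector="//p"/>
  </its:documentRules>
  <p>First paragraph.</p>
  <p its:translate="no">Second paragraph.</p>
</doc>
"""


def resolve(root, default="yes"):
    """Effective translate value per element (assumed order: local, rule, inherited, default)."""
    by_rule = {}
    for rule in root.iter(f"{{{ITS_NS}}}documentRule"):
        for node in root.findall("." + rule.get("translateSelector")):
            by_rule[node] = rule.get("translate")

    resolved = {}

    def walk(elem, inherited):
        # Try the local attribute first, then a matching rule, then the parent's value.
        value = elem.get(LOCAL) or by_rule.get(elem) or inherited
        resolved[elem] = value
        for child in elem:
            walk(child, value)

    walk(root, default)
    return resolved


root = ET.fromstring(SAMPLE)
resolved = resolve(root)
paragraphs = [resolved[p] for p in root.findall("p")]
print(paragraphs)
```

Both p elements are selected by the rule, but only the first keeps the
rule's value; the second ends up as "no" because of its local attribute.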



2.7 Using ITS elements

The above example of a translate flag applied an ITS data category to an
attribute node.  Many of the ITS data categories can be expressed using
attribute nodes, but not all.  Some are expressed using element nodes.
Here is an example of such an approach.

....
]]




Hope that helps, and that I have understood the mechanism correctly.

RI



============
Richard Ishida
Internationalization Lead
W3C (World Wide Web Consortium)

http://www.w3.org/People/Ishida/
http://www.w3.org/International/
http://people.w3.org/rishida/blog/
http://www.flickr.com/photos/ishida/
Received on Friday, 24 March 2006 16:25:03 UTC