RE: [all] Request for Last Call Review, Internationalization Tag Set (ITS) from Yves Savourel on 2006-05-22 (public-i18n-its@w3.org from April to June 2006)

From: Yves Savourel <yves@opentag.com>
Date: Mon, 22 May 2006 09:39:55 +0200
To: <jerry.carter@nuance.com>
Cc: <public-i18n-its@w3.org>
Message-ID: <004b01c67d72$ec514290$640fa8c0@Breizh>

Hi Jerry,

Thank you for your feedback. We've discussed your notes a bit in the past weeks, and I was tasked to summarize our answers.

If some of them still do not address your concerns, please let us know and we will enter the specific points in our issue list, so
that they are treated along with the other comments.


> 6.2 Translatability.
> 
> Translatability is presented as a binary decision and in section 1.1.2 
> the phrase 'Model-T' is given as an example of an item that would be invariant.
> However, a French translation would likely replace Model-T with 'Ford T'.
> Terms of art or product names do vary more than one might expect.  
> Consider the recent 'Buick LaCrosse' sedan whose name has required 
> translation for certain markets [2].
>
> Likewise, is '12' translatable?  One might wish to express the integer 
> in Chinese ideographs or leave it in Arabic numerals depending on 
> content or based on the understanding of the likely reader.
> 
> I do not see translatability as a straightforward decision.  
> Annotation data describing the term could very well be useful to the 
> translator, but eventually one wishes for word or phrase level 
> descriptions a la WordNet [3] to guide the translator, a capability 
> that does not appear to be supported by ITS.

I think you have a point for Example 1. We'll try to find a better example of 'not-translatable' text.

As for the note about seeing translatability as a binary decision: while we would agree in general with your comments, we think
we are trying to achieve something a bit different. In practice it is unlikely that one is going to (or can) set translatability at
a very fine level, especially when working from the source viewpoint. The decision to translate or not is really made for each
target language when the translator goes through the text.

To some degree this is related to a discussion the group had early on about naming the attribute. 'translate' was the choice we made.
And by this we mean: from the viewpoint of the XML content, this text is translatable text; make it accessible to the translators.
How this is going to be translated (all of it, or part of it, or none of it) is to be decided by the translator and the decision may
be different for each language.
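
As a rough illustration of that reading of 'translate', the Python sketch below walks a made-up document and yields the text a
generic tool would expose to translators. The sample markup, the function name translatable_text, and the handling of inheritance
are assumptions made for this example; only local its:translate attributes are considered (no global rules), and the namespace URI
should be checked against the specification.

```python
import xml.etree.ElementTree as ET

# Assumed ITS namespace URI; verify against the specification.
ITS_TRANSLATE = "{http://www.w3.org/2005/11/its}translate"

# Hypothetical sample document, not taken from the ITS draft.
DOC = """<doc xmlns:its="http://www.w3.org/2005/11/its">
  <title>Owner manual</title>
  <code its:translate="no">printf <b>args</b></code>
  <p>Press <kbd its:translate="no">F1</kbd> for help.</p>
</doc>"""


def translatable_text(elem, inherited=True):
    """Yield the text chunks exposed to translators.

    Assumption: element content is translatable by default and a local
    its:translate value is inherited by child elements.
    """
    flag = elem.get(ITS_TRANSLATE)
    translate = inherited if flag is None else (flag == "yes")
    if translate and elem.text and elem.text.strip():
        yield elem.text.strip()
    for child in elem:
        yield from translatable_text(child, translate)
        # Tail text belongs to the parent, so the parent's setting applies.
        if translate and child.tail and child.tail.strip():
            yield child.tail.strip()


for chunk in translatable_text(ET.fromstring(DOC)):
    print(chunk)
```

The sketch only decides what is made accessible; what the translator then does with each chunk is left open, as described above.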


> 6.3 Localization Information
> 
> Here general annotation information is provided.  We considered RDF 
> while working on PLS and I wonder if RDF might be more appropriate here.

We thought RDF was a bit too much for the purpose locInfo tries to achieve, which is quite simple: associate a simple note with a
part of the content. In addition, we needed to provide an association mechanism to allow the re-use of existing localization notes.

 
> 6.4 Terminology
> 
> This could be a subclass of 6.3.

We thought identifying terms and associating localization notes with text were different enough to require two distinct data
categories.


> 6.5 Directionality
> 
> The unicode character set already supports embedding of directionality 
> marks and overrides (e.g. 0x200E, 0x200F, 0x202D, 0x202E, 0x202C) when 
> specifications do not make provisions for explicit elements such as 
> the XHTML bidirectional text module [4].
> 
> Is this necessary?

We followed Unicode's own recommendation (http://www.w3.org/TR/unicode-xml/#Charlist), and the guidelines provided
by the GEO WG (http://www.w3.org/International/questions/qa-bidi-controls).


> 6.6 Ruby
> 
> Again, this is back to the general annotation issue.  Is Ruby best 
> applied for this purpose?

Here again, we tried to integrate existing recommendations (Ruby Module).


> 6.7 Language Information
> 
> Specifications that do not support language tagging might be broken.  
> Is this really the best way to fix them?  Document markup that does 
> support language tagging in a non-traditional manner is presumably 
> okay since readers are expected to understand the semantics of that 
> specification.  I don't see the 'langPointer="@mylangattribute"' case 
> as justifying this capability.

We don't think langRule tries to fix broken specifications. It just allows ITS-aware applications to know something they cannot
otherwise know, because they have no semantic knowledge of the given formats. We still recommend using xml:lang if possible. This
will be made clearer in the Best Practices document (still a First WD).

In short, we say: you should use xml:lang. But if you already have something that has the same semantics and value set, and can't
change it to xml:lang, then you can use langRule to indicate that to ITS-enabled applications.
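
A minimal sketch of how a generic tool might use such a rule is given below in Python. It is an illustration under several
assumptions, not a reference implementation: the sample document, the attribute name mylangattribute (borrowed from the comment
above), and the function resolve_languages are made up; only selectors of the form '//name' and pointers of the form '@attr' are
handled instead of full XPath; and checking xml:lang first is a choice of this sketch.

```python
import xml.etree.ElementTree as ET

ITS_NS = "http://www.w3.org/2005/11/its"  # assumed ITS namespace URI
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical document whose vocabulary carries language in its own attribute.
DOC = """<doc xmlns:its="http://www.w3.org/2005/11/its">
  <its:rules>
    <its:langRule selector="//para" langPointer="@mylangattribute"/>
  </its:rules>
  <para mylangattribute="fr">Bonjour</para>
  <para xml:lang="de">Hallo</para>
  <note mylangattribute="ja">not selected by the rule</note>
</doc>"""


def resolve_languages(root):
    """Return (text, language) pairs for elements selected by a langRule."""
    result = []
    for rule in root.iter(f"{{{ITS_NS}}}langRule"):
        selector = rule.get("selector", "")
        pointer = rule.get("langPointer", "")
        # Simplification: support only '//name' selectors and '@attr' pointers.
        if not (selector.startswith("//") and pointer.startswith("@")):
            continue
        tag, attr = selector[2:], pointer[1:]
        for elem in root.iter(tag):
            # xml:lang is checked first; the pointed-to attribute is the fallback.
            lang = elem.get(XML_LANG) or elem.get(attr)
            if lang:
                result.append((elem.text, lang))
    return result


print(resolve_languages(ET.fromstring(DOC)))
```

The tool never needs to know what 'para' means in this vocabulary; the rule alone tells it where the language value lives.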


> 6.8 Elements within Text
> 
> Here again, I would expect the reader to understand the semantics of 
> the document markup.

Interesting comment: it shows we may not have done a good enough job of explaining that much of ITS is made mainly for applications
working at a 'generic' level, rather than for applications that know the specific semantics of each XML vocabulary.

For example, a generic XML spell-checker would not know the semantics associated with the elements of each document type. But it
would just need to understand ITS to know which parts of the document to check.
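
The spell-checker case can be sketched roughly as follows in Python. This is an illustration only: the sample document and the
function words_to_check are hypothetical, the withinTextRule element and withinText attribute names should be checked against the
specification, and the selector handling is reduced to the form '//name' rather than full XPath.

```python
import xml.etree.ElementTree as ET

ITS_NS = "http://www.w3.org/2005/11/its"  # assumed ITS namespace URI

# Hypothetical document: <b> is declared as an element within text.
DOC = """<doc xmlns:its="http://www.w3.org/2005/11/its">
  <its:rules>
    <its:withinTextRule withinText="yes" selector="//b"/>
  </its:rules>
  <p>An un<b>break</b>able rule</p><p>Next block</p>
</doc>"""


def words_to_check(root):
    """Return the words a generic spell-checker would see."""
    # Collect element names declared as inline (simplified '//name' selectors).
    inline = set()
    for rule in root.iter(f"{{{ITS_NS}}}withinTextRule"):
        selector = rule.get("selector", "")
        if rule.get("withinText") == "yes" and selector.startswith("//"):
            inline.add(selector[2:])

    pieces = []

    def walk(elem):
        if elem.tag.startswith(f"{{{ITS_NS}}}"):
            return  # ITS rules are not document content
        is_inline = elem.tag in inline
        if not is_inline:
            pieces.append(" ")  # other elements start a new run of text
        pieces.append(elem.text or "")
        for child in elem:
            walk(child)
            pieces.append(child.tail or "")
        if not is_inline:
            pieces.append(" ")

    walk(root)
    return "".join(pieces).split()


print(words_to_check(ET.fromstring(DOC)))
```

Because the rule marks 'b' as inline, 'unbreakable' reaches the checker as one word, while the two adjacent 'p' elements are kept apart without the tool knowing anything else about the vocabulary.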


Thanks again for your help,
-yves
Received on Monday, 22 May 2006 07:46:41 UTC