Translation Memory (TM) and text-transform from Richard Ishida on 2003-10-22 (www-international@w3.org from October to December 2003)

From: Richard Ishida <ishida@w3.org>
Date: Wed, 22 Oct 2003 13:58:51 +0100
To: <www-international@w3.org>, <www-style@w3.org>
Message-ID: <002401c3989c$40e90900$6401a8c0@w3c40upc3ma3j2>
See below a transcript of a mail exchange between myself and François
Richard (top to bottom order).  [Note that, in simple terms, Translation
Memory tools compare the source of a sentence awaiting translation with
that of other previously translated sentences.  If a source match is
found, the translation for the previously translated source is proposed
as the translation for the current sentence.  This helps with
consistency but can also considerably speed up translation, thereby
enabling competitive pricing.]



François wrote:
I have been looking around for more info on the CSS 'text-transform'
property, its purpose and usage. I have the feeling that it might make
the processing of text more complex, since it actually transforms
characters. I am thinking about Translation Memory tools in particular.
I will keep looking for more info on this topic, but I thought that CSS
was specialized for text *rendering*, formatting and layout
specification, not for text 'altering' (if this is the right word)...

================================

Richard then wrote:
I was just discussing this same 'issue' with Yves Savourel in Boulder
last Friday. The text-transform property doesn't change the characters
in the document, it just shows alternate glyphs for them.  Think of it
like an OpenType font substituting contextual forms for Arabic.  For
this reason, I don't think there should be any impact on TM tools.

I think it's important to view this as purely an *alternative
decoration* for the text in the document source, if only because you
should expect some people to view the page without the CSS styling.  I
don't think you should use text-transform to achieve the 'correct'
sequence of codes - you should change the source.

================================
François then wrote:
Ok. This is important to know when using text-transform. Otherwise, I
could see developers of web pages, for instance, never using
capitalization in titles and headers and relying on the CSS instead.
Unfortunately, this would make translation more difficult if the
translation tools used do not support CSS. But if I understood you
correctly, in such a case the source content should make use of
capitalization.


=================================
Richard's postscript:
François and Yves are expressing concerns that I'm sure will be shared
by a large number of localization folks out there.  I think it is
important to state things clearly in the CSS spec -
http://www.w3.org/TR/CSS21/text.html#propdef-text-transform should
contain a paragraph that clearly spells out that this is only 'smoke and
mirrors'.  That it should not be relied upon to 'make the text look
right', only to apply an alternative styling effect that may not be
desirable or applicable for all languages (e.g. German or Turkish).
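
To illustrate why German and Turkish are awkward cases, here is a small
Python sketch. It is an assumption-laden illustration only: it uses
Python's built-in str.upper() and str.lower(), which apply the default,
locale-independent Unicode case mappings, and it says nothing about how
any particular browser implements text-transform.

```python
# Illustration only: Python's default (locale-independent) Unicode case
# mappings, not any browser's text-transform implementation.

german = "straße"
german_upper = german.upper()        # sharp s expands to two letters
german_back = german_upper.lower()   # the round trip does not restore it

dotted = "i"          # Turkish dotted lowercase i
dotless = "\u0131"    # Turkish dotless lowercase i

# Turkish expects dotted i to uppercase to U+0130 (capital I with dot),
# but the default mapping sends both letters to plain "I".
turkish_collapsed = dotted.upper() == dotless.upper()

print(german_upper)
print(german_back == german)
print(turkish_collapsed)
```

The uppercase form is not a reversible restyling of the source text in
either language, which is one more reason to put the intended casing in
the source rather than rely on the style sheet.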

I also suspect that TM tools might work better if they used
case-independent (and even Unicode-normalised) matching - possibly
comparing case as a second-level differentiator where appropriate (like a sorting
algorithm).  (If you want to respond to this para, maybe just reply to
www-international).
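
The matching idea above could be sketched roughly as follows. This is a
minimal sketch, not a description of how any existing TM tool works: the
names match_key and lookup, and the representation of the memory as a
list of (source, translation) pairs, are hypothetical. It assumes NFC
normalisation plus Unicode case folding for the first-level key, and
uses exact case only to rank the candidates.

```python
import unicodedata


def normalise(text: str) -> str:
    """Return the NFC-normalised form of text."""
    return unicodedata.normalize("NFC", text)


def match_key(text: str) -> str:
    """First-level key: Unicode-normalised and case-folded (hypothetical helper)."""
    # Normalise again after folding, since case folding can change the
    # composition of a string.
    return normalise(normalise(text).casefold())


def lookup(memory, source):
    """Return all (source, translation) pairs whose first-level key matches.

    Candidates whose case also matches exactly are ranked first, so case
    acts as a second-level differentiator rather than a filter.
    """
    key = match_key(source)
    wanted = normalise(source)
    hits = [(src, tgt) for src, tgt in memory if match_key(src) == key]
    # False sorts before True, so exact-case matches come first; the sort
    # is stable, so other candidates keep their original order.
    hits.sort(key=lambda pair: normalise(pair[0]) != wanted)
    return hits


memory = [
    ("CONTACT US", "CONTACTEZ-NOUS"),
    ("Contact us", "Contactez-nous"),
    ("Site map", "Plan du site"),
]

results = lookup(memory, "Contact us")
for src, tgt in results:
    print(src, "->", tgt)
```

With this approach a heading stored in upper case still yields a
candidate for the same sentence in mixed case, and the translator sees
the exact-case match first when there is one.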


RI
============
Richard Ishida
W3C

contact info: http://www.w3.org/People/Ishida/ 

http://www.w3.org/International/ 
http://www.w3.org/International/geo/ 

See the W3C Internationalization FAQ page
http://www.w3.org/International/questions.html
Received on Wednesday, 22 October 2003 08:59:20 UTC