
[ESW Wiki] Update of "its0505WordCount" by TimFoster

From: <w3t-archive+esw-wiki@w3.org>
Date: Mon, 23 May 2005 09:59:27 -0000
To: w3t-archive+esw-wiki@w3.org
Message-ID: <20050523095927.12642.82870@localhost.localdomain>
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "ESW Wiki" for change notification.

The following page has been changed by TimFoster:
http://esw.w3.org/topic/its0505WordCount


The comment on the change is:
Sample of translatable but not word-countable text

------------------------------------------------------------------------------
  '''[[YS-''' I'm not sure about this. Wouldn't the word-count requirement, from the point of view of ITS, be about how to indicate which parts of the document should be counted and which should not? As in the original discussion here [http://lists.w3.org/Archives/Public/public-i18n-its/2005JanMar/0003.html].
  
  (And, by the way, in that respect, do we have cases where 'is translatable' and 'should be counted' are not the same?)
+ 
+ '''[[TF Yes. From experience with our XLIFF filter implementation :-), there are cases where you can recognise that content may be translatable but have no idea how to word-count it. Our case was program listings where we don't know which programming language is being demonstrated and the code example isn't internationalised, e.g. the following fragment of DocBook: '''
+ 
+ {{{
+ <para>This is a section of java code :
+ <programlisting>
+   // this string is never used
+   String sqlConnect = "connect / as sysdba";
+   String sqlSelect = "select name from mytable where name=\"Tim\"";
+   System.out.println("But of course, you should translate this string !");
+ </programlisting>
+ </para>
+ <para>
+ Note that the section above will probably cause confusion to word-count-algorithms.</para>
+ }}}
+ ''' This is a tough one to crack: in our XLIFF implementation, we just had to admit that there may be translatable text in the programlisting, but we don't know how to word-count it. end TF]]'''
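
To make the distinction concrete, here is a minimal sketch of one way a filter might separate "translatable and countable" text from "translatable but not countable" text. It is not the XLIFF filter described above: the `UNCOUNTABLE` set, the `classify` function, the `<section>` wrapper, and the whitespace-based word count are all assumptions made for illustration, and the sample is a shortened, well-formed variant of the DocBook fragment.

```python
import xml.etree.ElementTree as ET

# Shortened, well-formed variant of the DocBook fragment (assumed sample).
SAMPLE = """<section>
<para>This is a section of java code:
<programlisting>
  String sqlConnect = "connect / as sysdba";
  System.out.println("But of course, you should translate this string !");
</programlisting>
</para>
<para>Note that the listing above may confuse word counters.</para>
</section>"""

# Hypothetical policy: text inside these elements may be translatable,
# but we do not attempt to word-count it.
UNCOUNTABLE = {"programlisting"}


def classify(xml_text):
    """Return (word_count, uncounted).

    word_count is a naive whitespace-token count of ordinary text;
    uncounted collects text that is passed on for translation but
    left out of the count.
    """
    root = ET.fromstring(xml_text)
    count = 0
    uncounted = []

    def record(text, skip):
        nonlocal count
        if not text or not text.strip():
            return
        if skip:
            uncounted.append(text.strip())
        else:
            count += len(text.split())

    def walk(elem, in_uncountable):
        skip = in_uncountable or elem.tag in UNCOUNTABLE
        record(elem.text, skip)
        for child in elem:
            walk(child, skip)
            # Tail text follows the child but belongs to this element.
            record(child.tail, skip)

    walk(root, False)
    return count, uncounted


if __name__ == "__main__":
    words, skipped = classify(SAMPLE)
    print("counted words:", words)
    print("translatable but uncounted segments:", len(skipped))
```

The sketch only shows that the two properties can be tracked independently: the listing is still handed over as translatable content, it simply does not contribute to the count.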
  
  I realize the importance of a common way to calculate the counts, but something like GMX seems, maybe, out of scope, as it's not something to do with making documents, in general, easier to localize.
  
Received on Monday, 23 May 2005 10:24:54 GMT
