[ESW Wiki] Update of "its0503ReqSpan" by TimFoster

From: <w3t-archive+esw-wiki@w3.org>
Date: Wed, 11 May 2005 10:50:29 -0000
To: w3t-archive+esw-wiki@w3.org
Message-ID: <20050511105029.32607.83011@localhost.localdomain>
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "ESW Wiki" for change notification.

The following page has been changed by TimFoster:
http://esw.w3.org/topic/its0503ReqSpan


------------------------------------------------------------------------------
  
  == Background: ==
  
- This allows localisation tools to determine their behaviour on certain sections of text. This could be for sections of text that need to be translated by a domain-expert (as with source code fragments) or need special terminology in order to be properly translated. In particular, a span-like element can be useful to help translation tools determine where to apply sentence-breaks and also to assist word-counting algorithms. Other uses are foreseen, within the scope of the ITS.
+ This allows localisation tools to determine their behaviour on certain sections of text. This could be for sections of text that need to be translated by a domain-expert (as with source code fragments) or need special terminology in order to be properly translated. In particular, a span-like element can be useful to help translation tools determine where to apply sentence-breaks and also to assist word-counting algorithms. '''[TF] added text''' A span-like element is also extremely useful for marking language information in source files, which translation tools can use to decide which translation process to apply to each section of text (e.g. a Latin quotation in a section of English text is often intended to be left in Latin in the translated version of the English text). '''[TF] end added text''' Other uses are foreseen, within the scope of the ITS.
  
  '''[MD] This omits a very important use of the <span> element, and the main reason it was added to HTML originally: language information.
  Language information is important both for internationalization (e.g. different styling according to language) as well as localization (text needs to go to different translator, or not translated, or otherwise treated differently).'''
+ 
+ '''[TF] Good point, I've added that text above'''
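
As a rough illustration of the language-marking use discussed above, the following Python sketch shows how a tool might split a paragraph into runs by inherited xml:lang and keep only the runs in the source language for translation. The sample sentence, the function names and the policy of leaving foreign-language runs untranslated are all assumptions made for this example, not something the requirements page specifies.

```python
import xml.etree.ElementTree as ET

# ElementTree expands the predefined xml: prefix to this namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical sample: English text containing a Latin quotation.
SAMPLE = (
    '<p xml:lang="en">As the saying goes, '
    '<span xml:lang="la">carpe diem</span>'
    ', so start today.</p>'
)


def runs_by_language(xml_text, default_lang=None):
    """Return (lang, text) pairs in document order, inheriting xml:lang."""
    root = ET.fromstring(xml_text)
    runs = []

    def walk(elem, inherited):
        # An element without its own xml:lang inherits the parent's value.
        lang = elem.get(XML_LANG, inherited)
        if elem.text:
            runs.append((lang, elem.text))
        for child in elem:
            walk(child, lang)
            # Tail text follows the child but belongs to the current element.
            if child.tail:
                runs.append((lang, child.tail))

    walk(root, default_lang)
    return runs


def to_translate(xml_text, source_lang):
    """Assumed policy: only runs in the source language are translated."""
    return [text for lang, text in runs_by_language(xml_text)
            if lang == source_lang]
```

Calling to_translate(SAMPLE, "en") returns the two English runs and leaves the Latin quotation out; a real tool would more likely route each language to a different process than simply drop it.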
  
  
  One example would be the following sentence, which contains some source code that we would like to treat specially during translation :
@@ -37, +39 @@

  Here, we would like to put a spanning element around the source code fragment to indicate that it is not standard text for translation and should be translated by someone familiar with the Java programming language. Also, translation tools should treat the exclamation points in the sample text carefully with respect to sentence-segmentation if they perform that function.
  
  '''[RI] Hmm. This is not such a good example in my mind, since it seems to suggest that it's ok not to put System.out.println("Hello World!"); in an element such as <code>.  On the contrary, I think we should have a guideline and expectation that people will have this marked up so that a span element is not necessary.  Same goes for the output.'''
+ 
+ '''[TF] Okay, I probably didn't choose the best example here. What I'd really like to see is some way of marking purely the translatable text in the sentence, allowing the author to clearly delimit the parts of the code tag that are translatable vs. non-translatable. Right now, all <code> says is that there's source code present; it's up to tools to work out (a) what type of code it is, and (b) which parts of the code are translatable. Perhaps something like this would be a better example:'''
+ 
+ 'The statement in the Java programming language <code><its:donttranslate>System.out.println("</its:donttranslate>Hello World!<its:donttranslate>");</its:donttranslate></code> prints the text "Hello World!" to standard output.'
+ 
+ '''[TF] The point is, I'm suggesting we shift some of the responsibility for identifying translatable vs. non-translatable content off the translation tool author, or at the very least make recommendations to content authors to separate the translatable and non-translatable portions of text more clearly (e.g. leave the entire contents of <code> as non-translatable and use an entity reference or some other means to pull in the translatable text from elsewhere: <code>System.out.println("&java.code.example.text;");</code>). But I'm going into implementation details here, and was trying to avoid that for this requirements document :-)'''
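
To make the author-side marking idea concrete, here is a minimal Python sketch of how a tool could separate translatable from protected text once the markers are in place. It assumes a well-formed variant of the sample with ordinary closing tags, and the namespace URI and function name are hypothetical; the donttranslate element is only the name floated in this discussion, not an agreed ITS construct.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URI, chosen only so the its: prefix can be parsed.
ITS_NS = "http://example.org/its"

# Sample modelled on the sentence above, with well-formed closing tags.
SAMPLE = (
    f'<code xmlns:its="{ITS_NS}">'
    '<its:donttranslate>System.out.println("</its:donttranslate>'
    'Hello World!'
    '<its:donttranslate>");</its:donttranslate>'
    '</code>'
)


def split_segments(xml_text):
    """Return (translatable, text) pairs in document order."""
    root = ET.fromstring(xml_text)
    protected = f"{{{ITS_NS}}}donttranslate"
    segments = []

    def walk(elem, translatable):
        # Everything inside a donttranslate element is protected.
        translatable = translatable and elem.tag != protected
        if elem.text:
            segments.append((translatable, elem.text))
        for child in elem:
            walk(child, translatable)
            # Tail text sits outside the child, so it takes this element's status.
            if child.tail:
                segments.append((translatable, child.tail))

    walk(root, True)
    return segments
```

With this marking the tool no longer needs to know that the content is Java: split_segments(SAMPLE) yields the two code fragments as protected and the string "Hello World!" as the only translatable segment.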
+ 
+ 
  
  This next section of text shows a filename that should also not be translated :
  
@@ -64, +74 @@

  
  I'm using the term 'target schema' here to talk about the schema that we are trying to internationalize/localize.'''
  
+ '''[TF] I guess the question is, to what level are the semantics in the target schema sufficient for the purpose of translation? For HTML, <code> clearly indicates that the contents are source code, but doesn't say which parts of that source code are translatable, nor what programming language it contains (and of course, not every string literal in a given programming language is translatable either, hence the need for a span-like element, I think).'''
+ 
Received on Wednesday, 11 May 2005 10:56:27 GMT
