RE: Parallel texts

Hello Tomas,

Many thanks for your mail.


> 	A few definitions:

Thanks for copying these in. I remember them very well.


> 	Parallel texts without any marking are still parallel texts.  But
> parallel texts with marking can be *easily* aligned (without the use of
> more costly techniques such as linguistics, statistics, etc.) and the
> result is *certain* (i.e., this sentence is the translation of that one
> because the marking around them says so).
> 
> > What is the additional benefit of such a specification? What would it
> > look like?
> 	[Carrasco Benitez Manuel]  
> 	Without a specification there would be many different marking
> systems (all the same from a functional point of view).  With a
> specification, there is a better chance that there would be compatible
> products, that the mainstream browsers would include aligning
> capabilities, etc.

What is needed in addition to <DIV>, <SPAN>, and ID in HTML?
If anything more is needed, what should it look like?
What would prevent a small browser company from producing a browser
with alignment capabilities? Why does nobody seem to be doing it?
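
To make the question concrete, here is a rough sketch of how far
<DIV>, <SPAN>, and ID alone might get us. It assumes (my assumption,
not an existing convention) that a text and its translation reuse the
same ID values on corresponding elements. The names SegmentCollector,
collect, and align are made up for this illustration.

```python
from html.parser import HTMLParser


class SegmentCollector(HTMLParser):
    """Collect the text of every <div>/<span> element that carries an id."""

    def __init__(self):
        super().__init__()
        self.segments = {}  # id -> accumulated text
        self._open = []     # one entry per open div/span: its id, or None

    def handle_starttag(self, tag, attrs):
        if tag in ("div", "span"):
            seg_id = dict(attrs).get("id")
            self._open.append(seg_id)
            if seg_id is not None:
                self.segments[seg_id] = ""

    def handle_endtag(self, tag):
        if tag in ("div", "span") and self._open:
            self._open.pop()

    def handle_data(self, data):
        # Credit the text to every enclosing element that has an id, so a
        # paragraph-level <div> also receives the text of its sentences.
        for seg_id in self._open:
            if seg_id is not None:
                self.segments[seg_id] += data


def collect(html):
    """Return {id: whitespace-normalized text} for one document."""
    parser = SegmentCollector()
    parser.feed(html)
    parser.close()
    return {k: " ".join(v.split()) for k, v in parser.segments.items()}


def align(source_html, target_html):
    """Pair up segments whose id occurs in both documents."""
    source = collect(source_html)
    target = collect(target_html)
    return [(i, source[i], target[i]) for i in source if i in target]


english = ('<div id="p1"><span id="s1">Good morning.</span> '
           '<span id="s2">How are you?</span></div>')
spanish = ('<div id="p1"><span id="s1">Buenos días.</span> '
           '<span id="s2">¿Cómo está?</span></div>')

pairs = align(english, spanish)
for seg_id, src, tgt in pairs:
    print(seg_id, "|", src, "|", tgt)
```

Under that assumption the alignment is a plain lookup by ID, with no
linguistic or statistical processing. The sketch ignores malformed
nesting and repeated IDs, and anything beyond one-to-one correspondence
(e.g. one sentence translated as two) would need something more.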


> 	From here, other Language Engineering techniques could follow;
> e.g., automatic harvesting of Linguistic Objects (terms, sentences, etc.).

That's very interesting. It could help search engines.


> 	The bottom line is to reduce the cost of translation
> (translation is very expensive) and to increase the quality.

The problem is that in many cases, the original authors don't
know that the document will be translated (the EU in this respect
may be an exception). So the original authors don't think about
how to make translations easier.

I'm seeing this almost weekly now in a very small example: the
W3C press releases. The last few (XML REC, MathML PR, CSS2 PR)
were translated here into Japanese, and they are served
language-negotiated, as they should be. The main problems are:
- That we don't get final enough versions early enough (this is getting
  better)
- That there are too many "buzzwords", and they are extremely
  difficult to translate if you don't know all the technical content.
- Also, that "buzzwords" may be okay for the US press, but not for the
  Japanese press, for various reasons.
- That for the Japanese, different things may be of interest.

So, e.g., in the case of XML, we just did a rewrite. The document as a
whole is parallel, but below that level there is not much chance of
alignment.


> > I think it would be very good if you could take a very small example
> > text, translate it to another language or two, add the markers that
> > you think you need, and make a web site out of it that you can give
> > people to have a look at.
> 	[Carrasco Benitez Manuel]  
> 	Excellent idea.  I could take a small section in two languages
> (e.g., English and Spanish) and mark it up. Comments?

Yes, please.

Regards,   Martin.

Received on Friday, 27 March 1998 19:28:42 UTC