- From: Carrasco Benitez Manuel <manuel.carrasco@emea.eudra.org>
- Date: Fri, 27 Mar 1998 10:53:40 -0000
- To: "'Martin J. Duerst'" <duerst@w3.org>
- Cc: "'w3c-translators@w3.org'" <w3c-translators@w3.org>
> > This is what I have in mind: nothing new from an HTML point of
> > view, just using the existing mechanism in a certain way (it would more
> > than the ID, but not much more). In other words, every document marked
> > for parallel text must conform to HTML. One needs a specification for
> > parallel texts.
>
> Hello Tomas,
>
> I still don't understand you. Two lines are parallel not by the fact that
> they are specified to be parallel, but just by the fact that they are
> going in the same direction. In the same way, two texts can be parallel
> even if there is no specification at all.

[Carrasco Benitez Manuel]
A few definitions:

"Parallel Texts
   Texts that are translations of each other. For example, the Treaty
   of Rome in English and Spanish are Parallel Texts. Parallel Texts
   can be aligned at several levels.

Alignedness
   A quality of Parallel Texts; for example, the Treaty of Rome in
   English and Spanish are Parallel Texts and they should be aligned.
   The interesting part is aligning Parallel Texts automatically.

Level of Alignedness
   A metric of alignedness. Depending on the depth at which the
   Linguistic Objects can be identified, the texts are aligned at:

   Document level:  the trivial case; i.e., Parallel Texts.
   Paragraph level: not too hard to achieve.
   Sentence level:  desirable and possible to achieve.
   Term level:      needs tagging for automatic alignment.
   Word level:      needs tagging for automatic alignment.

   In this context, a sentence is a part of a text delimited by a dot,
   semicolon or similar; i.e., it has little grammatical meaning and
   the main interest is to identify Linguistic Objects."

Parallel texts without any marking are still parallel texts. But parallel
texts with marking can be *easily* aligned (without the use of more costly
techniques such as linguistic or statistical methods), and the result is
*certain* (i.e., this sentence is the translation of that one because the
marking around them says so).
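To illustrate why marked texts align easily, here is a minimal sketch in Python. It rests on an assumption that is not part of any agreed specification: each sentence is wrapped in a <span> element, and a sentence and its translation carry the same id value. The names marked_sentences and align are hypothetical, and nested spans are not handled.

```python
from html.parser import HTMLParser


class MarkedSentenceParser(HTMLParser):
    """Collect the text of every <span id="..."> element, keyed by its id.

    Assumed convention (hypothetical): one sentence per span, no nesting.
    """

    def __init__(self):
        super().__init__()
        self.sentences = {}   # id -> sentence text, in document order
        self._current = None  # id of the span currently being read

    def handle_starttag(self, tag, attrs):
        span_id = dict(attrs).get("id")
        if tag == "span" and span_id:
            self._current = span_id
            self.sentences[span_id] = ""

    def handle_data(self, data):
        if self._current is not None:
            self.sentences[self._current] += data

    def handle_endtag(self, tag):
        if tag == "span":
            self._current = None


def marked_sentences(html):
    """Return {id: sentence} for one marked-up document."""
    parser = MarkedSentenceParser()
    parser.feed(html)
    parser.close()
    # Collapse runs of whitespace inside each sentence.
    return {sid: " ".join(text.split()) for sid, text in parser.sentences.items()}


def align(source_html, target_html):
    """Pair sentences that share an id; no linguistic or statistical step."""
    source = marked_sentences(source_html)
    target = marked_sentences(target_html)
    return [(sid, source[sid], target[sid]) for sid in source if sid in target]


english = '<p><span id="s1">Hello.</span> <span id="s2">Thank you.</span></p>'
spanish = '<p><span id="s1">Hola.</span> <span id="s2">Gracias.</span></p>'

for sid, en, es in align(english, spanish):
    print(sid, "|", en, "|", es)
```

The pairing is a plain lookup by id, which is why the result is certain rather than estimated: a sentence with no counterpart under the same id is simply left out.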
> What is the additional benefit of such a specification? How would it
> look like?

[Carrasco Benitez Manuel]
Without a specification there would be many different marking systems
(all the same from a functional point of view). With a specification,
there is a better chance that there would be compatible products, that
the mainstream browsers would include aligning capabilities, etc. From
here, other Language Engineering techniques could follow; e.g., automatic
harvesting of Linguistic Objects (terms, sentences, etc.). The bottom line
is to reduce the cost of translation (translation is very expensive) and
to increase the quality.

> I think it would be very good if you could take a very small example
> text, translate it to another language or two, add the markers that
> you think you need, and make a web site out of it that you can give
> people to have a look at.

[Carrasco Benitez Manuel]
Excellent idea. I could take a small section in two languages (e.g.,
English and Spanish) and mark it up.

Comments?

Regards
Tomas
Received on Friday, 27 March 1998 05:56:21 UTC