Re: Sync languages - HTML

> From: murray@spyglass.com (Murray Altheim)
> Subject: Re: Sync languages - HTML
> 
> Gary Adams - Sun Microsystems Labs BOS <gra@zeppo.East.Sun.COM> writes:
> >M. T. Carrasco Benitez writes:
> > > Which technique should be used to help synchronize multilingual parallel
> > > texts marked up in HTML?
> [...]
> >If possible the solution should take into account the complete
> >"application" of the parallel text. That could mean more than
> >just a single markup construct, but a combination of entities that
> >collectively provide the desired effect. It would also be good if the
> >HTML solution was forward compatible with XML for quicker adoption in
> >the more general SGML application.
> >
> >e.g. The HEAD LINK element may be used to identify the language
> >variants of a particular document. The CLASS attribute on an Anchor or
> >Paragraph could identify an alignment point with a unique ID to mark
> >the common structural elements.
> >
> > <P ID=p1 CLASS=alignment          <P ID=p1 CLASS=alignment
> >    LANG=en_US> ...                    LANG=fr_CA> ...

This example was intended to show two separate documents that share the same ID,
which would be the more typical arrangement for parallel text. The case where
parallel text occurs in the same document might require some other form
of alignment; e.g., my Greek-English New Testament uses left-page and right-page
separation. For online interlinear text, one might use TABLEs to produce
aligned layouts, e.g. <TABLE><TR> <TD LANG=en> ... <TD LANG=fr> ...</TR>
In this case the explicit alignment would not require further labeling,
such as the CLASS and ID solution proposed for cross-document alignment
points. (I can imagine a browser that could selectively "fold" tables
to provide selectable columns based on a variety of CLASS-controlled
attributes, similar to hidden cells in a spreadsheet.)
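
To make the cross-document case concrete, here is a rough sketch of how a tool
might pair up alignment points from two separate language variants by their
shared ID. It is only an illustration under assumptions of mine: the names
AlignmentCollector, alignment_points and align are hypothetical, and it uses
Python's standard html.parser rather than a real SGML parser.

```python
from html.parser import HTMLParser


class AlignmentCollector(HTMLParser):
    """Collect the text of each <P CLASS=alignment> element, keyed by its ID."""

    def __init__(self):
        super().__init__()
        self.points = {}      # ID -> {"lang": ..., "text": ...}
        self._current = None  # ID of the alignment paragraph being read

    def handle_starttag(self, tag, attrs):
        if tag != "p":
            return
        # HTMLParser lowercases tag and attribute names for us.
        attributes = dict(attrs)
        if attributes.get("class") == "alignment" and "id" in attributes:
            self._current = attributes["id"]
            self.points[self._current] = {
                "lang": attributes.get("lang"),
                "text": "",
            }
        else:
            # A new paragraph implicitly closes the previous one.
            self._current = None

    def handle_endtag(self, tag):
        if tag == "p":
            self._current = None

    def handle_data(self, data):
        if self._current is not None:
            self.points[self._current]["text"] += data


def alignment_points(html):
    """Return {ID: {"lang": ..., "text": ...}} for one document."""
    collector = AlignmentCollector()
    collector.feed(html)
    collector.close()
    return {
        pid: {"lang": info["lang"], "text": info["text"].strip()}
        for pid, info in collector.points.items()
    }


def align(doc_a, doc_b):
    """Pair paragraphs from two documents that carry the same ID."""
    a = alignment_points(doc_a)
    b = alignment_points(doc_b)
    return [(pid, a[pid]["text"], b[pid]["text"]) for pid in a if pid in b]
```

Paragraphs whose ID appears in only one of the two documents are simply left
out of the result; a real tool would probably want to report them instead.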

> 
> Gary,
> 
> Unfortunately, this solution is invalid markup, as two IDs in the same
> document instance cannot be identical. But it's certainly possible to use a
> system where the two IDs are numerically the same but have a different
> alpha, such as
> 
>   <P ID=en.1 CLASS=alignment          <P ID=fr.1 CLASS=alignment
>      LANG=en_US> ...                    LANG=fr_CA> ...

I like this idea of including hierarchical information in the ID, but the
use of the LANG info in the ID could lead to some confusion. In a single
multilingual parallel document it may be necessary to explicitly declare
equivalent sections with some form of Anchor. In other words, the semantic
importance of the alignment is not the matching language, but the matching
content rendered in a different language.

  <A NAME=appendixa>
  <P ID=1 CLASS=alignment
     LANG=en_US> ...

   ...
   <A HREF=#appendixa>
   <P ID=12345 CLASS=alignment
      LANG=fr_CA> ...

> 
> BTW, either hyphens or periods are OK in ID, but not underscores.
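
The two constraints Murray raises in this thread (IDs must be unique within one
document instance; hyphens and periods are fine in an ID, underscores are not)
could be checked mechanically. The following is a minimal sketch, not a
validator: check_ids and IdCollector are hypothetical names, and the pattern
only encodes the characters mentioned here (letters, digits, hyphens, periods),
not the full SGML rules for ID values.

```python
import re
from collections import Counter
from html.parser import HTMLParser

# Assumption: only the characters discussed in this thread are allowed.
ALLOWED_ID = re.compile(r"[A-Za-z0-9.-]+")


class IdCollector(HTMLParser):
    """Record every ID attribute value seen in a document, in order."""

    def __init__(self):
        super().__init__()
        self.ids = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == "id" and value is not None:
                self.ids.append(value)


def check_ids(html):
    """Return (duplicate IDs, IDs containing disallowed characters)."""
    collector = IdCollector()
    collector.feed(html)
    collector.close()
    counts = Counter(collector.ids)
    duplicates = sorted(i for i, n in counts.items() if n > 1)
    bad_chars = sorted(i for i in counts if not ALLOWED_ID.fullmatch(i))
    return duplicates, bad_chars
```

Run against a single document holding both language variants, this would flag
the repeated ID=p1 from my first example, while the en.1 / fr.1 scheme passes.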
> 
> Murray
> 
> ```````````````````````````````````````````````````````````````````````````````
>     Murray Altheim, Program Manager
>     Spyglass, Inc., Cambridge, Massachusetts
>     email: <mailto:murray@spyglass.com>
>     http:  <http://www.cm.spyglass.com/murray/murray.html>
>            "Give a monkey the tools and he'll eventually build a typewriter."

Received on Tuesday, 14 January 1997 11:48:21 UTC