
Re: LANG as an attribute

From: lilley <lilley@afs.mcc.ac.uk>
Date: Tue, 1 Aug 1995 11:22:24 +0100 (BST)
Message-Id: <10240.9508011022@afs.mcc.ac.uk>
To: Jon_Bosak@novell.com
Cc: html-wg@oclc.org, www-style@www10.w3.org
Jon Bosak said:
> [Tom Neff:]
> >  2. A browser for Hebrew or Chinese might well attempt, if it knows the
> >     language is appropriate, to display text right-to-left or
> >     top-to-bottom in future.  Are there presentation issues or risks
> >     inherent in allowing a character-level switch from, say, Hebrew to
> >     Spanish and back?

> You bet.  It's an implementation nightmare.  It's not impossible, but
> people who have studied this tell me that it's not nearly as simple as
> it looks at first glance.

It certainly does appear doable at first glance.

> This subject received a great deal of discussion (most of it before I
> became involved) in the DSSSL working group.  I don't know all the
> details, but I do know that no one is expecting bidirectional
> capabilities in DSSSL-Lite.

Oh, great. Well, I hope that CSS manages it then, because as Jon says:

> This happens fairly often in scholarly publications devoted to Near
> Eastern studies and probably to a certain extent in Hebrew and Arabic
> works that quote European languages.

Quite. So it is a common requirement and DSSSL-Lite does not intend to 
address it. Anyone else see this as a problem?

> Note that we're talking specifically about embedding a fragment of a
> right-to-left language like Hebrew or Arabic in the middle of a
> passage written in a left-to-right language like English or Spanish.

Yes. Understood.

Now, let us presuppose routines already exist to lay out a line of type 
left to right and to put fully rendered lines onto the output device.

Let us arbitrarily represent, in this example, Hebrew letters as some entity 
set, as much for convenience in email as anything else.

Let us also suppose that we have a full, scalable Unicode font at our 
disposal - because the issue was setting bidi type, not handling missing 

The input text is being streamed in and we cannot look ahead. Actually 
we could probably look ahead with a small amount of buffering, but let's 
try without first.

The line under construction so far is:

left^--------------------------------^right margins
    is expressed in Hebrew a
                            ^========== insert point

Next letters are s and space

left^--------------------------------^right margins
    is expressed in Hebrew as 
                              ^========== insert point
Next letter is &beth; and the LANG has switched to Hebrew. We continue to 
insert characters from left to right, but do not move the insertion point.
I will represent the Hebrew letters here as punctuation !@#$% to illustrate 
where the letters are placed.

left^--------------------------------^right margins
    is expressed in Hebrew as $
                              ^========== insert point
Does the total line length exceed the margin? No, so we add another one:

left^--------------------------------^right margins
    is expressed in Hebrew as #$
                              ^========== insert point

And so on

left^--------------------------------^right margins
    is expressed in Hebrew as }! &%$#
                              ^========== insert point

Ah. Adding the next letter would take us over the margin. So we save 
the word we are currently inserting: }!

left^--------------------------------^right margins
    is expressed in Hebrew as    &%$#
                              ^========== insert point
Display the line, clear the line under construction, add the saved word 
and continue:

left^--------------------------------^right margins
    }!
    ^========== insert point
After a bit we switch to English again:

left^--------------------------------^right margins
    @*+{!
    ^========== insert point
So the insertion point moves to the end of the current string

left^--------------------------------^right margins
    @*+{!
         ^========== insert point
and off we go.

left^--------------------------------^right margins
    @*+{! or in other words 
                           ^========== insert point
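
The walk-through above can be sketched in a few lines of Python. This is 
only an illustration under stated assumptions, not a finished bidi 
algorithm: every character occupies one cell, each character arrives 
tagged with a direction flag standing in for LANG, and a direction 
switch is treated as a word boundary. The names layout and tag are 
hypothetical. Unlike the diagrams, the sketch closes the gap left 
behind when a saved word is removed from the line.

```python
def layout(stream, width):
    """Lay out (char, is_rtl) pairs into visual lines of at most `width` cells."""
    lines = []
    line = []        # characters of the line under construction, visual order
    point = 0        # insertion point (index into `line`)
    word_len = 0     # length of the word currently being inserted
    prev_rtl = False

    for ch, rtl in stream:
        if rtl != prev_rtl:
            word_len = 0          # assumption: a direction switch ends the word
            prev_rtl = rtl
        if not rtl:
            point = len(line)     # left-to-right text goes at the end of the string
        if len(line) >= width:
            # The next character would cross the margin.
            if ch == " " or word_len >= len(line):
                saved = []        # nothing to carry over (or the word is too long)
            else:
                start = point if rtl else point - word_len
                saved = line[start:start + word_len]   # the word being inserted
                del line[start:start + word_len]
                if rtl and start < len(line) and line[start] == " ":
                    del line[start]   # drop the space that separated the saved word
            lines.append("".join(line).rstrip())       # display the finished line
            line = saved
            point = 0 if rtl else len(line)
            if ch == " ":
                word_len = 0
                continue          # a space never starts a line
            word_len = len(saved)
        line.insert(point, ch)    # right-to-left text leaves `point` where it is
        if not rtl:
            point += 1
        word_len = 0 if ch == " " else word_len + 1

    if line:
        lines.append("".join(line).rstrip())
    return lines


def tag(text, rtl=False):
    """Hypothetical helper: mark every character of `text` with a direction."""
    return [(c, rtl) for c in text]


# Punctuation stands in for Hebrew letters, as in the diagrams above.
stream = tag("as ") + tag("$#%& !}+*@", rtl=True) + tag(" or so")
for visual_line in layout(stream, width=10):
    print(visual_line)
```

With a margin of 10 cells this prints "as &%#$", then "@*+}! or", then 
"so": the right-to-left word in progress is carried to the next line, 
and the insertion point jumps to the end of the string once the text 
switches back to English.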

Now with a little tidying up - a one character lookahead to stop 
spaces being the first character of a line and to allow a one character 
rollback if the next character was a space and the last character was 
one of the five letters that have final forms - that seems like a 
reasonable starting point for laying out bidi text.
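
The one character rollback for final forms might look something like 
the following. It is a sketch that assumes the input is already Unicode 
text, that only a space ends a word (punctuation is ignored), and that 
the end of the input also ends a word. FINAL_FORMS and 
apply_final_forms are hypothetical names.

```python
# Regular form -> final form for the five Hebrew letters that have one.
FINAL_FORMS = {
    "\u05DB": "\u05DA",  # kaf   -> final kaf
    "\u05DE": "\u05DD",  # mem   -> final mem
    "\u05E0": "\u05DF",  # nun   -> final nun
    "\u05E4": "\u05E3",  # pe    -> final pe
    "\u05E6": "\u05E5",  # tsadi -> final tsadi
}


def apply_final_forms(chars):
    """Yield characters in logical order, one step behind the input, so the
    previous letter can still be swapped for its final form at a word end."""
    prev = None
    for ch in chars:
        if prev is not None:
            # One-character rollback: the held letter is only emitted once
            # we know whether a space follows it.
            yield FINAL_FORMS.get(prev, prev) if ch == " " else prev
        prev = ch
    if prev is not None:
        # Assumption: the end of the input also ends a word.
        yield FINAL_FORMS.get(prev, prev)


# shin, lamed, vav, mem, space, mem, he (logical order)
text = "\u05E9\u05DC\u05D5\u05DE \u05DE\u05D4"
fixed = "".join(apply_final_forms(text))
```

Here the mem that ends the first word becomes a final mem, while the 
mem that starts the second word is left alone. The output of this 
filter would then be fed to the line layout in logical order.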

Chris Lilley, Technical Author
|       Manchester and North HPC Training & Education Centre        |
| Computer Graphics Unit,             Email: Chris.Lilley@mcc.ac.uk |
| Manchester Computing Centre,        Voice: +44 161 275 6045       |
| Oxford Road, Manchester, UK.          Fax: +44 161 275 6040       |
| M13 9PL                            BioMOO: ChrisL                 |
|     URI: http://info.mcc.ac.uk/CGU/staff/lilley/lilley.html       | 
Received on Tuesday, 1 August 1995 06:22:37 GMT
