adding &rle;,... for bidi (was: Re: Joint meeting at TPAC from HTML and i18n core WG minutes 2007-11-09)

Dear I18N WG, HTML WG,

This mail comments on the proposals to add named character
entities for bidi support.

At 02:06 07/11/10, Felix Sasaki wrote:
>
>... are at http://www.w3.org/2007/11/09-i18n-minutes.html and below as text.

>   Ishida: Can you use bidi in filenames?
>
>   Hixie: probably, but I'm not going to recommend it
>
>   Ishida: We might need to start thinking about how to convert text
>   from markup to strings with bidi control characters.

You don't need to think about this. HTML4 defines bidi markup
by saying how it maps to bidi control characters, so spec-wise
everything should already be done. Implementations have also
been doing this for a long time.
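
As a rough illustration of that mapping (a sketch only: the helper
name markup_to_controls and its arguments are made up for this mail,
and the equivalences, dir on an inline element as an embedding and
BDO as an override, are my reading of the HTML4 bidi section):

```python
# Unicode bidi formatting characters.
LRE, RLE, PDF = "\u202A", "\u202B", "\u202C"
LRO, RLO = "\u202D", "\u202E"


def markup_to_controls(element: str, direction: str, text: str) -> str:
    """Hypothetical helper: wrap inline text in the control characters
    that correspond to dir on an inline element (embedding) or to
    BDO (override)."""
    direction = direction.lower()
    if direction not in ("ltr", "rtl"):
        raise ValueError("dir must be 'ltr' or 'rtl'")
    if element.lower() == "bdo":
        # <bdo dir=...> overrides the bidi algorithm for its content.
        opener = RLO if direction == "rtl" else LRO
    else:
        # dir on any other inline element opens an embedding level.
        opener = RLE if direction == "rtl" else LRE
    # Closing the element corresponds to PDF in both cases.
    return opener + text + PDF
```

So <span dir="rtl">abc</span> would come out as RLE, "abc", PDF,
and <bdo dir="rtl">abc</bdo> as RLO, "abc", PDF.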

[more further below]

>   <anne> (I think HTML 5 should get &rlo;, &lro;, and &pdf; (or
>   something in that direction) for BiDi. These are already in IE.)

Do you mean just &rlo; or also &rle;? The former is an override, the
latter is an embedding. The former is used very locally, the latter
often over a wider range.
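
For reference, the characters in question can be listed as follows.
This is only a sketch: the entity names in the PROPOSED table are the
ones floated in the minutes (plus &lre; by analogy), not names that
any spec defines today, and describe() is a made-up helper.

```python
import unicodedata

# Assumed entity names (proposals, not defined entities) and the
# Unicode bidi formatting characters they would stand for.
PROPOSED = {
    "lre": "\u202A",
    "rle": "\u202B",
    "pdf": "\u202C",
    "lro": "\u202D",
    "rlo": "\u202E",
}


def describe(name: str) -> str:
    """Return the proposed entity, its code point, and its Unicode name."""
    ch = PROPOSED[name]
    return f"&{name}; U+{ord(ch):04X} {unicodedata.name(ch)}"


for entity in PROPOSED:
    print(describe(entity))
```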

>   Hixie: We did consider having a DOM attribute that would pull out
>   e.g. bidi control characters from the markup and alt text from
>   images
>   ... not sure where that's going
>   ... I would recommend finding solutions for plaintext, since that
>   will work for both
>
>   Discussion of that
>
>   language tags are in Unicode, but were deprecated as soon as they
>   were added: they were added as deprecated and should never be used
>
>   <anne> (event though the characters they map to are apparently
>   deprecated)
>
>   discussion of markup-plaintext thing
>
>   <apppp> reference RFC 3066 should point to BCP 47
>
>   Addison notes that the i18n group needs to review the date parsing
>   things
>
>   <najib> +1 for to add &rle, ..., &pdf; in HTML

Please note that if you are going to add these, you'd better
deal with the interactions between them and markup. These interactions
are one of the main reasons we very strongly recommended against
using these characters (which you can use directly or as NCRs even now)
in HTML4. The other is the observation that, most often, the logical
bidi structure coincides with other logical structure for which markup
either already exists or can easily be added.

Another observation is that entities and numeric character references
easily get lost (i.e. absorbed into the text stream as plain characters),
whereas markup will stay markup. Being able to look at the source and
tweak these things by hand if necessary is very helpful, but very
difficult if the characters aren't visible. Also, because they are
invisible, they disturb the display when you look at the source, which
is usually not desirable (viewing bidi source is a problem by itself,
and making it more difficult is definitely not needed).
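
A small sketch of how the references get absorbed, using Python's
standard html.unescape as a stand-in for whatever tool touches the
document. The sample string and the show_controls() helper are made
up for illustration.

```python
import html

# Bidi formatting characters that are invisible when displayed.
CONTROLS = {"\u202A", "\u202B", "\u202C", "\u202D", "\u202E"}


def show_controls(text: str) -> str:
    """Hypothetical helper: make bidi controls visible as <U+XXXX>."""
    return "".join(
        f"<U+{ord(c):04X}>" if c in CONTROLS else c for c in text
    )


source = "abc &#x202B;DEF&#x202C; ghi"

# After decoding, the NCRs are gone and only invisible characters remain.
decoded = html.unescape(source)

print(show_controls(decoded))
```

Once the text is in the decoded form, nothing in a normal editor
shows where the embedding starts or ends, whereas a <span dir="rtl">
would still be there to see.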

So I'm not saying I'm totally against adding these entities.
I just want to make sure you don't set your hopes too high,
and don't lead others to expect too much.


Regards,    Martin.


#-#-#  Martin J. Du"rst, Assoc. Professor, Aoyama Gakuin University
#-#-#  http://www.sw.it.aoyama.ac.jp       mailto:duerst@it.aoyama.ac.jp     

Received on Tuesday, 20 November 2007 06:13:58 UTC