Re: Proposal for isolation characters in Unicode and the unicode-bidi:isolate and unicode-bidi:plaintext definitions from Asmus Freytag on 2012-07-22 (www-style@w3.org from July 2012)

From: Asmus Freytag <asmusf@ix.netcom.com>
Date: Sun, 22 Jul 2012 10:08:53 -0700
To: Matitiahu Allouche <matitiahu.allouche@gmail.com>
CC: "'Aharon (Vladimir) Lanin'" <aharon@google.com>, 'Glenn Adams' <glenn@skynav.com>, "'Martin J. Dürst'" <duerst@it.aoyama.ac.jp>, 'W3C style mailing list' <www-style@w3.org>, public-i18n-bidi@w3.org
Message-ID: <500C33A5.3050309@ix.netcom.com>
If the desire for isolates is to allow the safe insertion of unrelated 
text, then the rules have to be such that no matter which characters the 
embedded text contains, it may not have any effect on the formatting of 
text around it.

The one exception would be unmatched BDI/PDI. Instead of sticking a 
single BDI/PDI pair around the text to be inserted, the inserting 
implementation would have to either add balancing BDI or PDI characters 
in front / back to balance the insertion, or, depending on context, 
perhaps disallow (filter) such characters from inserted text. With 
either strategy, the insertion could be self-contained.

Filtering BDI/PDI from user input would seem a natural option when 
constructing a message from a template with short, user-specified, 
insertions, and is cheaper to implement than "rebalancing".
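
To make the two strategies concrete, here is a minimal sketch, not a definitive implementation. It assumes the bracketed notation used in this thread ("[BDI]", "[PDI]") stands in for the actual control characters, and the function names are hypothetical, chosen only for illustration.

```python
import re

# Assumption: "[BDI]" / "[PDI]" are the placeholder tokens used in this
# thread, not real characters. Real text would hold the control characters.
ISOLATE_TOKEN = re.compile(r"\[(BDI|PDI)\]")


def filter_isolates(text: str) -> str:
    """Strategy 1: drop every isolate control from the inserted text."""
    return ISOLATE_TOKEN.sub("", text)


def rebalance_isolates(text: str) -> str:
    """Strategy 2: add [BDI] in front and [PDI] at the back until balanced."""
    depth = 0          # isolates currently open while scanning
    missing_open = 0   # [PDI] seen while no isolate was open
    for match in ISOLATE_TOKEN.finditer(text):
        if match.group(1) == "BDI":
            depth += 1
        elif depth > 0:
            depth -= 1
        else:
            missing_open += 1
    return "[BDI]" * missing_open + text + "[PDI]" * depth


def wrap(text: str) -> str:
    """Put the single isolate pair around text that is already safe."""
    return "[BDI]" + text + "[PDI]"


print(wrap(filter_isolates("a[PDI]b[BDI]")))     # [BDI]ab[PDI]
print(wrap(rebalance_isolates("a[PDI]b[BDI]")))  # [BDI][BDI]a[PDI]b[BDI][PDI][PDI]
```

Filtering is a single substitution, while rebalancing needs a scan that tracks nesting depth, which is the cost difference mentioned above.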

The symmetric approach would require the implementation to balance 
"classical" embeddings and overrides in insertions (filtering would be a 
much less desirable option, as we expect overrides, in particular, to be 
quite legitimate in those types of user input).

If not balanced, the sequence
     [BDI]Having fun[PDF][PDI]
would close any open overrides and thus affect the text following the 
insertion.

Mati's algorithm below does not seem to help with this case.
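
A rough sketch of why, under stated assumptions: the code models only the stack of open scopes, reads the bracketed notation from this thread as tokens, and treats a PDI or PDF with nothing matching to close as a no-op. The names `open_scopes_after` and `close_back_to` are hypothetical; this is one reading of the rule quoted below, not a specification.

```python
import re

TOKEN = re.compile(r"\[([A-Z]{3})\]")
ISOLATE_OPENERS = {"BDI", "LRI", "RLI"}
EMBEDDING_OPENERS = {"LRE", "RLE", "LRO", "RLO"}


def close_back_to(stack, openers):
    """Pop through the most recent scope in `openers`.

    Assumption: if no such scope is open, the terminator does nothing.
    """
    for i in range(len(stack) - 1, -1, -1):
        if stack[i] in openers:
            del stack[i:]  # closes that scope and everything opened after it
            return


def open_scopes_after(text, initially_open=()):
    """Return the scopes still open after scanning `text` (symmetric rule)."""
    stack = list(initially_open)
    for match in TOKEN.finditer(text):
        name = match.group(1)
        if name in ISOLATE_OPENERS or name in EMBEDDING_OPENERS:
            stack.append(name)
        elif name == "PDI":
            close_back_to(stack, ISOLATE_OPENERS)
        elif name == "PDF":
            close_back_to(stack, EMBEDDING_OPENERS)
    return stack


# Surrounding text has an LRO open when the insertion is reached.
print(open_scopes_after("[BDI]Having fun[RLO][PDI]", ["LRO"]))  # ['LRO']
print(open_scopes_after("[BDI]Having fun[PDF][PDI]", ["LRO"]))  # []
```

In the first call the stray RLO is contained and the outer LRO survives. In the second, the PDF reaches back past the BDI to the outer LRO and closes it, so the text after the insertion is affected.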

A./

On 7/22/2012 9:20 AM, Matitiahu Allouche wrote:
>
> Aharon Lanin wrote:
>
> "…some user codes their name as "Having fun[RLO]", it will cause the 
> rest of the paragraph in which this user's name appears to come out 
> backwards. However, putting each user name in an isolate will prevent 
> that - but only if option 2 is used. The symmetrical approach does not 
> have that property."
>
> The sequence is:    [BDI]Having fun[RLO][PDI]
>
> In option 3 (symmetric approach), the handler for PDI will detect that 
> there is an unbalanced RLO and will close it.
>
> The algorithm is:
>
> -When encountering PDI, go back to the last BDI and close every open 
> scope since the last BDI.
>
> -When encountering PDF, go back to the last LRE/RLE/LRO/RLO and close 
> every open scope since that last formatting character.
>
> In the last 2 sentences, "scope" includes embeddings, overrides and 
> isolates.
>
> I also very much favor keeping CSS in sync with Unicode formatting 
> characters behavior, and vice versa. I think this can be achieved with 
> option 3 no less than with option 2.
>
> Shalom (Regards), *Mati*
>
> *From:*Aharon (Vladimir) Lanin [mailto:aharon@google.com]
> *Sent:* Saturday, July 21, 2012 10:10 PM
> *To:* Matitiahu Allouche
> *Cc:* Glenn Adams; Martin J. Dürst; W3C style mailing list; 
> public-i18n-bidi@w3.org
> *Subject:* Re: Proposal for isolation characters in Unicode and the 
> unicode-bidi:isolate and unicode-bidi:plaintext definitions
>
> The idea behind the choice of option 2 is that an isolate would 
> protect its surroundings against extra or missing PDFs in its contents. 
> For example, if I have a site that displays the name of a user, and 
> some user codes their name as "Having fun[RLO]", it will cause the 
> rest of the paragraph in which this user's name appears to come out 
> backwards. However, putting each user name in an isolate will prevent 
> that - but only if option 2 is used. The symmetrical approach does not 
> have that property. Also note that the current specification of CSS 
> isolates (separate bidi paragraphs) also has this property, so 
> retaining it means that changing the CSS spec to use Unicode isolates 
> will have fewer visible effects. In all honesty, this is the part that 
> appeals to me most about option 2.
>
> On Fri, Jul 20, 2012 at 7:21 PM, Matitiahu Allouche 
> <matitiahu.allouche@gmail.com> 
> wrote:
>
> I am late joining this discussion, because I did not see really 
> compelling arguments in favor of option 1 rather than option 2 or vice versa.
>
> Just to add to the fun, I want to suggest a third option: in the case 
> of improperly  nested embeddings/overrides/isolates, both PDF **and** 
> PDI will close all unmatched controls.
>
> Going back to examples a and b:
>
> a: RLI LRE PDI PDF
>
> b: RLE LRI PDF PDI
>
> In example a, the PDI will close the RLI and the LRE, PDF does nothing.
>
> In example b, the PDF will close the LRI and the RLE, PDI does nothing.
>
> If nothing else, this option has the merit of symmetry.
>
> However, I am not in mad love with it, and I can live with either one 
> of 1 or 2.
>
> Shalom (Regards), *Mati*
>
> *From:*Glenn Adams [mailto:glenn@skynav.com]
> *Sent:* Monday, July 09, 2012 5:07 PM
> *To:* Aharon (Vladimir) Lanin
> *Cc:* Martin J. Dürst; W3C style mailing list; public-i18n-bidi@w3.org
> *Subject:* Re: Proposal for isolation characters in Unicode and the 
> unicode-bidi:isolate and unicode-bidi:plaintext definitions
>
> On Mon, Jul 9, 2012 at 12:28 AM, Aharon (Vladimir) Lanin 
> <aharon@google.com> wrote:
>
> > I don't understand your logic. You say option 2 offers greater 
> > forward compatibility, but then say you are choosing 2 because 
> > forward compatibility is NOT important.
>
> Not because it isn't important, but because in certain cases it is LESS 
> important than another consideration. It's a trade-off.
>
> In other words, I think that well-formed documents, i.e. ones where 
> isolates and embeddings/overrides are properly nested, should display 
> as well as possible on systems that do not support isolates. That is 
> why the proposal has been modified to include PDI. On the other hand, 
> when it comes to essentially broken documents, where 
> embeddings/overrides and isolates are not properly nested, I think it 
> is more important to let isolates do their job and isolate the missing 
> and extra PDFs in the apps that do support isolates than to make the 
> document display as similarly as possible on old and new apps, when 
> apps that don't understand isolates can't possibly display the 
> document 100% as intended anyway.
>
> > I think backward compatibility is more desirable, i.e., a system 
> > that knows nothing of isolates should work without modification,
>
>  By definition, it can't display the document 100% as intended. We 
> introduce PDI so that its disability is limited to displaying isolates 
> incorrectly (but then limit this to when isolates and 
> embeddings/overrides are properly nested).
>
>  > and yet option 2 requires PDI to close an embedding/override,
>
>  Only when the isolate began before the embedding/override. If we have 
> LRE RLI PDI PDF, the PDI only closes the isolate, not the embedding.
>
>  That still leaves the case where pre-PDI implementations would 
> behave differently than PDI-aware implementations, since the former 
> would not close the embedding/override at the same position. I believe 
> that may be a problem, and should be avoided.
>
Received on Sunday, 22 July 2012 17:09:27 UTC