Re: Bidi: now I'm confused (issue [bidiDigits-18])

Hello Roy,

I think your analysis is right in general.
Labels (or other components) consisting only of digits may
lead to ambiguous displays. I seem to remember that we were
actually aware of that fact, but there was not much we could
do about it:

- Labels consisting only of digits already exist in the DNS, so
   outlawing them is not an option. (It would have been nice
   if we could have said that the same restrictions apply
   to digits and LTR letters as apply to digits and RTL
   letters.)
- Very explicitly for IDN, but also in many ways for IRIs,
   it is highly undesirable to have enforced restrictions
   that span two or more labels/components. (Note that this may
   be somewhat different for the LHS.)

I have created an issue for this for the IRI draft, at
http://www.w3.org/International/iri-edit#bidiDigits-18.

I propose to address this by adding text that points out
such cases and warns against them (without going as far as
actually prohibiting them). I hope that this is acceptable
to you.

By the way, the alternative of having components displayed
strictly LTR was what we had for a long time. The two problems
with this approach are:
- It does not seem to correspond with what Arabic and Hebrew
   writers do naturally, in particular for freestanding domain
   names.
- It would require much more control over the contexts of
   IRI display than we think will be available (if we get
   an overall context of LTR reasonably widely implemented,
   I think we already have achieved something).

Regards,    Martin.



At 15:08 03/09/07 +0100, Roy Badami wrote:


>Ok, I have a problem with what I understand to be the display model
>for IDNA and IRI (and presumably by extension IMA).
>
>I'm assuming that the display model is 'render using bidi in an LTR
>context'.
>
>Specifically, the IRI draft says:
>
>   When rendered, bidirectional IRIs MUST be rendered using the Unicode
>   Bidirectional Algorithm [UNIV4], [UNI9]. Bidirectional IRIs MUST be
>   rendered with an overall left-to-right (ltr) direction.
>
>The latter requirement isn't specified in bidi-speak, but is
>presumably to be interpreted as saying they must be rendered at an
>even embedding level.  Actually, this isn't quite enough in the
>general case, since what comes before the string may affect weak type
>resolution, but since IRIs generally start with a Latin letter
>(generally 'h' :) this isn't really much of a problem.
>
>So let's for the moment assume that the display model is that IDNs,
>IRIs, IMAs are rendered at an even embedding level, such that the
>IDN/IRI/IMA constitutes the sole text in the level run.  (This can
>easily be achieved by bracketing the string with LRE and PDF prior to
>rendering.)
>
>Consider the domain:
>
>123.ARAB.com (logical order)
>123.BARA.com (display order)
>
>now consider the domain:
>
>ARAB.123.com (logical order)
>123.BARA.com (display order)
>
>Ergo, we need another display model; this one doesn't work, at least
>not if we don't want two completely different domains to display
>identically.
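
The collision described above can be reproduced with a deliberately
simplified model of the bidi algorithm. This is a sketch, not a full
implementation of the Unicode Bidirectional Algorithm: it assumes a
base level of 0 (LTR), follows the convention of the examples
(uppercase = RTL letter, lowercase = LTR letter), treats the dot as a
plain neutral, and covers only the rules this case needs (W7, N1, N2,
I1, L2). The function names are hypothetical.

```python
def classify(ch):
    """Toy bidi classes: digits are EN, uppercase is R, lowercase is L."""
    if ch.isdigit():
        return "EN"
    if ch.isalpha():
        return "R" if ch.isupper() else "L"
    return "N"  # assumption: everything else (here, the dot) is neutral


def display_order(logical):
    """Reorder a logical string for display at base level 0 (LTR)."""
    types = [classify(ch) for ch in logical]
    n = len(types)

    # W7: a number whose nearest preceding strong type is L (or that
    # starts the text at an even level) is treated as L.
    last_strong = "L"
    for i, t in enumerate(types):
        if t in ("L", "R"):
            last_strong = t
        elif t == "EN" and last_strong == "L":
            types[i] = "L"

    def direction(t):
        # For neutral resolution, remaining numbers count as R.
        return "L" if t == "L" else "R"

    # N1/N2: a run of neutrals takes the direction of its neighbours if
    # they agree, otherwise the embedding direction (L).
    i = 0
    while i < n:
        if types[i] != "N":
            i += 1
            continue
        j = i
        while j < n and types[j] == "N":
            j += 1
        before = direction(types[i - 1]) if i > 0 else "L"
        after = direction(types[j]) if j < n else "L"
        types[i:j] = [before if before == after else "L"] * (j - i)
        i = j

    # I1: at an even base level, R goes up one level and EN goes up two.
    levels = [{"L": 0, "R": 1, "EN": 2}[t] for t in types]

    # L2: from the highest level down to 1, reverse every contiguous
    # run of characters at that level or higher.
    chars = list(logical)
    for level in range(max(levels, default=0), 0, -1):
        i = 0
        while i < n:
            if levels[i] < level:
                i += 1
                continue
            j = i
            while j < n and levels[j] >= level:
                j += 1
            chars[i:j] = chars[i:j][::-1]
            i = j
    return "".join(chars)
```

Under these assumptions, display_order("123.ARAB.com") and
display_order("ARAB.123.com") both return "123.BARA.com".
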
>
>I recall that there was a proposal on the IDN list that domains should
>always be rendered with the labels appearing in order, least
>significant to the left and top-level domain on the right.  (This can
>be trivially achieved by bracketing each label with LRE/PDF,
>separating the labels with dots, and then bracketing the whole domain
>with LRE/PDF.)
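
The bracketing described above could be sketched as follows. The
helper name bracket_labels is hypothetical; U+202A (LRE) and U+202C
(PDF) are the Unicode embedding controls referred to in the text.

```python
LRE = "\u202a"  # LEFT-TO-RIGHT EMBEDDING
PDF = "\u202c"  # POP DIRECTIONAL FORMATTING


def bracket_labels(domain):
    """Wrap each label, then the whole domain, in LRE ... PDF."""
    labels = domain.split(".")
    # Each label gets its own embedding, so reordering stays inside it.
    inner = ".".join(LRE + label + PDF for label in labels)
    # The outer pair fixes the overall direction of the domain as LTR.
    return LRE + inner + PDF
```

With each label in its own embedding, a conforming renderer should
reorder characters only within a label, so ARAB.123.com and
123.ARAB.com would no longer share a display form.
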
>
>This would solve the above problem, but potentially might be less
>friendly to users of RTL languages in other ways.
>
>It also clearly is not what the authors of stringprep had in mind,
>since the bidi restrictions in stringprep are much stronger than would
>be necessary if this was the model.
>
>         -roy
>

Received on Monday, 8 September 2003 15:50:22 UTC