
Bidi: now I'm confused

From: Roy Badami <roy@gnomon.org.uk>
Date: Sun, 7 Sep 2003 15:08:56 +0100
Message-ID: <16219.15352.612051.84152@moriarty.gnomon.org.uk>
To: ietf-imaa@imc.org
Cc: public-iri@w3.org


Ok, I have a problem with what I understand to be the display model
for IDNA and IRI (and presumably by extension IMA).

I'm assuming that the display model is 'render using bidi in an LTR
context'.

Specifically, the IRI draft says:

  When rendered, bidirectional IRIs MUST be rendered using the Unicode
  Bidirectional Algorithm [UNIV4], [UNI9]. Bidirectional IRIs MUST be
  rendered with an overall left-to-right (ltr) direction.

The latter requirement isn't specified in bidi-speak, but is
presumably to be interpreted as saying they must be rendered at an
even embedding level.  Actually, this isn't quite enough in the
general case, since what comes before the string may affect weak type
resolution, but since IRIs generally start with a Latin letter
(generally 'h' :) this isn't really much of a problem.

So let's for the moment assume that the display model is that IDNs,
IRIs, IMAs are rendered at an even embedding level, such that the
IDN/IRI/IMA constitutes the sole text in the level run.  (This can
easily be achieved by bracketing the string with LRE and PDF prior to
rendering.)

Consider the domain:

123.ARAB.com (logical order)
123.BARA.com (display order)

Now consider the domain:

ARAB.123.com (logical order)
123.BARA.com (display order)

Ergo, we need another display model; this one doesn't work, at least
not if we don't want two completely different domains to display
identically.
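
The collision above can be reproduced with a much-simplified sketch of
the bidi resolution steps. This is not a conforming implementation of
the Unicode Bidirectional Algorithm: it assumes the notation used in
the examples (uppercase ASCII stands for right-to-left letters,
lowercase for left-to-right letters), a base embedding level of 0, no
explicit embedding controls, and it covers only a subset of the rules
(W4, W6, W7, N1, N2, I1, L2). The function names are hypothetical.

```python
def classify(ch):
    """Pseudo bidi class: uppercase = R, lowercase = L (an assumption
    mirroring the notation in the examples, not real Unicode data)."""
    if ch.isdigit():
        return "EN"   # European number
    if ch.isupper():
        return "R"    # stand-in for a right-to-left letter
    if ch.islower():
        return "L"
    if ch == ".":
        return "CS"   # common separator
    return "ON"       # other neutral


def display_order(logical):
    """Reorder a logical-order string for display at embedding level 0."""
    types = [classify(c) for c in logical]
    n = len(types)

    # W4: a single CS between two European numbers becomes EN.
    for i in range(1, n - 1):
        if types[i] == "CS" and types[i - 1] == "EN" and types[i + 1] == "EN":
            types[i] = "EN"
    # W6: any remaining separator becomes a neutral.
    types = ["ON" if t == "CS" else t for t in types]
    # W7: EN becomes L when the preceding strong type (or start of text) is L.
    last_strong = "L"
    for i, t in enumerate(types):
        if t in ("L", "R"):
            last_strong = t
        elif t == "EN" and last_strong == "L":
            types[i] = "L"

    # N1/N2: neutrals take the direction of their neighbours when both
    # sides agree (EN counts as R), otherwise the embedding direction (L).
    i = 0
    while i < n:
        if types[i] != "ON":
            i += 1
            continue
        j = i
        while j < n and types[j] == "ON":
            j += 1
        before = "L" if i == 0 or types[i - 1] == "L" else "R"
        after = "L" if j == n or types[j] == "L" else "R"
        resolved = before if before == after else "L"
        for k in range(i, j):
            types[k] = resolved
        i = j

    # I1: at an even level, R goes up one level and EN goes up two.
    levels = [{"L": 0, "R": 1, "EN": 2}[t] for t in types]

    # L2: from the highest level down to 1, reverse each contiguous run
    # of characters at that level or higher.
    chars = list(logical)
    for level in range(max(levels, default=0), 0, -1):
        i = 0
        while i < n:
            if levels[i] < level:
                i += 1
                continue
            j = i
            while j < n and levels[j] >= level:
                j += 1
            chars[i:j] = chars[i:j][::-1]
            i = j
    return "".join(chars)


print(display_order("123.ARAB.com"))
print(display_order("ARAB.123.com"))
```

Under these assumptions both calls yield the same display string, which
is the ambiguity described above.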

I recall that there was a proposal on the IDN list that domains should
always be rendered with the labels appearing in order, least
significant to the left and top-level domain on the right.  (This can
be trivially achieved by bracketing each label with LRE/PDF,
separating the labels with dots, and then bracketing the whole domain
with LRE/PDF.)
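
A minimal sketch of that bracketing, assuming the domain arrives as a
dot-separated string in logical order. The function name isolate_labels
is hypothetical, and whether a given renderer honours the controls as
intended is outside what this shows.

```python
import unicodedata

LRE = "\u202A"  # LEFT-TO-RIGHT EMBEDDING
PDF = "\u202C"  # POP DIRECTIONAL FORMATTING


def isolate_labels(domain):
    """Wrap each label, and then the whole domain, in LRE ... PDF."""
    labels = domain.split(".")
    wrapped = [LRE + label + PDF for label in labels]
    # The dots stay outside the per-label embeddings, so they sit in the
    # outer left-to-right embedding and keep the labels in order.
    return LRE + ".".join(wrapped) + PDF


# Sanity check that the constants carry the intended bidi classes.
assert unicodedata.bidirectional(LRE) == "LRE"
assert unicodedata.bidirectional(PDF) == "PDF"

# repr() makes the otherwise invisible control characters visible.
print(repr(isolate_labels("ARAB.123.com")))
```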

This would solve the above problem, but potentially might be less
friendly to users of RTL languages in other ways.

It also clearly is not what the authors of stringprep had in mind,
since the bidi restrictions in stringprep are much stronger than would
be necessary if this were the model.

	-roy
Received on Sunday, 7 September 2003 10:09:23 GMT
