- From: Martin Duerst <duerst@w3.org>
- Date: Mon, 08 Sep 2003 15:50:03 -0400
- To: Roy Badami <roy@gnomon.org.uk>, ietf-imaa@imc.org
- Cc: public-iri@w3.org
Hello Roy,

I think that in general, you are right about your analysis. Having
labels (or other components) with numbers only may lead to ambiguous
displays. I seem to remember that we were actually aware of that
fact, but there was not much to do about it:

- There currently are labels with only digits in the DNS, so
  outlawing them is not an option. (It would have been nice if we
  could have said that the same restrictions apply for digits and
  LTR letters as they do for digits and RTL letters.)

- Very explicitly for IDN, but also in many ways for IRIs, it is
  highly undesirable to have enforced restrictions that span two or
  more labels/components. (Note that this may be somewhat different
  for the LHS.)

I have created an issue for this for the IRI draft, at
http://www.w3.org/International/iri-edit#bidiDigits-18.
I propose to address this by adding text that points out such
cases and warns against them (without going as far as actually
prohibiting them). I hope that this is acceptable to you.

By the way, the alternative of having components displayed
strictly LTR was what we had for a long time. The two problems
with this approach are:

- It does not seem to correspond with what Arabic and Hebrew
  writers do naturally, in particular for freestanding domain
  names.

- It would require much more control over the contexts of IRI
  display than we think will be available (if we get an overall
  context of LTR reasonably widely implemented, I think we will
  already have achieved something).

Regards,    Martin.

At 15:08 03/09/07 +0100, Roy Badami wrote:

>Ok, I have a problem with what I understand to be the display model
>for IDNA and IRI (and presumably by extension IMA).
>
>I'm assuming that the display model is 'render using bidi in an LTR
>context'.
>
>Specifically, the IRI draft says:
>
>  When rendered, bidirectional IRIs MUST be rendered using the Unicode
>  Bidirectional Algorithm [UNIV4], [UNI9].  Bidirectional IRIs MUST be
>  rendered with an overall left-to-right (ltr) direction.
>
>The latter requirement isn't specified in bidi-speak, but is
>presumably to be interpreted as saying they must be rendered at an
>even embedding level.  Actually, this isn't quite enough in the
>general case, since what comes before the string may affect weak type
>resolution, but since IRIs generally start with a latin letter
>(generally 'h' :) this isn't really much of a problem.
>
>So lets for the moment assume that the display model is that IDNs,
>IRIs, IMAs are rendered at an even embedding level, such that the
>IDN/IRI/IMA constitutes the sole text in the level run.  (This can
>easily be achieved by bracketing the string with LRE and PDF prior to
>rendering.)
>
>Consider the domain:
>
>123.ARAB.com  (logical order)
>123.BARA.com  (display order)
>
>now consider the domain:
>
>ARAB.123.com  (logical order)
>123.BARA.com  (display order)
>
>Ergo, we need another display model; this one doesn't work, at least
>not if we don't want two completely different domains to display
>identically.
>
>I recall that there was a proposal on the IDN list that domains should
>always be rendered with the labels appearing in order, least
>significant to the left and top-level domain on the right.  (This can
>be trivially achieved by bracketing each label with LRE/PDF,
>separating the labels with dots, and then bracketing the whole domain
>with LRE/PDF.)
>
>This would solve the above problem, but potentially might be less
>friendly to users of RTL languages in other ways.
>
>It also clearly is not what the authors of stringprep had in mind,
>since the bidi restrictions in stringprep are much stronger than would
>be necessary if this was the model.
>
>        -roy
>
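The collision described in the quoted message can be reproduced with a
small Python sketch. This is not a full implementation of the Unicode
Bidirectional Algorithm: it assumes a left-to-right base direction and
covers only the few rules these examples exercise (W7, N1/N2, I1, L2).
Following the convention used above, uppercase ASCII letters stand in
for RTL letters. The function names `display_order` and
`display_per_label` are hypothetical, chosen only for illustration.

```python
def bidi_class(ch):
    """Toy classification: uppercase ASCII stands in for RTL letters."""
    if ch.isdigit():
        return "EN"   # European number
    if ch.isupper():
        return "R"    # assumption: the email's uppercase-means-RTL convention
    if ch.isalpha():
        return "L"
    return "ON"       # dots and everything else are treated as neutral


def display_order(text):
    """Reorder `text` for display at base level 0 (simplified bidi subset)."""
    types = [bidi_class(c) for c in text]
    n = len(types)

    # W7: a number becomes L if the last strong type before it
    # (or the start of the run) is L.
    last_strong = "L"
    for i, t in enumerate(types):
        if t in ("L", "R"):
            last_strong = t
        elif t == "EN" and last_strong == "L":
            types[i] = "L"

    # N1/N2: neutrals take the direction of their neighbours when both
    # sides agree (numbers count as R here), else the base direction L.
    def strong(t):
        return "R" if t in ("R", "EN") else "L"

    i = 0
    while i < n:
        if types[i] != "ON":
            i += 1
            continue
        j = i
        while j < n and types[j] == "ON":
            j += 1
        before = strong(types[i - 1]) if i > 0 else "L"
        after = strong(types[j]) if j < n else "L"
        resolved = before if before == after else "L"
        for k in range(i, j):
            types[k] = resolved
        i = j

    # I1: at an even base level, R goes up one level, numbers go up two.
    levels = [{"L": 0, "R": 1, "EN": 2}[t] for t in types]

    # L2: from the highest level down to 1, reverse each contiguous run
    # of characters at that level or higher.
    chars = list(text)
    for level in range(max(levels, default=0), 0, -1):
        i = 0
        while i < n:
            if levels[i] < level:
                i += 1
                continue
            j = i
            while j < n and levels[j] >= level:
                j += 1
            chars[i:j] = reversed(chars[i:j])
            i = j
    return "".join(chars)


def display_per_label(domain):
    """Per-label isolation (the LRE/PDF-per-label proposal), simplified."""
    return ".".join(display_order(label) for label in domain.split("."))


print(display_order("123.ARAB.com"))      # 123.BARA.com
print(display_order("ARAB.123.com"))      # 123.BARA.com
print(display_per_label("ARAB.123.com"))  # BARA.123.com
```

Under these assumptions the two different logical strings come out
identical when rendered as a whole, while rendering each label on its
own keeps them distinct, at the cost of forcing strict left-to-right
label order.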
Received on Monday, 8 September 2003 15:50:22 UTC