- From: Shawn Steele <Shawn.Steele@microsoft.com>
- Date: Thu, 4 Mar 2010 20:01:04 +0000
- To: Matitiahu Allouche <matial@il.ibm.com>
- CC: Larry Masinter <LMM@acm.org>, Peter Constable <petercon@microsoft.com>, "public-iri@w3.org" <public-iri@w3.org>, "public-iri-request@w3.org" <public-iri-request@w3.org>, 'Slim Amamou' <slim@alixsys.com>, "unicode@unicode.org" <unicode@unicode.org>
- Message-ID: <E14011F8737B524BB564B05FF748464A0565ACB0@TK5EX14MBXC139.redmond.corp.microsoft.>
That still doesn’t solve Larry’s concerns, since the Unicode bidi algorithm would clearly draw something different than your example ☺

The options would seem to be either a) just use the Unicode bidi algorithm and live with the odd behaviors, or b) try to do something more like users expect, which would first require understanding what they expect.

For a), when do you start and when do you stop? In a “plain text” bidi context (e.g. Notepad) with the Unicode bidi algorithm, you end up with BBB.AAAhttp://server (from http://server.AAA.BBB, where upper case is RTL). I seriously doubt anyone wants http:// in the middle of the string ☺ That can’t be solved without “detecting an IRI”, in which case we can do b).

-Shawn

From: public-iri-request@w3.org [mailto:public-iri-request@w3.org] On Behalf Of Matitiahu Allouche
Sent: Thursday, March 04, 2010 1:50 AM
To: Shawn Steele
Cc: Larry Masinter; Peter Constable; public-iri@w3.org; public-iri-request@w3.org; 'Slim Amamou'; unicode@unicode.org
Subject: RE: BIDI IRI Display (was spoofing and IRIs)

Unfortunately, usability studies are hard to set up and expensive to run.

I once conducted a very informal inquiry among about 20 people, the majority not engineers but with experience in using computers. The inquiry was about e-mail addresses, not IRIs. The surprising (for me) result was that the majority favored laying out address parts consistently from left to right, even within a global RTL context.

For example, consider the following address (in logical order, with upper case letters representing letters from RTL scripts):

"DAVE SMITH" <DAVE.SMITH@MY.MAIL.company.com>

According to the inquiry result, it should be displayed as follows:

"HTIMS EVAD" <EVAD.HTIMS@YM.LIAM.company.com>

whether in a LTR or RTL context.

The Israel bureau of standards (SII) has adopted the same position, with a scope including in particular mail addresses, IRIs, file and path names.
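The layout favored in that inquiry can be sketched in a few lines of Python. This is only an illustration under the thread's own convention that upper case stands for RTL letters; the function name `display_parts_ltr` and the delimiter set are assumptions, not part of any standard, and the sketch handles only the bare address, not the quoted display name and angle brackets around it.

```python
import re

# Hypothetical delimiter set: characters that separate address or IRI parts.
DELIMITERS = re.compile(r"([.@/:]+)")


def display_parts_ltr(logical: str) -> str:
    """Keep the parts in left-to-right order; reverse only the RTL parts.

    Assumption for illustration: a part written entirely in upper case
    stands for RTL letters, as in the example in this thread.
    """
    visual = []
    for part in DELIMITERS.split(logical):
        if part.isupper():
            # An RTL part reads right to left, so its letters are reversed.
            visual.append(part[::-1])
        else:
            # LTR parts and delimiters are kept exactly as they are.
            visual.append(part)
    return "".join(visual)


print(display_parts_ltr("DAVE.SMITH@MY.MAIL.company.com"))
```

Applied to the logical address above, this yields EVAD.HTIMS@YM.LIAM.company.com, which matches the displayed form given in the message: the order of the parts never changes, only the letters inside each RTL part.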
As always, YMMV. I consider this issue as related to usability much more than to security, but since it has been raised on this list, the above information may shed light on one point of view.

Shalom (Regards), Mati
Bidi Architect
Globalization Center Of Competency - Bidirectional Scripts
IBM Israel
Phone: +972 2 5888802 Fax: +972 2 5870333 Mobile: +972 52 2554160

From: Shawn Steele <Shawn.Steele@microsoft.com>
To: Larry Masinter <LMM@acm.org>, "'Slim Amamou'" <slim@alixsys.com>
Cc: "public-iri@w3.org" <public-iri@w3.org>, Peter Constable <petercon@microsoft.com>, "unicode@unicode.org" <unicode@unicode.org>
Date: 04/03/2010 07:57
Subject: RE: BIDI IRI Display (was spoofing and IRIs)
Sent by: public-iri-request@w3.org
________________________________

The problem isn't an IRI in different contexts (a list of IRIs or not), the problem is that an IRI *IS* a list. http://www.microsoft.com/en/us/default.aspx is a lot like { www, microsoft, com, en, us, default.aspx }, so IRIs shouldn't mix up the parts (e.g. reversing en & us in the display would be misleading). In a BIDI context, this probably means that the elements of the list are ordered from right to left.

The problem with the Unicode bidi algorithm is that if 2 LTR script elements are adjacent, they lose the ordering of the list.

Users seem to expect that elements of an IRI are drawn as a list like I described. It has also been proposed that they just be rendered from LTR regardless of whether any labels are RTL or not, and another suggestion has been that users don't really understand the ordering of the IRI, so it's okay to reorder as long as it's consistent.

I would like to see a usability study to figure out what the average BIDI user expects, since we engineers may have biases that most people don't have.
My informal observations and feedback from the BIDI community seem to support the "elements of a list" hypothesis, however I'd like that to be confirmed (or disproved) by a "real" usability study :)

-Shawn
________________________________________
From: Larry Masinter [masinter@gmail.com] on behalf of Larry Masinter [LMM@acm.org]
Sent: Wednesday, March 03, 2010 6:00 PM
To: Shawn Steele; 'Slim Amamou'
Cc: public-iri@w3.org; Peter Constable; unicode@unicode.org
Subject: RE: BIDI IRI Display (was spoofing and IRIs)

If the same Unicode string is used for an IRI in running text and for an IRI in a context where it is used as an "ordered list", then it would seem like

* the presentation of the IRI in different contexts is the same

is more important than

* the presentation of the IRI in known IRI contexts is optimal

Do you agree? I don't see how you can have both.

Larry
--
http://larry.masinter.net

-----Original Message-----
From: Shawn Steele [mailto:Shawn.Steele@microsoft.com]
Sent: Wednesday, March 03, 2010 9:13 AM
To: Slim Amamou; Larry Masinter
Cc: public-iri@w3.org; Peter Constable; (unicode@unicode.org)
Subject: RE: BIDI IRI Display (was spoofing and IRIs)

> An IRI is a sequence of Unicode characters. Is there not
> already a well-defined way of converting a sequence of
> Unicode characters to a visual display?

The problem (from my perspective at least) is that the Unicode BIDI rules are somewhat "generic". Unicode expects things like / and . to be used in a context of same-script stuff, like a date, time or number. IRIs use them as delimiters for a list of elements (labels in the domain name or folders in the path), in a hierarchical form. The Unicode BIDI algorithm doesn't recognize that there's an underlying hierarchy, so it can end up "swapping" pieces in that hierarchy in some cases.

I'm not sure UTR#36 is the proper place to clarify display of such ordered lists.
Proper BIDI rendering of IRIs isn't just a security, but also a usability, problem. It does seem like perhaps this concept should be mentioned in Unicode somewhere. (IRIs aren't the only place that similar ordered lists happen.)

-Shawn
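The "an IRI is a list" view discussed in this thread can be illustrated with a short Python sketch. It is a sketch under stated assumptions, not a proposed algorithm: the helper names `iri_elements` and `visual_order` are hypothetical, the split ignores ports, user info, queries and fragments, and ordering the elements right to left in an RTL context is only the "probably" from the message above, not a settled rule.

```python
from typing import List
from urllib.parse import urlsplit


def iri_elements(iri: str) -> List[str]:
    """Split an IRI into its hierarchical elements, in logical order.

    Simplification: only domain labels and path segments are considered.
    """
    parts = urlsplit(iri)
    labels = parts.netloc.split(".")                    # domain name labels
    segments = [s for s in parts.path.split("/") if s]  # folders in the path
    return labels + segments


def visual_order(elements: List[str], rtl_context: bool) -> List[str]:
    """Order whole elements for display without ever interleaving them.

    Assumption: in an RTL context the list as a whole runs right to left,
    so the elements are reversed as units; neighbours are never swapped.
    """
    return list(reversed(elements)) if rtl_context else list(elements)


elements = iri_elements("http://www.microsoft.com/en/us/default.aspx")
print(elements)
print(visual_order(elements, rtl_context=True))
```

The point of treating the elements as units is that "en" and "us" always stay adjacent and in a consistent relative order, which is the property the plain Unicode bidi algorithm cannot guarantee when it sees only characters and neutral delimiters.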
Received on Thursday, 4 March 2010 20:02:17 UTC