
RE: BIDI IRI Display (was spoofing and IRIs)

From: Jonathan Rosenne <rosennej@qsm.co.il>
Date: Thu, 4 Mar 2010 11:50:58 +0200
To: "Martin J. Dürst" <duerst@it.aoyama.ac.jp>
Cc: "'Shawn Steele'" <Shawn.Steele@microsoft.com>, "'Larry Masinter'" <LMM@acm.org>, "'Slim Amamou'" <slim@alixsys.com>, <public-iri@w3.org>, "'Peter Constable'" <petercon@microsoft.com>, <unicode@unicode.org>
Message-ID: <000a01cabb80$3326cfb0$99746f10$@co.il>
Give me a snail mail address and I'll send some newspapers via the post.

Jony

> -----Original Message-----
> From: "Martin J. Dürst" [mailto:duerst@it.aoyama.ac.jp]
> Sent: Thursday, March 04, 2010 11:44 AM
> To: Jonathan Rosenne
> Cc: 'Shawn Steele'; 'Larry Masinter'; 'Slim Amamou'; public-iri@w3.org;
> 'Peter Constable'; unicode@unicode.org
> Subject: Re: BIDI IRI Display (was spoofing and IRIs)
> 
> Hello Jonny,
> 
> On 2010/03/04 17:13, Jonathan Rosenne wrote:
> > There is no average BIDI user to observe, since there are no BIDI TLDs and
> > no BIDI equivalents to http, ftp etc.
> >
> > In my way of thinking, an average BIDI user does not normally mix LTR and
> > RTL, programmers excepted.
> 
> Can you expand on this a bit more? E.g. how much do LTR
> words/phrases/sentences/whatever appear in average RTL (e.g. Hebrew or
> Arabic) text? How much in newspapers? How much in books? How much in Web
> pages? How much in informative text vs. advertisements,...?
> 
> Regards,    Martin.
> 
> > Jony
> >
> >> -----Original Message-----
> >> From: public-iri-request@w3.org [mailto:public-iri-request@w3.org] On
> >> Behalf Of Shawn Steele
> >> Sent: Thursday, March 04, 2010 7:56 AM
> >> To: Larry Masinter; 'Slim Amamou'
> >> Cc: public-iri@w3.org; Peter Constable; unicode@unicode.org
> >> Subject: RE: BIDI IRI Display (was spoofing and IRIs)
> >>
> >> The problem isn't an IRI in different contexts (a list of IRIs or not),
> >> the problem is that an IRI *IS* a list.
> >>
> >> http://www.microsoft.com/en/us/default.aspx is a lot like { www,
> >> microsoft, com, en, us, default.aspx }, so IRIs shouldn't mix up the
> >> parts (e.g. reversing en & us in the display would be misleading).  In
> >> a BIDI context, this probably means that the elements of the list are
> >> ordered from right to left.  The problem with the Unicode bidi
> >> algorithm is that if 2 LTR script elements are adjacent, they lose the
> >> ordering of the list.
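
The "an IRI is a list" reading above can be sketched in Python. This is only an illustration under stated assumptions: the helper names `iri_elements` and `rtl_list_order` are hypothetical, and the right-to-left ordering simply reverses the element list to mimic the "elements of a list" display described above. It is not the Unicode bidi algorithm.

```python
from urllib.parse import urlsplit


def iri_elements(iri):
    """Split an IRI into its ordered elements: domain labels, then path segments."""
    parts = urlsplit(iri)
    labels = parts.hostname.split(".") if parts.hostname else []
    segments = [s for s in parts.path.split("/") if s]
    return labels + segments


def rtl_list_order(elements):
    """Hypothetical list-style display for an RTL context.

    Returns the elements in left-to-right screen order, so that the first
    element of the IRI ends up rightmost. Characters inside each element
    are left untouched.
    """
    return list(reversed(elements))


elements = iri_elements("http://www.microsoft.com/en/us/default.aspx")
print(elements)
print(rtl_list_order(elements))
```

The point of the sketch is that "en" and "us" stay adjacent and in a fixed relative order in both displays; only the direction of the whole list changes.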
> >>
> >> Users seem to expect that elements of an IRI are drawn as a list like I
> >> described.  It has also been proposed that they just be rendered from
> >> LTR regardless of whether any labels are RTL or not, and another
> >> suggestion has been that users don't really understand the ordering of
> >> the IRI, so it's okay to reorder as long as it's consistent.
> >>
> >> I would like to see a usability study to figure out what the average
> >> BIDI user expects since we engineers may have biases that most people
> >> don't have.  My informal observations and feedback from the BIDI
> >> community seem to support the "elements of a list" hypothesis, however
> >> I'd like that to be confirmed (or disproved) by a "real" usability
> >> study :)
> >>
> >> -Shawn
> >>
> >> ________________________________________
> >> From: Larry Masinter [masinter@gmail.com] on behalf of Larry Masinter
> >> [LMM@acm.org]
> >> Sent: Wednesday, March 03, 2010 6:00 PM
> >> To: Shawn Steele; 'Slim Amamou'
> >> Cc: public-iri@w3.org; Peter Constable; unicode@unicode.org
> >> Subject: RE: BIDI IRI Display (was spoofing and IRIs)
> >>
> >> If the same Unicode string is used for an IRI in running text and for
> >> an IRI in a context where it is used as an "ordered list", then it would
> >> seem like
> >>
> >> * the presentation of the IRI in different contexts is the same
> >>
> >> is more important than
> >>
> >> * the presentation of the IRI in known IRI contexts is optimal
> >>
> >> Do you agree? I don't see how you can have both.
> >>
> >> Larry
> >> --
> >> http://larry.masinter.net
> >>
> >>
> >> -----Original Message-----
> >> From: Shawn Steele [mailto:Shawn.Steele@microsoft.com]
> >> Sent: Wednesday, March 03, 2010 9:13 AM
> >> To: Slim Amamou; Larry Masinter
> >> Cc: public-iri@w3.org; Peter Constable; (unicode@unicode.org)
> >> Subject: RE: BIDI IRI Display (was spoofing and IRIs)
> >>
> >>> An IRI is a sequence of Unicode characters. Is there not
> >>> already a well-defined way of converting a sequence of
> >>> Unicode characters to a visual display?
> >>
> >> The problem (from my perspective at least) is that the Unicode BIDI
> >> rules are somewhat "generic".  Unicode expects things like / and . to
> >> be used in a context of same-script stuff, like a date, time or
> >> number.  IRIs use them as delimiters for a list of elements (labels in
> >> the domain name or folders in the path), in a hierarchical form.  The
> >> Unicode BIDI algorithm doesn't recognize that there's an underlying
> >> hierarchy, so it can end up "swapping" pieces in that hierarchy in
> >> some cases.
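
The "generic" treatment of / and . described above shows up in the Unicode character database, where both carry the bidi class CS (Common Number Separator) rather than any class tied to IRI structure. A small check with Python's standard `unicodedata` module illustrates this; the helper name `bidi_classes` and the sample characters are chosen here for illustration only.

```python
import unicodedata


def bidi_classes(chars):
    """Map each character to its Unicode bidirectional class."""
    return {ch: unicodedata.bidirectional(ch) for ch in chars}


# IRI delimiters, a Latin letter, a Hebrew letter (alef), an Arabic letter (alef)
samples = ["/", ".", ":", "a", "\u05d0", "\u0627"]

for ch, cls in bidi_classes(samples).items():
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}: {cls}")
```

The delimiters report CS, the Latin letter L, the Hebrew letter R and the Arabic letter AL. Nothing in these classes marks / or . as a boundary between list elements, which is consistent with the concern that the algorithm cannot see the hierarchy.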
> >>
> >> I'm not sure UTR#36 is the proper place to clarify display of such
> >> ordered lists.  Proper BIDI rendering of IRIs isn't just a security,
> >> but also a usability, problem.  It does seem like perhaps this concept
> >> should be mentioned in Unicode somewhere.  (IRIs aren't the only place
> >> that similar ordered lists happen).
> >>
> >> -Shawn
> >
> >
> >
> 
> --
> #-# Martin J. Dürst, Professor, Aoyama Gakuin University
> #-# http://www.sw.it.aoyama.ac.jp   mailto:duerst@it.aoyama.ac.jp
Received on Thursday, 4 March 2010 09:51:28 UTC
