
RE: BIDI IRI Display (was spoofing and IRIs)

From: Shawn Steele <Shawn.Steele@microsoft.com>
Date: Wed, 3 Mar 2010 17:12:57 +0000
To: Slim Amamou <slim@alixsys.com>, Larry Masinter <LMM@acm.org>
CC: "public-iri@w3.org" <public-iri@w3.org>, Peter Constable <petercon@microsoft.com>, " (unicode@unicode.org)" <unicode@unicode.org>
Message-ID: <E14011F8737B524BB564B05FF748464A0565993D@TK5EX14MBXC139.redmond.corp.microsoft.com>
> An IRI is a sequence of Unicode characters. Is there not
> already a well-defined way of converting a sequence of
> Unicode characters to a visual display?

The problem (from my perspective at least) is that the Unicode BIDI rules are somewhat "generic". Unicode expects characters like / and . to appear inside same-script text such as a date, a time, or a number. IRIs use them as delimiters between the elements of a hierarchical list (labels in the domain name, or folders in the path). The Unicode BIDI algorithm doesn't recognize that there's an underlying hierarchy, so in some cases it can end up "swapping" pieces of that hierarchy on display.
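
As a rough illustration of that swapping, here is a small Python sketch. It is not the full Unicode BIDI algorithm: it assumes a left-to-right paragraph, ignores digits, embeddings, and mirroring, and only approximates how a weak or neutral delimiter picks up the direction of its neighbours. The function names and the sample IRI are made up for the example; only `unicodedata.bidirectional` is a real library call.

```python
import unicodedata

STRONG_RTL = {"R", "AL"}  # Hebrew-type and Arabic-type strong classes


def bidi_classes(text):
    """Pair each character with its Unicode bidirectional class."""
    return [(ch, unicodedata.bidirectional(ch)) for ch in text]


def _nearest_strong(classes, start, step):
    """Walk from `start` in direction `step` to the nearest strong character."""
    i = start
    while 0 <= i < len(classes):
        if classes[i] == "L":
            return "L"
        if classes[i] in STRONG_RTL:
            return "R"
        i += step
    return "L"  # assumption: the paragraph direction is left-to-right


def simplified_visual_order(text):
    """Approximate display order for an LTR paragraph (simplified model only)."""
    classes = [unicodedata.bidirectional(ch) for ch in text]
    resolved = []
    for i, cls in enumerate(classes):
        if cls == "L":
            resolved.append("L")
        elif cls in STRONG_RTL:
            resolved.append("R")
        else:
            # A delimiter has no direction of its own: it becomes RTL only
            # when the strong characters on both sides are RTL.
            before = _nearest_strong(classes, i - 1, -1)
            after = _nearest_strong(classes, i + 1, 1)
            resolved.append("R" if before == after == "R" else "L")

    # Reverse each maximal right-to-left run, keep left-to-right runs as-is.
    out = []
    i = 0
    while i < len(text):
        j = i
        while j < len(text) and resolved[j] == resolved[i]:
            j += 1
        run = text[i:j]
        out.append(run[::-1] if resolved[i] == "R" else run)
        i = j
    return "".join(out)


# Hypothetical IRI: two Hebrew domain labels, then an ASCII path segment.
logical = "http://\u05d0\u05d1.\u05d2\u05d3/x"
visual = simplified_visual_order(logical)
```

In this model the "." sits between two right-to-left labels, so it joins them into one right-to-left run and the whole run is reversed. Read left to right, the second label now appears before the first, even though the rest of the IRI still reads left to right. Both "/" and "." report class CS (common separator), which is why they carry no direction of their own.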

I'm not sure UTR#36 is the proper place to clarify display of such ordered lists. Proper BIDI rendering of IRIs isn't just a security problem; it's also a usability problem. It does seem like this concept should be mentioned somewhere in Unicode. (IRIs aren't the only place where similar ordered lists occur.)

-Shawn
Received on Wednesday, 3 March 2010 17:13:49 UTC
