- From: Larry Masinter <LMM@acm.org>
- Date: Thu, 4 Mar 2010 07:53:50 -0800
- To: "'Shawn Steele'" <Shawn.Steele@microsoft.com>, "'Slim Amamou'" <slim@alixsys.com>
- Cc: <public-iri@w3.org>, "'Peter Constable'" <petercon@microsoft.com>, <unicode@unicode.org>
Shawn, I don't think I was clear enough. You say:

"The problem isn't an IRI in different contexts (a list of IRIs or not), the problem is that an IRI *IS* a list."

No, I'm sorry. An IRI is a sequence of Unicode characters. Some people may think of an IRI as a list, others think of an IRI as a magical and meaningless incantation. But those things you're talking about are higher level semantic interpretations.

There are two processes in place:

(1) transform IRI as sequence of Unicode characters to visual presentation
(2) transform IRI as (sequence of Unicode characters, interpreted as a list) to visual presentation

What is *optimum* and *best* and *most accessible* and *user friendly* for (2) may well be different than what is best for (1).

HOWEVER: I think it is more important that the results of (1) and (2) be the SAME than it is that (2) be optimum.

If you disagree with that premise, then we can talk about what is optimal for (2) and how we will mitigate the damage from the possibility that (1) is different than (2).

I will note, in passing, that it *does* seem like some browsers, when you copy a *URI* from the browser address bar and paste into some other window, the spaces will be converted to %20, i.e., there's at least a character level transformation, which kind of makes sense in context.

That is, there might be a separate kind of user interface element which is an "IRI explanation", which doesn't use the normal Unicode -> visual display but instead has some graphical representation based on showing the individual parsed components of the IRI (oh, put the HOST in a red box and the PATH in a blue box and the scheme in a tiny font off to the right.)
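
A component-by-component "IRI explanation" of the kind described above could be sketched roughly as follows. This is only an illustration, not anything proposed in the thread: it uses Python's standard `urllib.parse.urlsplit`, and the function name `explain_iri` and the example IRI are hypothetical.

```python
from urllib.parse import urlsplit


def explain_iri(iri: str) -> list[tuple[str, str]]:
    """Split an IRI into labelled components, kept in logical order."""
    parts = urlsplit(iri)
    components = [("scheme", parts.scheme), ("host", parts.hostname or "")]
    # Each path segment becomes its own labelled entry, so a UI could
    # draw it in its own box instead of relying on plain text layout.
    components += [("path segment", seg) for seg in parts.path.split("/") if seg]
    return components


# Hypothetical example IRI, chosen only for illustration.
for label, value in explain_iri("http://example.com/docs/reports"):
    print(f"{label:>12}: {value}")
```

Because each component is presented separately, the order of the components no longer depends on how the whole string happens to be laid out as running text.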

I will also note that I think this is an area of "best practice" that is likely to, and should be allowed to, evolve more rapidly than the base IRI protocol element which we are trying to define quickly, that best practice can vary from browser to browser without any need to standardize this, as it is a user interface element along with tabs and flashing history lists, and in any case, I think belongs in a separate document. If UTR#36 is not that document, then I would suggest putting it in a separate one.

IMHO,

Larry
--
http://larry.masinter.net

________________________________________
From: Larry Masinter [masinter@gmail.com] on behalf of Larry Masinter [LMM@acm.org]
Sent: Wednesday, March 03, 2010 6:00 PM
To: Shawn Steele; 'Slim Amamou'
Cc: public-iri@w3.org; Peter Constable; unicode@unicode.org
Subject: RE: BIDI IRI Display (was spoofing and IRIs)

If the same Unicode string is used for an IRI in running text and for an IRI in a context where it is used as an "ordered list", then it would seem like

* the presentation of the IRI in different contexts is the same

is more important than

* the presentation of the IRI in known IRI contexts is optimal

Do you agree? I don't see how you can have both.

Larry
--
http://larry.masinter.net

-----Original Message-----
From: Shawn Steele [mailto:Shawn.Steele@microsoft.com]
Sent: Wednesday, March 03, 2010 9:13 AM
To: Slim Amamou; Larry Masinter
Cc: public-iri@w3.org; Peter Constable; (unicode@unicode.org)
Subject: RE: BIDI IRI Display (was spoofing and IRIs)

> An IRI is a sequence of Unicode characters. Is there not
> already a well-defined way of converting a sequence of
> Unicode characters to a visual display?

The problem (from my perspective at least) is that the Unicode BIDI rules are somewhat "generic". Unicode expects things like / and . to be used in a context of same-script stuff, like a date, time or number.
IRIs use them as delimiters for a list of elements (labels in the domain name or folders in the path), in a hierarchical form. The Unicode BIDI algorithm doesn't recognize that there's an underlying hierarchy, so it can end up "swapping" pieces in that hierarchy in some cases.

I'm not sure UTR#36 is the proper place to clarify display of such ordered lists. Proper BIDI rendering of IRIs isn't just a security, but also a usability, problem. It does seem like perhaps this concept should be mentioned in Unicode somewhere. (IRIs aren't the only place that similar ordered lists happen).

-Shawn
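
The "generic" treatment of `/` and `.` mentioned above can be seen in the Unicode character data itself. The sketch below is not part of the original thread. It uses Python's standard `unicodedata.bidirectional` to print the bidirectional class of a few IRI delimiters, with a Latin, a Hebrew and an Arabic letter for comparison; the helper name `bidi_class_table` and the sample characters are chosen here only for illustration.

```python
import unicodedata


def bidi_class_table(chars: str) -> dict[str, str]:
    """Map each character to its Unicode bidirectional class."""
    return {ch: unicodedata.bidirectional(ch) for ch in chars}


# IRI delimiters, then a Latin, a Hebrew and an Arabic letter.
samples = "/.:" + "a" + "\u05d0" + "\u0627"
for ch, bidi in bidi_class_table(samples).items():
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}: {bidi}")
```

The delimiters are classed as CS (Common Separator), a weak type with no direction of its own, while the letters are strong (L, R, AL). The bidirectional algorithm therefore positions a `/` or `.` according to the text around it and has no notion that it separates levels of a hierarchy.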
Received on Thursday, 4 March 2010 15:54:33 UTC