- From: John C Klensin <john-ietf@jck.com>
- Date: Tue, 02 Mar 2010 16:18:32 -0500
- To: Larry Masinter <LMM@acm.org>
- cc: public-iri@w3.org
--On Saturday, February 27, 2010 22:25 -0800 Larry Masinter <LMM@acm.org> wrote:

> Going through the Security considerations of
> of draft-ietf-idnabis-defs-13 vs. the current
> "Security Considerations" of the current IRI document
>
> here's looking at
> http://tools.ietf.org/html/draft-ietf-idnabis-defs
> section 4:
>...

Larry,

Suggestions, fwiw (mostly drawing comments from other notes together):

(1) Reference that doc. As others have pointed out, it addresses UTR 36, but contains some material that may be more directly relevant to IRIs generally and their domain name components in particular.

(2) Point out that neither of those documents (...idnabis-defs nor UTR36) really addresses "sound alike" issues (especially for someone not familiar with the relevant language) rather than "look alike" or "might be expected to be treated alike" ones. In conjunction with this, note that the problem is not just with the false positive comparisons that characterize the spoofing problem but with perceptual false negatives: people who are under the delusion that IRIs (or URIs or domain names) are to be interpreted by humans and who are not computer experts often expect orthographic variations to compare equal. Differences in US and UK spelling, Simplified and Traditional Chinese and maybe pinyin, conventions about representation of extended Latin strings in basic Latin characters, and writing of Japanese in either Kana or Kanji all fall into that category for at least some populations.

(3) Note that these are problems _both_ for humans and human perception and for user agents that try to guess at strings and other issues with which humans might have problems, so that the users can be warned. You've noted the example of trying to distinguish between familiar and unfamiliar scripts. Others have noted that mixed-script situations and the use of some specific characters can be problematic. For example, as a problem very specific to IRIs, there are many characters in Unicode that could plausibly be confused with forward slashes and other reserved punctuation.

(4) Of course, we also have the human interface design question of whether or not one should try to do anything (and possibly create false expectations and an unreasonable sense of confidence in being protected) when it is clear that a comprehensive solution is impossible. If one inspects browsers and other IRI-using programs, the consensus seems to be "yes, do what one can". That is not the only plausible conclusion and there is certainly no consensus as to what one should actually do. I think it would be wise for the document to say that.

best,
john
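
As a rough illustration of the kind of user-agent check mentioned in point (3), the following Python sketch flags a few characters that could be mistaken for a forward slash and makes a crude guess at whether a string mixes scripts. The function names, the short lookalike list, and the script heuristic (taking the first word of each letter's Unicode character name) are all assumptions made for this example. None of them comes from the IRI draft, idnabis-defs, or UTR 36, and the lookalike list is far from exhaustive.

```python
import unicodedata

# Assumption: a small hand-picked sample of slash lookalikes, not a complete list.
SLASH_LOOKALIKES = {
    "\u2044",  # FRACTION SLASH
    "\u2215",  # DIVISION SLASH
    "\u29F8",  # BIG SOLIDUS
    "\uFF0F",  # FULLWIDTH SOLIDUS
}


def find_slash_lookalikes(iri):
    """Return (index, character, Unicode name) for each sampled lookalike found."""
    hits = []
    for index, char in enumerate(iri):
        if char in SLASH_LOOKALIKES:
            hits.append((index, char, unicodedata.name(char, "UNKNOWN")))
    return hits


def rough_scripts(text):
    """Guess scripts from the first word of each letter's Unicode name.

    This is a crude heuristic (hypothetical helper), not the Unicode
    Script property, which the Python standard library does not expose.
    """
    scripts = set()
    for char in text:
        if char.isalpha():
            name = unicodedata.name(char, "")
            if name:
                scripts.add(name.split()[0])
    return scripts


if __name__ == "__main__":
    # U+2215 DIVISION SLASH standing in for "/" after the host name.
    print(find_slash_lookalikes("http://example.com\u2215path"))
    # Latin letters with one Cyrillic "a" (U+0430) mixed in.
    print(rough_scripts("p\u0430ypal"))
```

Even a check like this only supports a warning to the user. As point (4) says, it cannot be comprehensive, and whether to show such warnings at all is a design question without consensus.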
Received on Tuesday, 2 March 2010 21:19:03 UTC