- From: Paul Hoffman / IMC <phoffman@imc.org>
- Date: Tue, 8 Apr 2003 08:14:57 -0700
- To: public-iri@w3.org
[[ I sent this to Martin a few weeks ago, but we agreed that it was best to bring up on this new list. And, just to be clear, I don't think that IRIs should be debated to death, but the document needs to be clear. ]]

I think the document is fairly good, although I'm not much of a URI person, so I could be way off. After studying it this morning, I see where I got confused. I also disagree with some technical choices you make later in the spec.

Clarifications:

Section 1.1 needs to be a bit longer, and possibly split into two parts. You need more emphasis here that you are describing something that will go into protocol elements. In addition, you need an explicit discussion here about the difference between characters and encoded characters. You have this in section 2, but it is so important to understanding the applicability, it needs to be in 1.1.

Subsection (c) in 1.2 is a mess and is probably where I really lost it. The first sentence has too many subordinate clauses in it. But worse, you introduce UTF-8. That's where I got confused about characters vs. encoding. I still don't know why UTF-8 is brought up here. I propose that you start over on this subsection.

The last paragraph of 1.2 is confusing in the middle where you talk about UTF-8. 0xE9 is not the representation of a UTF-8 character. Even though the example is wrong, it got me stuck in UTF-8 mode, which helped get me stuck in thinking that you were talking sometimes about the encoding.

Technical issues:

You use NFC in Section 3.1. This goes against the theme of the guidelines in section 6. NFKC will cause less surprise if an IRI contains compatibility characters, so you should use NFKC instead, regardless of the history of NFC in the W3C.

I do not understand the logic of having Variants (B) and (C) in step 1 in section 3.1. One is normalized, the other one isn't. Doesn't this sound like a recipe for disaster? Why did you differentiate between these two cases?
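
The 0xE9 and NFC/NFKC points above can be made concrete with a small sketch. It is only an illustration, written in Python with the standard unicodedata module. It assumes the character behind the 0xE9 example is é (U+00E9), which the message does not state, and the sample strings and helper names (is_valid_utf8, normalize_both) are hypothetical and do not come from the draft.

```python
import unicodedata

# Assumption: the character behind the 0xE9 example is U+00E9 (e with acute).
E_ACUTE = "\u00e9"

# The same abstract character, serialized under two different encodings.
utf8_bytes = E_ACUTE.encode("utf-8")      # two bytes: 0xC3 0xA9
latin1_bytes = E_ACUTE.encode("latin-1")  # one byte: 0xE9


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the byte string decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def normalize_both(text: str) -> tuple[str, str]:
    """Return the (NFC, NFKC) normalizations of text."""
    return (
        unicodedata.normalize("NFC", text),
        unicodedata.normalize("NFKC", text),
    )


if __name__ == "__main__":
    # A lone 0xE9 byte is the code point value (and the Latin-1 byte),
    # not a well-formed UTF-8 sequence.
    print(utf8_bytes.hex(), latin1_bytes.hex(), is_valid_utf8(b"\xe9"))

    # Canonical equivalence: both forms compose "e" + combining acute.
    print([hex(ord(c)) for s in normalize_both("e\u0301") for c in s])

    # Compatibility character: NFC keeps the "fi" ligature (U+FB01),
    # while NFKC folds it to the two letters "f" and "i".
    nfc, nfkc = normalize_both("\ufb01le")
    print([hex(ord(c)) for c in nfc], [hex(ord(c)) for c in nfkc])
```

The ligature case is the kind of compatibility character the NFKC argument is about: under NFC two visually similar identifiers stay distinct, while under NFKC they fold to the same string.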

Does the bidi processing in section 4 match what is specified in Nameprep? If not, are there cases where a stand-alone IDN name will be displayed differently than the same name in an IRI? That would be a complete show-stopper, if true.

I think that section 5.1 (b) is a bad mistake. The four reasons you give are not strong enough for what seems like something that can cause huge conversion problems. I can also see this causing security problems.

--Paul Hoffman, Director
--Internet Mail Consortium
Received on Tuesday, 8 April 2003 12:06:08 UTC