- From: John Kemp <john.kemp@nokia.com>
- Date: Thu, 21 May 2009 13:33:19 -0400
- To: ext Dan Connolly <connolly@w3.org>
- Cc: "www-tag@w3.org" <www-tag@w3.org>
Hello Dan,

On May 21, 2009, at 11:52 AM, ext Dan Connolly wrote:

[...]

> http://www.w3.org/2001/tag/group/track/actions/265
>
> In particular...
>
> http://www.w3.org/html/wg/href/elab.html
> http://www.w3.org/html/wg/href/elab10.html
>
[...]
>
> The issues covered are
>
> Space in Path
> Colon in path
> Non-ASCII characters in path
> Non-ASCII characters in path and query/search
>
> Larry, I showed you an earlier draft and you weren't too
> excited. I still find this is the way my brain needs
> to capture issues.
>
> John, could you take a look at see if I'm making sense, at least?

It makes sense in that I think I understood your test cases, and (somewhat) their relationship to the issue at hand.

Summarizing my (basic) understanding:

* Links are good, and a basic feature of the Web

* However, links are used in different contexts (for example, a link in an HTML href is then used to make an HTTP request, specifically in the case of an HTML form submission) ... and the characters, character set and encoding used in one context may not be appropriate in another context

* Some specifications defer to RFC3986 for URI encoding rules. RFC3986 defers to scheme specifications, in particular with regard to "reserved characters". None of the relevant specifications say anything about the use of IRIs in links (correct?)

Your examples appear to indicate:

i) That a space is not allowed in the path component of an HTTP request, but a space should be escaped as %20 in HTML, as specified by RFC3986

ii) That a colon in the path creates a link which is not useful outside of the context of the document within which it appears (at least, I _think_ that's what you mean here?)

iii) That URIs only allow US-ASCII characters per RFC3986

I'm not totally sure how these relate directly (other than as a result of adhering to the URI specification rules instead of the IRI specification rules) to the issue we discussed on the 7th. Paraphrasing Tim's description, hopefully not too terribly, that issue is that a document encoded in one character set may contain a link which contains characters encoded with a different character set, in particular when that link is used in a form submission.

After reading what I've written, my general feedback seems to be that your examples are interesting and look like they might be relevant, but could probably be better placed into some context. In this email I've attempted to provide the (hopefully not too oversimplified) context in which I feel your examples make sense.

Does that make sense to you?

Cheers,

- johnk
Received on Thursday, 21 May 2009 17:34:21 UTC