- From: Bjoern Hoehrmann <derhoermi@gmx.net>
- Date: Tue, 19 Feb 2013 20:15:38 +0100
- To: iab@iab.org
Hi,

Re <http://tools.ietf.org/html/draft-iab-identifier-comparison-07>, in section 3.3 there is

  Also, when a URI is embedded in plain text (e.g., an email message),
  there is an additional concern because there is no termination
  criterion for a URI. For example, consider
  http://unicode.org/cldr/utility/list-unicodeset.jsp?a=a&amp;g=gc.
  Some applications that detect URIs will stop before the first '.' in
  the path, while others go to last '.', and yet others may stop at
  the ';'. As another point of comparison, Section 2.37 of [EE] (a
  standard for history citations) specifies the use of a space after a
  URI and before the punctuation.

It's unclear to me whether the `&amp;` in there is intentional or an encoding error. If it is intentional, that should be made very explicit. I also find the claim a bit dubious: STD 66 quite clearly recommends using <> around URIs, and you could use white space as well. More generally, this seems a bit far-fetched as an issue in "comparison"; it is more a discussion of applying heuristics to extract data from ambiguous text. Perhaps the document can do without this paragraph.

Section 3.1 on hostnames seems to be missing the issue of "example.com" versus "example.com." with a trailing full stop; it might be useful to mention it there.

In section 3.3.2.3,

  [RFC3986] defines the userinfo production that allows arbitrary data
  about the user of the URI to be placed before '@' signs in URIs. For
  example: "http://alice:bob:chuck@example.com/bar" has the value
  "alice:bob:chuck" as its userinfo. [...]

This is somewhat misleading, as it fails to mention that while the generic syntax allows this, individual schemes like the HTTP scheme, as currently defined in RFC 2616, do not. It might be better to pick a scheme that actually allows this form.

In section 3.3.3,

  [RFC3986] supports the use of path segment values such as "./" or
  "../" for relative URIs. Strictly speaking, including such path
  segment values in a fully qualified URI is syntactically illegal but
  [RFC3986] section 4.1 nevertheless defines an algorithm to remove
  them.

This should include a reference to STD 66 indicating where it defines them as illegal (I could not find that myself, so the text might be mistaken).

The reference [TR36] should link to http://www.unicode.org/reports/tr36/ or some other suitable address (currently it does not link to anything).

regards,
-- 
Björn Höhrmann · mailto:bjoern@hoehrmann.de · http://bjoern.hoehrmann.de
Am Badedeich 7 · Telefon: +49(0)160/4415681 · http://www.bjoernsworld.de
25899 Dagebüll · PGP Pub. KeyID: 0xA4357E78 · http://www.websitedev.de/
Received on Tuesday, 19 February 2013 19:16:08 UTC