- From: <drtr1@cam.ac.uk>
- Date: Mon, 6 Feb 95 16:41 GMT
- To: uri@bunyip.com
- Cc: drtr1@cus.cam.ac.uk
There seem to be some differences in the URL definitions contained in
the draft and in RFC 1738; it is certainly confusing on first reading
these documents (with a view to writing a URL parser). Maybe this is
because they are defining subtly different objects, although both BNFs
define a 'url'.

1. Are national characters allowed in a URL?

This seems the most significant difference. RFC 1738 has

    unreserved = alpha | digit | safe | extra

whereas the draft (draft-ietf-uri-relative-url-05.txt) has

    unreserved = alpha | digit | safe | extra | national

Hence the draft allows national characters in most parts of most URLs,
whereas the RFC does not.

2. file, ftp and http cannot _always_ be parsed using the generic-RL
   syntax.

In section 2.3, the draft states:

> Finally, the following schemes can always be parsed using the
> generic-RL syntax.
>
>    file     Host-specific Files
>    ftp      File Transfer Protocol
>    http     Hypertext Transfer Protocol
>    nntp     USENET news using NNTP access

The generic-RL syntax has a path element defined as

    segment = *pchar
    pchar   = uchar | ":" | "@" | "&" | "="

with ";" and "?" reserved for delimiting the params and query. However,
the RFC allows ";" in an http path segment, and "?" in an ftp or file
path segment.

In fact, this is not much of a problem if you do not assert that these
schemes can _always_ be parsed using the generic-RL syntax.

David Robinson. (drtr@ast.cam.ac.uk)
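
A minimal sketch of point 2, assuming a simplified reading of the
generic-RL syntax in which the first "?" starts the query and the first
";" before it starts the params. The helper name split_generic_rl is
hypothetical and appears in neither document; fragments and URLs without
"//" are ignored for brevity.

```python
# Sketch only: split_generic_rl is a hypothetical helper, not taken from
# the draft or from RFC 1738. It handles just the scheme://net_loc/...
# form and ignores fragments.

def split_generic_rl(url):
    """Return (scheme, net_loc, path, params, query) for a scheme://... URL."""
    scheme, sep, rest = url.partition("://")
    if not sep:
        raise ValueError("expected a URL of the form scheme://...")
    net_loc, _, rest = rest.partition("/")
    # Assumed order: the first "?" delimits the query, then the first ";"
    # in what remains delimits the params.
    rest, _, query = rest.partition("?")
    path, _, params = rest.partition(";")
    return scheme, net_loc, path, params, query


# RFC 1738 allows ";" inside an http path segment, but the generic
# split treats everything after it as params.
http_parts = split_generic_rl("http://host/a;b/c")

# RFC 1738 allows "?" inside an ftp path segment, but the generic
# split treats everything after it as a query.
ftp_parts = split_generic_rl("ftp://host/what?.txt")
```

Under this sketch the http example yields the path "a" with params
"b/c", and the ftp example yields the path "what" with query ".txt",
whereas under the RFC 1738 scheme-specific rules "a;b/c" and "what?.txt"
are each a complete path.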
Received on Monday, 6 February 1995 11:42:06 UTC