- From: Tomas Rokicki <rokicki@instantis.com>
- Date: Sat, 02 Jun 2001 12:22:41 +0900
- To: uri@w3.org
RFC 2396 contains the following BNF for the host part of a URI:
   host        = hostname | IPv4address
   hostname    = *( domainlabel "." ) toplabel [ "." ]
   domainlabel = alphanum | alphanum *( alphanum | "-" ) alphanum
   toplabel    = alpha | alpha *( alphanum | "-" ) alphanum
   IPv4address = 1*digit "." 1*digit "." 1*digit "." 1*digit
   port        = *digit
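
For concreteness, the host productions can be transcribed into a regular
expression. This is only a sketch: the names HOST and matches_rfc2396_host
are made up for illustration, and it checks syntax only, not what any
resolver would do with the string.

```python
import re

# Direct transcription of the RFC 2396 host productions quoted above.
ALPHANUM = r"[A-Za-z0-9]"
DOMAINLABEL = rf"(?:{ALPHANUM}(?:[A-Za-z0-9-]*{ALPHANUM})?)"
TOPLABEL = rf"(?:[A-Za-z](?:[A-Za-z0-9-]*{ALPHANUM})?)"
HOSTNAME = rf"(?:{DOMAINLABEL}\.)*{TOPLABEL}\.?"
IPV4ADDRESS = r"[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+"
HOST = re.compile(rf"(?:{HOSTNAME}|{IPV4ADDRESS})")


def matches_rfc2396_host(host: str) -> bool:
    """Return True if `host` is derivable from the `host` production."""
    return HOST.fullmatch(host) is not None


if __name__ == "__main__":
    for candidate in ("www.w3.org", "63.197.151.31", "63.197.151.037",
                      "63.197.38687", "1069913887"):
        print(candidate, matches_rfc2396_host(candidate))
```

Under this grammar a host made only of digits and dots must have exactly
four parts, but nothing limits the size of a part or forbids leading zeros.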
Typical implementations use // and / to locate the hostport part, split it
into host and port, and call gethostbyname() to resolve the IP address.
gethostbyname(), however, accepts quite a different syntax, allowing IP
addresses such as
http://63.197.151.31/ (as above; class C syntax)
http://63.197.151.037/ (leading zero means octal, but still within
the BNF above)
http://63.197.38687/ (two-dot notation; class B syntax)
http://63.12949279/ (one-dot notation; class A syntax)
http://1069913887/ (numeric IP syntax)
and of course all combinations of the above, including
http://07761313437/ (octal)
http://000000077.0000000305.000000000227.00000000037/ (leading zeros)
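
The equivalence of these forms can be checked with a small parser. The
sketch below assumes the classic BSD inet_aton() conventions (one to four
parts, the last part filling all remaining bytes, a leading 0 meaning octal
and a leading 0x meaning hex), which is what resolvers typically apply to
numeric input; actual behaviour varies by platform. The function names
parse_part and loose_ipv4 are hypothetical.

```python
import string


def parse_part(text: str) -> int:
    """Parse one component C-style: 0x prefix = hex, leading 0 = octal."""
    if text[:2].lower() == "0x":
        digits, base, allowed = text[2:], 16, string.hexdigits
    elif len(text) > 1 and text[0] == "0":
        digits, base, allowed = text[1:], 8, string.octdigits
    else:
        digits, base, allowed = text, 10, string.digits
    if not digits or any(c not in allowed for c in digits):
        raise ValueError(f"bad address component: {text!r}")
    return int(digits, base)


def loose_ipv4(host: str) -> str:
    """Normalise an inet_aton-style address to plain dotted decimal."""
    parts = [parse_part(p) for p in host.split(".")]
    if len(parts) > 4:
        raise ValueError("too many components")
    *head, last = parts
    if any(p > 0xFF for p in head):
        raise ValueError("leading component does not fit in one byte")
    # The last component fills whatever bytes the leading ones left over.
    if last >= 1 << (8 * (5 - len(parts))):
        raise ValueError("last component too large")
    value = last
    for i, p in enumerate(head):
        value |= p << (8 * (3 - i))
    return ".".join(str((value >> shift) & 0xFF) for shift in (24, 16, 8, 0))


if __name__ == "__main__":
    for host in ("63.197.151.31", "63.197.151.037", "63.197.38687",
                 "63.12949279", "1069913887", "07761313437",
                 "000000077.0000000305.000000000227.00000000037"):
        print(host, "->", loose_ipv4(host))
```

Under these assumptions every address listed above normalises to
63.197.151.31.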
I have two points. First, the implementations are out of sync with the
specification. Does this matter? Second, one can argue that the
implied semantics of the BNF given above for a four-dot representation
is a decimal interpretation, whereas the implementations use octal if any
component of the IP address begins with a leading zero (unlike what happens
for the port, where http://63.197.151.31:0000000080/ accesses port 80).
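
The two readings can be put side by side. This is a minimal sketch with
hypothetical function names; the "C-style" reading assumes the usual
convention that a leading zero selects octal.

```python
def decimal_reading(component: str) -> int:
    """Read the digits as plain decimal, ignoring leading zeros."""
    return int(component, 10)


def c_style_reading(component: str) -> int:
    """Read the digits C-style: a leading zero selects octal."""
    if len(component) > 1 and component.startswith("0"):
        return int(component, 8)  # raises ValueError on digits 8 or 9
    return int(component, 10)


if __name__ == "__main__":
    print(decimal_reading("037"), c_style_reading("037"))
    print(decimal_reading("0000000080"))
```

For the last component of 63.197.151.037 the decimal reading gives 37 and
the C-style reading gives 31. The port 0000000080 reads as 80 in decimal
and is not a valid octal number at all, since it contains the digit 8.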
Any comments? This probably isn't really important, but I thought it
amusing.
-tom
Received on Friday, 1 June 2001 23:23:54 UTC