Re: uri handling of hosts is too restrictive

On Fri, 2004-02-06 at 02:46, Graham Klyne wrote:
> DNS specifies a design for a very general distributed lookup system, not 
> just host names, so it's not appropriate to look to that for the 
> appropriate range of allowed characters.
Yep, RFC 2181 says: "11. Name syntax

   Occasionally it is assumed that the Domain Name System serves only
   the purpose of mapping Internet host names to data, and mapping
   Internet addresses to host names.  This is not correct, the DNS is a
   general (if somewhat limited) hierarchical database, and can store
   almost any kind of data, for almost any purpose."
> I understand the appropriate specifications are RFC 952, as modified by RFC 
> 1123:
RFC 2181 section 11 references RFC 1123 section 6.1.3.5, which references
RFC 1123 section 2.1, which refers to RFC 952. So it is a strong
recommendation.

Note, though, that RFC 1537 refers to both RFC 822 and RFC 1034:
"8. Hostnames

   People appear to sometimes look only at STD 11, RFC 822 to determine
   whether a particular hostname is correct or not. Hostnames should
   strictly conform to the syntax given in STD 13, RFC 1034 (page 11),
   with *addresses* in addition conforming to RFC 822. As an example
   take "c&w.blues" which is perfectly legal according to RFC 822, but
   which can have quite surprising effects on particular systems, e.g.,
   "telnet c&w.blues" on a Unix system."
RFC 1034
"3.5. Preferred name syntax

The DNS specifications attempt to be as general as possible in the rules
for constructing domain names.  The idea is that the name of any
existing object can be expressed as a domain name with minimal changes.
However, when assigning a domain name for an object, the prudent user
will select a name which satisfies both the rules of the domain system
and any existing rules for the object, whether these rules are published
or implied by existing programs.

For example, when naming a mail domain, the user should satisfy both the
rules of this memo and those in RFC-822.  When creating a new host name,
the old rules for HOSTS.TXT should be followed.  This avoids problems
when old software is converted to use domain names."
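
To make the "preferred" syntax concrete, here is a rough Python sketch of
the RFC 1034 section 3.5 label rule: a letter first, a letter or digit
last, only letters, digits and hyphens in between, and at most 63
characters per label. The names PREFERRED_LABEL and is_preferred_name are
my own, not from any RFC or library, and the sketch ignores the limit on
the length of the whole name.

```python
import re

# RFC 1034 section 3.5 "preferred" label: starts with a letter, ends with a
# letter or digit, interior is letters, digits, or hyphens; 63 chars at most.
# (Hypothetical helper names, for illustration only.)
PREFERRED_LABEL = re.compile(r"[A-Za-z](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?")


def is_preferred_name(name: str) -> bool:
    """Return True if every dot-separated label uses the preferred syntax."""
    labels = name.split(".")
    return all(PREFERRED_LABEL.fullmatch(label) is not None for label in labels)


if __name__ == "__main__":
    for candidate in ("example.com", "c&w.blues", "stephen_pollei.home.comcast.net"):
        print(candidate, is_preferred_name(candidate))
```

Note that RFC 1123 section 2.1 relaxes the first-character rule to allow a
digit; this sketch deliberately sticks to the older, stricter form, so it
also rejects names like "3com.com".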

RFC 822
     domain      =  sub-domain *("." sub-domain)

     sub-domain  =  domain-ref / domain-literal

     domain-ref  =  atom                         ; symbolic reference

     atom        =  1*<any CHAR except specials, SPACE and CTLs>

     domain-literal =  "[" *(dtext / quoted-pair) "]"

     dtext       =  <any CHAR excluding "[",     ; => may be folded
                     "]", "\" & CR, & including
                     linear-white-space>

     quoted-pair =  "\" CHAR                     ; may quote any char

     specials    =  "(" / ")" / "<" / ">" / "@"  ; Must be in quoted-
                 /  "," / ";" / ":" / "\" / <">  ;  string, to use
                 /  "." / "[" / "]"              ;  within a word.

     CTL         =  <any ASCII control           ; (  0- 037,  0.- 31.)
                     character and DEL>          ; (    0177,     127.)

     CHAR        =  <any ASCII character>        ; (  0-0177,  0.-127.)
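
As a quick illustration of how permissive that grammar is, here is a
minimal Python sketch of the domain-ref branch only. It is an assumption
on my part that skipping domain-literal is acceptable for this purpose,
and the names SPECIALS, is_rfc822_atom and is_rfc822_domain are made up
for the example rather than taken from any library.

```python
# RFC 822 "specials": characters that may not appear in an atom.
SPECIALS = set('()<>@,;:\\".[]')


def is_rfc822_atom(text: str) -> bool:
    """atom = 1*<any CHAR except specials, SPACE and CTLs>"""
    if not text:
        return False  # "1*" means at least one character
    for ch in text:
        code = ord(ch)
        if code > 127:                  # outside CHAR (ASCII 0-127)
            return False
        if code <= 31 or code == 127:   # CTL: control characters and DEL
            return False
        if ch == " " or ch in SPECIALS:
            return False
    return True


def is_rfc822_domain(name: str) -> bool:
    """domain = sub-domain *("." sub-domain), with domain-ref only.

    Domain literals such as "[10.0.0.1]" are not handled in this sketch.
    """
    return all(is_rfc822_atom(part) for part in name.split("."))


if __name__ == "__main__":
    for candidate in ("c&w.blues", "stephen_pollei.home.comcast.net", "a b.example"):
        print(candidate, is_rfc822_domain(candidate))
```

Both "c&w.blues" and a name containing an underscore pass this check,
since neither "&" nor "_" is among the specials.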

So it's my understanding that lots of names are legal, just not
recommended. There are a lot of things that would read SHOULD in
newer-style RFCs. Please also note the word "Preferred". Also at issue is
that the URI spec SHOULD be neutral as to which host name lookup
technologies and restrictions a particular URI resolution implementation
chooses to use. I might use DNS, host tables, yp/NIS/NIS+, etc. I
might have names from a legacy system, or I might be some goofball that
just wants an innocuous _ ;->
-- 
http://dmoz.org/profiles/pollei.html
http://sourceforge.net/users/stephen_pollei/
http://slashdot.org/~joe_plastic/
http://stephen_pollei.home.comcast.net/
GPG Key fingerprint = EF6F 1486 EC27 B5E7 E6E1  3C01 910F 6BB5 4A7D 9677

Received on Friday, 6 February 2004 15:15:59 UTC