Re: URL character set

Roy T. Fielding (fielding@kiwi.ics.uci.edu)
Fri, 06 Mar 1998 18:30:22 -0800


To: "Sam X. Sun" <ssun@CNRI.Reston.VA.US>
cc: Larry Masinter <masinter@parc.xerox.com>, uri@Bunyip.Com
In-reply-to: Your message of "Fri, 06 Mar 1998 06:14:31 EST."
             <00ca01bd48f1$0d3a88a0$d7019784@ssun2.CNRI.Reston.Va.US> 
Date: Fri, 06 Mar 1998 18:30:22 -0800
From: "Roy T. Fielding" <fielding@kiwi.ics.uci.edu>
Message-ID:  <9803061830.aa11664@paris.ics.uci.edu>
Subject: Re: URL character set 

>Again, I'm confused about what the URI syntax is for. Is it for what is
>specified in the HTML document, or is it for the URI that gets transferred
>over the wire?

Actually, neither one.  HTML href attributes are CDATA, and thus a URI
in HTML may contain character references such as &amp; and &#050;
*representing* the URI characters.  Likewise, what goes on the wire
depends on which protocol and field element you look at.
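
To make the HTML case concrete, here is a minimal sketch using Python's
standard html module.  The href value is a made-up example (example.com
and the query parameters are assumptions, not taken from any real page).
It only shows that the text in the HTML source is a representation that
has to be decoded before you have the URI characters.

```python
import html

# Hypothetical href value as it might appear in HTML source (illustrative only).
href_in_source = "http://example.com/find?a=1&amp;b=&#050;"

# Decoding the character references recovers the URI characters themselves:
# "&amp;" stands for "&", and "&#050;" is decimal code 50, the digit "2".
uri = html.unescape(href_in_source)

print(href_in_source)
print(uri)
```

The first printed line is what sits in the document; the second is the
URI those characters represent.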

The closest thing to the URI syntax is what gets placed in a plain
text file, but even then it is subject to the file's charset.
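
As a rough illustration of the charset point, the sketch below encodes
one URI string under three charsets available in Python's standard codec
set.  The URI and the choice of charsets are assumptions picked only for
the example.  The characters are identical each time; the octets stored
in the file are not.

```python
# Hypothetical URI, used only for illustration.
uri = "http://example.com/a%20b"

# The same sequence of URI characters, stored under three different charsets.
charsets = ("ascii", "utf-16-le", "cp037")  # cp037 is an EBCDIC code page
encoded = {name: uri.encode(name) for name in charsets}

for name, octets in encoded.items():
    # Show the octet count and the first four octets in hex for each charset.
    print(name, len(octets), octets[:4].hex())

# Decoding with the matching charset gives back the same characters.
decoded = {name: octets.decode(name) for name, octets in encoded.items()}
```

Reading any of these files back with the wrong charset would yield
something other than the URI that was written.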

....Roy