- From: Christopher R. Maden <crm@ebt.com>
- Date: Mon, 26 Aug 1996 14:37:53 GMT
- To: www-html@w3.org
Foteos Macrides writes about URL attributes.

There is a great deal of confusion about the difference between an "attribute value specification" (AVS, for this mail message), and an "attribute value" (AV). A marked-up document contains only AVSs. After, and only after, the AVS has been parsed, is an AV arrived at, and passed by the parser to the application.

All AVSs are replaceable character data (RCD), in which entity references should be resolved. In RCD, any ampersand followed by a name start character is an entity reference. If it is not a reference to a defined entity, it is an error. After resolution of any entities, including the handling of any errors, an AV is arrived at, and passed to the application.

Practical application:

In an HTML document, the string in quotes following the string "href=" is an AVS. Any entity references should be recognized and resolved. Any string of &[a-zA-Z] that is not a known entity reference is an error. Preferably, the unknown reference should be kept as data. After the resolution of entities, the AV is passed to the application, e.g., Lynx. This AV should be a valid URL, in the case of the <a href=""> attribute. The AVS does *not* have to be a valid URL.

In a URL, i.e., in the resolved AV, URL-significant characters should be hex-escaped. In the href= attribute, i.e., in the AVS, SGML-significant characters should be entity-escaped.
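The AVS-to-AV step described above can be sketched in a few lines of Python. This is only an illustration, not how Lynx or any real parser does it: the function name resolve_avs, the four-entry entity table, and the choice to treat the closing ";" as optional are all assumptions made for the example.

```python
import re

# Small, illustrative entity table (an assumption); real HTML defines many more.
ENTITIES = {"amp": "&", "lt": "<", "gt": ">", "quot": '"'}

# An ampersand followed by a name start character opens an entity reference.
# The trailing ";" is treated as optional here, a simplification of SGML's rules.
_REF = re.compile(r"&([a-zA-Z][a-zA-Z0-9.-]*);?")


def resolve_avs(avs, entities=ENTITIES):
    """Resolve an attribute value specification (AVS) to an attribute value (AV).

    Returns (av, errors), where errors lists the names of unknown entities.
    Unknown references are kept in the AV as literal data.
    """
    errors = []

    def replace(match):
        name = match.group(1)
        if name in entities:
            return entities[name]
        errors.append(name)      # report the error...
        return match.group(0)    # ...but keep the unknown reference as data

    return _REF.sub(replace, avs), errors
```

With this sketch, resolve_avs("x?a=1&amp;b=2") gives the AV "x?a=1&b=2" with no errors, while the unescaped "x?a=1&b=2" gives the same AV but reports "b" as an unknown entity. Because the ";" is optional here, an AVS containing "&amp=" resolves to "&=", which is one way an unescaped parameter name can silently disappear.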
If I have a script on a DOS-based server called moe&larry, I need to hex-escape the ampersand, because it is a literal, not a semantic character:

  <URL:http://www.mycom.com/cgi-bin/moe%26larry>

If I want to pass that script arguments of guitar=fender and amp=g-k, I do *not* escape the ampersand, because it is a semantic character in the URL:

  <URL:http://www.mycom.com/cgi-bin/moe%26larry?guitar=fender&amp=g-k>

If I want to encode this URL as a CDATA attribute in an SGML document, say, as the href attribute of an <a> element in an HTML document, I must turn the ampersand into an entity reference, so that the AVS, *after* parsing and resolution to an AV, will correspond to the same URL.

  <a href="http://www.mycom.com/cgi-bin/moe%26larry?guitar=fender&amp;amp=g-k">

A proper HTML parser will turn this AVS into an AV - a URL:

  http://www.mycom.com/cgi-bin/moe%26larry?guitar=fender&amp=g-k

When the link is selected, an HTTP connection will be established with the server "www.mycom.com", and the HTTP request sent:

  GET /cgi-bin/moe%26larry?guitar=fender&amp=g-k HTTP/1.0

The server will run the script "moe&larry", and pass it the parameter string "guitar=fender&amp=g-k". Most CGI scripts will use the & to separate parameters, so this one would see guitar=fender and amp=g-k.

I know this explanation was pedantic, but I hope it helped at least one person understand the relationship between URLs and the href attribute.

-Chris
-- 
<!NOTATION SGML.Geek PUBLIC "-//GCA//NOTATION SGML Geek//EN">
<!ENTITY crism PUBLIC "-//EBT//NONSGML Christopher R. Maden//EN" SYSTEM
"<URL>http://www.ebt.com <TEL>+1.401.421.9550 <FAX>+1.401.521.2030
<USMAIL>One Richmond Square, Providence, RI 02906 USA" NDATA SGML.Geek>
Received on Monday, 26 August 1996 10:46:28 UTC