- From: Larry W. Virden <lvirden@cas.org>
- Date: Wed, 17 Oct 2001 07:28:20 -0400 (EDT)
- To: <html-tidy@w3.org>
From: Bjoern Hoehrmann <derhoermi@gmx.net>
> * Live with the recommended UTF-8/URI escaping
>   (see e.g. http://www.w3.org/International/O-URL-and-ident.html)
:
> Tidy is required to escape URIs like it does by various specifications,
> especially HTML 4 and http://www.w3.org/TR/charmod/ I am sorry if this
> causes any trouble (I haven't checked this for mailto:-URIs), but
> non-ASCII characters are invalid in URIs and you shouldn't have used
> them.

Does anyone know of a technical document that discusses how a program
parsing HTML should behave when it encounters invalid escapes (entity
references), and what the acceptable alternatives for handling them are?

For instance, if a program hits the HTML string

<A HREF="http://www.somestory.com/story1.html">hit&run accident</a>

what are the recommended (or perhaps required) behaviors in interpreting
&run?

Some applications seem to leave the text alone, some delete the invalid
escape, and some replace the escape with an 'error' character...

Are all of these 'correct' behaviors?

-- 
Never apply a Star Trek solution to a Babylon 5 problem.
Larry W. Virden <mailto:lvirden@cas.org> <URL: http://www.purl.org/NET/lvirden/>
Even if explicitly stated to the contrary, nothing in this posting
should be construed as representing my employer's opinions.
-><-
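
To make the three behaviors in the question concrete, here is a minimal Python sketch. It is only an illustration and does not say which behavior any specification requires. The function name `resolve_named_refs` and the policy names "keep", "drop" and "replace" are hypothetical. Using U+FFFD as the 'error' character is an assumption, as is treating the trailing semicolon as optional. Numeric references such as &#38; are not handled.

```python
import re
from html.entities import name2codepoint  # HTML 4 entity names, e.g. "amp" -> 38

# Matches a named reference such as "&amp;" or "&run".
# Assumption for this sketch: the trailing semicolon is optional.
_NAMED_REF = re.compile(r"&([A-Za-z][A-Za-z0-9]*);?")


def resolve_named_refs(text, policy="keep"):
    """Decode known named references; treat unknown ones according to policy.

    policy (hypothetical names):
      "keep"    - leave the unknown reference untouched
      "drop"    - delete the unknown reference
      "replace" - substitute U+FFFD as an 'error' character
    """
    if policy not in ("keep", "drop", "replace"):
        raise ValueError("unknown policy: %r" % (policy,))

    def handle(match):
        name = match.group(1)
        if name in name2codepoint:
            # Known HTML 4 entity: decode it.
            return chr(name2codepoint[name])
        if policy == "keep":
            return match.group(0)
        if policy == "drop":
            return ""
        return "\ufffd"

    return _NAMED_REF.sub(handle, text)


if __name__ == "__main__":
    for policy in ("keep", "drop", "replace"):
        print(policy, repr(resolve_named_refs("hit&run accident", policy)))
```

With the example string from the message, "keep" leaves `hit&run accident` as it is, "drop" yields `hit accident`, and "replace" puts U+FFFD where `&run` was.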
Received on Wednesday, 17 October 2001 07:28:52 UTC