- From: Chris Beall <Chris_Beall@prodigy.net>
- Date: Wed, 1 Mar 2006 18:13:42 -0500
- To: "Amaya users" <www-amaya@w3.org>
> -----Original Message-----
> From: www-amaya-request@w3.org [mailto:www-amaya-request@w3.org] On
> Behalf Of Chris Beall
> Sent: Wednesday, March 01, 2006 11:58 AM
> To: Amaya users
> Subject: 'Unknown entity' messages within href= URLs containing query
> data
>
> When fed a page containing something like:
> <a href="http://finance.yahoo.com/q/bc?s=IBM&t=5y">
> Amaya places into the PARSING.ERR file the message "line nnn, char nn:
> Unknown entity".
>
> This seems to be because it interprets the '&' within the query data to
> represent the beginning of an entity reference. When it hits the "=", it
> deems the entity reference to have been terminated and further deems "&t"
> to be unknown as an entity reference.
>
> I believe this behavior is incorrect. Referring to
> http://www.gbiv.com/protocols/uri/rfc/rfc3986.html
>
> I see:
> query = *( pchar / "/" / "?" )
>
> where pchar = unreserved / pct-encoded / sub-delims / ":" / "@"
> and sub-delims = "!" / "$" / "&" / "'" / "(" / ")"
>                / "*" / "+" / "," / ";" / "="
>
> Since the '&' (ampersand) is amongst the sub-delims, I deduce that it
> does NOT need to be percent-encoded when used within the query portion
> of a URI. It seems to be common practice not to encode it.
>
> In addition, between the quotes of an href=, we are no longer dealing with
> HTML, where character entity references live, but with a URI.
>
> It therefore appears to me that Amaya should not look for entity
> references within URIs and should not issue the error message cited above.
>
> Chris Beall

Thanks to Dave Woolley for setting me straight on this. In spite of the fact that the syntax I provided is VERY common in the wild, Amaya is correct to flag it as an error. See http://www.w3.org/TR/html401/appendix/notes.html#h-B.2.2.

The only trick here is that the HTML spec refers to FORM submission, and it may not be obvious that whenever you put a URI containing query data into HTML, you are, in effect, submitting a form from the perspective of the server.

Chris Beall
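For anyone generating such links programmatically, here is a minimal sketch of the fix that section B.2.2 recommends: write the ampersand as &amp; inside the attribute value. It assumes Python and its standard html module, neither of which comes from the original report; the helper name href_attribute is hypothetical. The URI itself stays unchanged; only its HTML representation is escaped, and an HTML parser turns &amp; back into & before the request is made.

```python
from html import escape, unescape

def href_attribute(url: str) -> str:
    """Escape a URI for use inside a double-quoted HTML attribute value.

    Hypothetical helper for illustration: html.escape rewrites '&' as
    '&amp;' (and also escapes '<', '>' and quotes), so the parser no
    longer sees the start of an unknown entity reference.
    """
    return escape(url, quote=True)

# The URI from the original report, with a raw '&' in the query part.
url = "http://finance.yahoo.com/q/bc?s=IBM&t=5y"

attr = href_attribute(url)
anchor = f'<a href="{attr}">IBM chart</a>'
print(anchor)

# Decoding the attribute value, as an HTML parser does, recovers the
# original URI, so the server receives the same query string.
assert unescape(attr) == url
```

The escaping applies only to the HTML source. The address the browser actually requests still contains a plain '&', which is consistent with the RFC 3986 grammar quoted above.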
Received on Wednesday, 1 March 2006 23:16:37 UTC