Re: Amaya XHTML chokes on URL ampersand delimiters

/Steve White/:

> In pages to be parsed as XHTML, Amaya fails on URLs containing 
> ampersand ("&") delimiters.
> 
> For example, <a href="http://a.b.org/c?d&e">URL with search string</a>
> 
> Amaya complains "not well-formed (invalid token)", after the token 
> following the first ampersand.

That markup really is not well-formed; see the XML specification. 
As in SGML, & is a markup character (it delimits general entity 
references), so when you want it to appear as part of character 
data or an attribute value you need to write the built-in general 
entity reference &amp; instead. The same goes for the other 
markup-significant characters: &lt; for <, &gt; for >, &quot; for ", 
and &apos; for '. Your example should read 
href="http://a.b.org/c?d&amp;e".
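
To see the rule in action outside Amaya, here is a minimal sketch in 
Python. It is only an illustration, not how Amaya works internally: 
the helper try_parse is a made-up name, and the exact wording of the 
error depends on the parser. It feeds the example link to a 
conforming XML parser twice, once with the raw & and once with the 
attribute value escaped by xml.sax.saxutils.quoteattr.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import quoteattr

url = "http://a.b.org/c?d&e"


def try_parse(markup):
    """Hypothetical helper: return (element, None) or (None, error message)."""
    try:
        return ET.fromstring(markup), None
    except ET.ParseError as exc:
        return None, str(exc)


# Raw ampersand inside the attribute value: an XML parser must reject this.
bad = '<a href="' + url + '">URL with search string</a>'
bad_elem, bad_error = try_parse(bad)

# quoteattr() escapes the ampersand and adds the surrounding quotes.
good = "<a href=" + quoteattr(url) + ">URL with search string</a>"
good_elem, good_error = try_parse(good)

print(bad_error)              # the parser's well-formedness complaint
print(good)                   # markup with &amp; in the href
print(good_elem.get("href"))  # the parser hands back the original URL
```

Note that the parsed attribute value is the original URL again: the 
&amp; exists only in the markup, not in the address that gets 
requested.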

> Without the XHTML DOCTYPE, this does not happen.
> 
> For a real-life example, see:
> <a 
> href="http://www.kindernetz.de/oli/tierlexikon/index.php?tid=115&reiter=steckbrief"> 
> Eurasian Coot.</>

I'm not sure what the SGML rules say here (whether malformed or 
undeclared entity references are simply discarded), but without a 
DOCTYPE declaration Amaya falls back to "compatibility" HTML 
parsing, where it just tries to recover from the error you've made.
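
For comparison, a lenient HTML parser typically lets the same link 
through. The sketch below uses Python's html.parser purely as a 
stand-in for that kind of recovery; it says nothing about Amaya's own 
code, and the class name HrefCollector is made up for the example. I 
closed the anchor with </a> here to keep the focus on the ampersand.

```python
from html.parser import HTMLParser


class HrefCollector(HTMLParser):
    """Hypothetical helper: collect the href of every <a> start tag."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.hrefs.extend(value for name, value in attrs if name == "href")


page = (
    '<a href="http://www.kindernetz.de/oli/tierlexikon/index.php'
    '?tid=115&reiter=steckbrief">Eurasian Coot.</a>'
)

collector = HrefCollector()
collector.feed(page)   # no exception: the tolerant parser recovers
collector.close()
print(collector.hrefs)
```

The parser leaves "&reiter" alone because it does not match a known 
entity name, so the URL comes out unchanged. That is error recovery, 
not validity, and a query parameter that happens to start with an 
entity name may not survive it.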

-- 
Stanimir

Received on Sunday, 6 November 2005 14:26:12 UTC