- From: Klaus Johannes Rusch <KlausRusch@atmedia.net>
- Date: Sun, 6 Jan 2002 20:50:28 CET
- To: <html-tidy@w3.org>
- Cc: <derhoermi@gmx.net>
In <200201061911.OAA11308@tux.w3.org>, Klaus Johannes Rusch <KlausRusch@atmedia.net> writes:
> In <3C38860C.000001.28231@cca-laptop>, "Gary J. Piccoli" <gpiccoli@bellatlantic.net> writes:
> > I have attached the HTML output from Access 2000 for a simple file
> > maintenance page. I turned on all the XML options for my Tidy test. I get
> > my first error on line 13. Let me know what you think.
>
> The problem is the attribute which Microsoft Access uses to store an XML
> fragment, this contains a large number of < and > characters, which makes tidy
> assume there is a missing double quote in an attribute.
>
> About the shortest possible test case is
>
> <PARAM NAME="XMLData" VALUE="<xml xmlns:a="urn:schemas-microsoft-com:office:access">
> <a:something></a:something><a:something></a:something></xml>'"">
>
> I have uploaded a Bug report and test case 500236 for this, the fix should be a
> simple change to the heuristics in lexer.c.

This fixes the problem:

diff -r1.57 lexer.c
3089c3089,3091
<            javascript URL scheme which may legitimately include < and >
---
>            javascript URL scheme which may legitimately include < and >,
>            and for attributes starting with "<xml " as generated by
>            Microsoft Office.
3090a3093
>
3092c3095,3096
<                 !(IsUrl(name) && wstrncmp(lexer->lexbuf+start, "javascript:", 11) == 0))
---
>                 !(IsUrl(name) && wstrncmp(lexer->lexbuf+start, "javascript:", 11) == 0) &&
>                 !(wstrncmp(lexer->lexbuf+start, "<xml ", 5) == 0))

Unfortunately since sourceforge is down I cannot upload the changes currently.

-- 
Klaus Johannes Rusch
KlausRusch@atmedia.net
http://www.atmedia.net/KlausRusch/
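
P.S. For readers without the Tidy sources at hand, the changed condition can be sketched as standalone C. This is only an illustration of the two exceptions named in the diff, not the code in lexer.c: the function name may_contain_angle_brackets is made up, the standard strncmp stands in for Tidy's wstrncmp, and a plain is_url flag stands in for IsUrl(name).

```c
#include <string.h>

/* Hypothetical helper, not part of Tidy: returns 1 when an attribute
   value may legitimately contain < and >, so that the missing-quote
   heuristic should be skipped for it, and 0 otherwise. */
static int may_contain_angle_brackets(int is_url, const char *value)
{
    /* Exception that existed before the patch: javascript: URLs. */
    if (is_url && strncmp(value, "javascript:", 11) == 0)
        return 1;

    /* Exception added by the patch: values starting with "<xml ",
       as generated by Microsoft Office. */
    if (strncmp(value, "<xml ", 5) == 0)
        return 1;

    return 0;
}
```

With this sketch, a value beginning with "<xml " or a javascript: URL is exempt from the missing-quote check, while any other value containing < or > is still subject to it.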
Received on Sunday, 6 January 2002 14:50:13 UTC