W3C home > Mailing lists > Public > html-tidy@w3.org > January to March 2002

Re: Tidy and Access 2000

From: Klaus Johannes Rusch <KlausRusch@atmedia.net>
Date: Sun, 6 Jan 2002 20:50:28 CET
Message-Id: <200201061950.OAA14519@tux.w3.org>
To: <html-tidy@w3.org>
Cc: <derhoermi@gmx.net>
In <200201061911.OAA11308@tux.w3.org>, Klaus Johannes Rusch <KlausRusch@atmedia.net> writes:
> In <3C38860C.000001.28231@cca-laptop>, "Gary J. Piccoli" <gpiccoli@bellatlantic.net> writes:
> > I have attached the HTML output from Access 2000 for a simple file
> > maintenance page.  I turned on all the XML options for my Tidy test.  I get
> > my first error on line 13.  Let me know what you think.
> 
> The problem is the attribute which Microsoft Access uses to store an XML
> fragment, this contains a large number of < and > characters, which makes tidy
> assume there is a missing double quote in an attribute.
> 
> About the shortest possible test case is
> 
> <PARAM NAME="XMLData" VALUE="<xml xmlns:a=&quot;urn:schemas-microsoft-com:office:access&quot;>&#13;&#10;
> <a:something></a:something><a:something></a:something></xml>'&quot;">
> 
> I have uploaded a Bug report and test case 500236 for this, the fix should be a
> simple change to the heuristics in lexer.c.

This fixes the problem:

diff -r1.57 lexer.c
3089c3089,3091
<            javascript URL scheme which may legitimately include < and >
---
>            javascript URL scheme which may legitimately include < and >,
>            and for attributes starting with "<xml " as generated by
>            Microsoft Office.
3090a3093
> 
3092c3095,3096
<             !(IsUrl(name) && wstrncmp(lexer->lexbuf+start, "javascript:", 11) == 0))
---
>             !(IsUrl(name) && wstrncmp(lexer->lexbuf+start, "javascript:", 11) == 0) &&
>                 !(wstrncmp(lexer->lexbuf+start, "<xml ", 5) == 0))


Unfortunately since sourceforge is down I cannot upload the changes currently.

-- 
Klaus Johannes Rusch
KlausRusch@atmedia.net
http://www.atmedia.net/KlausRusch/
Received on Sunday, 6 January 2002 14:50:13 GMT

This archive was generated by hypermail 2.2.0+W3C-0.50 : Tuesday, 3 April 2012 06:13:51 GMT