
Re: asxml produces invalid XML

From: Bjoern Hoehrmann <derhoermi@gmx.net>
Date: Mon, 23 Jun 2008 18:19:56 +0200
To: Vaclav Barta <vbar@comp.cz>
Cc: html-tidy@w3.org
Message-ID: <j3jv54dl7v9t5svt5esuhaqn7fr8n4k6q7@hive.bjoern.hoehrmann.de>

* Vaclav Barta wrote:
>provede registraci online <span style=
>"FONT-SIZE: 12pt; FONT-FAMILY:" times="" mso-fareast-font-family:=
>"Times" new="" mso-ansi-language:="" mso-fareast-language:=""
>mso-bidi-language:=""><a href=
>"http://www.ibm.com/services/servicepac"><strong>na
>adrese</strong></a></span>
></body>
></html>
>
>which obviously not only isn't valid XHTML (and tidy knows that, warns about 
>proprietary attributes yet insists on the doctype and namespace 
>declarations), but isn't even XML - some synthesised attributes end with a
>colon.

This is actually allowed by XML 1.0 itself; it is only the Namespaces in
XML Recommendation that considers such names malformed. You may be able
to turn namespace support off in your parser and then strip the
attributes, or ignore them. Further, you can use the
--drop-proprietary-attributes option (or whatever it is called) to drop
them (and other attributes). Other than that, Tidy does not have many
choices here to produce better-formed XML; it could only strip the
attributes. Perhaps that merits some configuration option though.
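
To illustrate the distinction, here is a rough Python sketch of the two
name checks involved. It is not what Tidy or any particular parser does
internally. The regular expressions are simplified, ASCII-only
approximations of the Name production in XML 1.0 and the NCName/QName
productions in Namespaces in XML, and the helper names (is_xml_name,
is_qname, classify) are made up for this example.

```python
import re

# Simplified, ASCII-only approximations of the name productions.
# The real productions in XML 1.0 and Namespaces in XML also admit
# many non-ASCII characters, which are ignored here.
XML_NAME = re.compile(r"[A-Za-z_:][A-Za-z0-9._:-]*\Z")
NCNAME = re.compile(r"[A-Za-z_][A-Za-z0-9._-]*\Z")


def is_xml_name(name):
    """True if name matches the (simplified) XML 1.0 Name production."""
    return XML_NAME.match(name) is not None


def is_qname(name):
    """True if name is NCName or NCName ':' NCName (simplified QName)."""
    parts = name.split(":")
    return len(parts) in (1, 2) and all(NCNAME.match(p) for p in parts)


def classify(name):
    """Classify an attribute name (hypothetical helper for illustration)."""
    if not is_xml_name(name):
        return "not an XML name"
    if not is_qname(name):
        return "XML name, but not a QName"
    return "QName"


if __name__ == "__main__":
    for attr in ("style", "times", "mso-fareast-font-family:", "xml:lang"):
        print(attr, "->", classify(attr))
```

Under these simplified rules, a name with a trailing colon such as
"mso-fareast-font-family:" passes the plain XML name check but fails the
QName check, which is why a namespace-aware parser objects to it while a
parser with namespace processing turned off may not.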
-- 
Björn Höhrmann · mailto:bjoern@hoehrmann.de · http://bjoern.hoehrmann.de
Weinh. Str. 22 · Telefon: +49(0)621/4309674 · http://www.bjoernsworld.de
68309 Mannheim · PGP Pub. KeyID: 0xA4357E78 · http://www.websitedev.de/ 
Received on Monday, 23 June 2008 16:20:37 GMT
