Re: asxml produces invalid XML

* Vaclav Barta wrote:
>provede registraci online <span style=
>"FONT-SIZE: 12pt; FONT-FAMILY:" times="" mso-fareast-font-family:=
>"Times" new="" mso-ansi-language:="" mso-fareast-language:=""
>mso-bidi-language:=""><a href=
>"http://www.ibm.com/services/servicepac"><strong>na
>adrese</strong></a></span>
></body>
></html>
>
>which obviously not only isn't valid XHTML (Tidy knows that and warns about
>proprietary attributes, yet still insists on the doctype and namespace
>declarations), but isn't even XML: some synthesized attributes end with a
>colon.

This is actually allowed by XML 1.0 itself: a colon is a legal name
character, so an attribute name ending in one is still well-formed. It
is only the Namespaces in XML Recommendation that rejects such names.
You may be able to turn off namespace support in your parser and then
strip or ignore the attributes. You can also use the
--drop-proprietary-attributes option (or whatever it is called) to have
Tidy drop them, along with other proprietary attributes. Beyond that,
Tidy does not have many choices for producing better-formed XML here; it
could only strip the attributes. Perhaps that merits a configuration
option, though.
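
To illustrate the difference, here is a minimal sketch using Python's
standard expat binding. It parses a cut-down fragment once without and
once with namespace processing, then shows one way to ignore the
offending attributes. The SAMPLE string and the helper names parses()
and usable_attributes() are made up for this example; they are not part
of Tidy or of the original report.

```python
from xml.parsers import expat

# Hypothetical, shortened stand-in for the Tidy output quoted above.
SAMPLE = (
    '<p><span style="FONT-SIZE: 12pt" mso-ansi-language:="">'
    'na adrese</span></p>'
)


def parses(xml_text, namespace_aware):
    """Return True if expat accepts xml_text in the requested mode."""
    # Passing a namespace_separator switches expat's namespace processing on.
    separator = " " if namespace_aware else None
    parser = expat.ParserCreate(namespace_separator=separator)
    try:
        parser.Parse(xml_text, True)
    except expat.ExpatError:
        return False
    return True


def usable_attributes(xml_text):
    """Parse without namespace support and skip names ending in a colon."""
    found = []

    def start_element(name, attrs):
        # Keep only attributes a namespace-aware consumer could accept.
        kept = {k: v for k, v in attrs.items() if not k.endswith(":")}
        found.append((name, kept))

    parser = expat.ParserCreate()  # namespace processing stays off
    parser.StartElementHandler = start_element
    parser.Parse(xml_text, True)
    return found


if __name__ == "__main__":
    print("plain XML 1.0 parse ok:", parses(SAMPLE, namespace_aware=False))
    print("namespace-aware parse ok:", parses(SAMPLE, namespace_aware=True))
    print(usable_attributes(SAMPLE))
```

Other parsers expose the namespace switch differently, so treat this as
a demonstration of the idea rather than a recipe for any particular
toolchain.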
-- 
Björn Höhrmann · mailto:bjoern@hoehrmann.de · http://bjoern.hoehrmann.de
Weinh. Str. 22 · Telefon: +49(0)621/4309674 · http://www.bjoernsworld.de
68309 Mannheim · PGP Pub. KeyID: 0xA4357E78 · http://www.websitedev.de/ 

Received on Monday, 23 June 2008 16:20:37 UTC