- From: Vaclav Barta <vbar@comp.cz>
- Date: Mon, 23 Jun 2008 19:07:39 +0200
- To: Bjoern Hoehrmann <derhoermi@gmx.net>
- Cc: html-tidy@w3.org
On Monday 23 June 2008 18:19:56 Bjoern Hoehrmann wrote:
> * Vaclav Barta wrote:
> >which obviously not only isn't valid XHTML (and tidy knows that, warns
> > about proprietary attributes yet insists on the doctype and namespace
> >declarations), but isn't even XML - some synthetised attributes end with a
> >colon.
>
> This is actually allowed, it's only the Namespaces in XML Recommendation
> that considers this malformed.

Well yes, technically, but since XHTML does use namespaces, shouldn't tidy
follow the recommendation?

> You may be able to turn namespace support
> off in your parser and strip the attributes, or ignore them. Further,

Not easily - in fact, I don't think I've ever used an XML library with
configurable namespace support. I don't doubt they exist, but I don't think
they're all that popular - most XML processing is AFAIK namespace-aware
these days...

> you can use the --drop-proprietary-attributes (or whatever is called)
> option to drop them (and other attributes). Other than that Tidy has not

Well, I don't really want to drop all proprietary attributes - just the
unparseable ones... I think the general problem is that HTML Tidy is meant
to produce documents with standardized semantics, while I just want the XML
syntax from it (as input for capturing site-specific semantics later) -
maybe I'm using the wrong tool...

> so many choices here to produce better-formed XML, it could only strip
> the attributes. Perhaps that merits some configuration option though.

I think asxml should be a totally different option from asxhtml - while
XHTML certainly is XML, tidy seems to assume the opposite as well...

	Bye
		Vasek
-- 
http://www.mangrove.cz/ Open Source integration
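
A minimal sketch of the "strip only the unparseable attributes" idea discussed above, assuming Python and its standard xml.parsers.expat module, which leaves namespace processing off unless a namespace separator is passed. The function name strip_bad_attributes, the helper is_qname, and the sample attribute name foo: are hypothetical and do not come from the thread or from Tidy. The QName check is simplified (at most one colon, no empty part), and the re-serialization drops comments, the doctype and processing instructions, so this only illustrates the approach:

```python
import xml.parsers.expat
from xml.sax.saxutils import escape, quoteattr


def is_qname(name):
    # Simplified check (assumption): at most one colon, and no empty
    # prefix or local part. Rejects names such as "foo:" or "a:b:c".
    parts = name.split(":")
    return len(parts) <= 2 and all(parts)


def strip_bad_attributes(xml_text):
    """Re-serialize xml_text, dropping attributes whose names are not QNames."""
    out = []
    # No namespace_separator argument: expat parses as plain XML 1.0,
    # so an attribute name ending in a colon is accepted here.
    parser = xml.parsers.expat.ParserCreate()

    def start(name, attrs):
        kept = "".join(
            f" {key}={quoteattr(value)}"
            for key, value in attrs.items()
            if is_qname(key)
        )
        out.append(f"<{name}{kept}>")

    def end(name):
        out.append(f"</{name}>")

    def chars(data):
        out.append(escape(data))

    parser.StartElementHandler = start
    parser.EndElementHandler = end
    parser.CharacterDataHandler = chars
    parser.Parse(xml_text, True)
    return "".join(out)
```

The result can then be handed to a namespace-aware parser, for example xml.etree.ElementTree.fromstring(strip_bad_attributes(text)), while well-formed proprietary attributes are kept.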
Received on Monday, 23 June 2008 17:08:18 UTC