- From: Reitzel, Charlie <CReitzel@arrakisplanet.com>
- Date: Mon, 10 Sep 2001 13:47:04 -0400
- To: "'Matt G'" <mattg@vguild.com>, html-tidy@w3.org
Hi Matt,

I don't doubt that there are some good reasons to "tidy" up sloppy XML independently of HTML. I may or may not understand your particular application but, hey, there are lots of ways to skin a cat (poor cat). I have seen buggy web services that occasionally emit bad (not WF) XML - containing otherwise good data.

My suggestion would be to submit patches and config options to the tidy-develop list over at Source Forge. If you can't convince the group, you can always apply the patches locally.

Also, the W3C license is very flexible. You could use the current source base as the starting point of a pure XML tool.

The best course depends on how the tools evolve over time. I'd think an XML tool would want to be schema aware (DTD, XML Schema, Schematron, TREX, RELAX, RDF, etc.), whereas schemas do not capture all of the nuances of HTML that Tidy needs to handle.

Finally, I think Paul's question is more about the differences between HTML and XHTML and not about generic XML at all. I believe the answer is that, according to HTML 3.2, headings and paragraphs are both "block level" elements. Block level elements cause "paragraph breaks" and, thus, may not be nested (denoted by the element content model %text in the DTD).

take it easy,
Charlie

-----Original Message-----
From: Matt G [mailto:mattg@vguild.com]
Sent: Sunday, September 02, 2001 5:48 PM
To: Paul; html-tidy@w3.org
Subject: Re: xml versus xhtml

That was exactly the question I asked a few days ago, though phrased differently. Tidy is designed to fix HTML, not fix XML.

I do think there would be some value for an option within Tidy, or a different version of the tool (TidyXML?), that would only fix XML, ignoring HTML compliance. Fixing XML should be infinitely faster, as you can bypass all the logic of what is good HTML.

And yes, I too thought that output-xml would accomplish that, but it does not.

Matt

----- Original Message -----
From: "Paul" <valen@nic.com>
To: <html-tidy@w3.org>
Sent: Sunday, September 02, 2001 3:35 PM
Subject: xml versus xhtml

Hi All,

I am beginning to believe that when there is a discrepancy between me and Tidy, that Tidy is usually right. That being said, I'll ask anyway:

When I tidy -

    <h1>start of heading
    <p>paragraph within</p>
    end of heading
    </h1>

tidy returns

    <h1>start of heading</h1>
    <p>paragraph within</p>
    <p>end of heading</p>

This seems consistent with earlier observed tidy behavior, namely that xhtml1.0 dtd disallows <p> within <h1>. So tidy closes the <h1>, etc.

But I didn't specify output-xhtml. I specified output-xml. Isn't the input to tidy valid h1 'xml'? If so, why does tidy seem to force compliance with the xhtml dtd?

Thanks for your help.

Cordially,
Paul
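
The two claims in this thread can be checked outside of Tidy. The following is a minimal sketch, assuming Python's standard `xml.etree.ElementTree` parser as a stand-in for "generic XML" handling; it is not how Tidy works internally. The names `is_well_formed`, `nesting_violations`, `TEXT_ONLY` and `BLOCK_LEVEL` are hypothetical, and the two element sets are only an illustrative subset of the HTML 3.2 DTD. It suggests that Paul's snippet is well-formed XML, while the `<p>` inside `<h1>` breaks the HTML nesting rule Charlie describes.

```python
import xml.etree.ElementTree as ET

# Paul's input, exactly as quoted in the thread.
SNIPPET = "<h1>start of heading <p>paragraph within</p> end of heading </h1>"

# Assumption: illustrative subsets only, not the full HTML 3.2 DTD.
# Elements whose content model is %text (no block-level children allowed).
TEXT_ONLY = {"h1", "h2", "h3", "h4", "h5", "h6", "p"}
# Elements treated as block level for this check.
BLOCK_LEVEL = TEXT_ONLY | {"div", "ul", "ol", "pre", "blockquote", "table"}


def is_well_formed(markup: str) -> bool:
    """Return True if the markup parses as well-formed XML."""
    try:
        ET.fromstring(markup)
    except ET.ParseError:
        return False
    return True


def nesting_violations(markup: str) -> list[tuple[str, str]]:
    """List (parent, child) pairs where a block-level element sits
    inside an element that may only contain text-level content."""
    root = ET.fromstring(markup)
    violations = []
    for parent in root.iter():
        if parent.tag in TEXT_ONLY:
            for child in parent:
                if child.tag in BLOCK_LEVEL:
                    violations.append((parent.tag, child.tag))
    return violations


if __name__ == "__main__":
    print("well-formed XML:", is_well_formed(SNIPPET))
    print("HTML nesting violations:", nesting_violations(SNIPPET))
```

Under these assumptions the snippet passes the XML check and fails only the HTML content-model check, which matches the point made in the replies: the restructuring comes from Tidy applying HTML rules to its input, not from the choice of output format.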
Received on Monday, 10 September 2001 13:46:33 UTC