- From: Charles Reitzel <creitzel@rcn.com>
- Date: Mon, 16 Dec 2002 12:03:36 -0500
- To: Matthew Redden <matthewredden@yahoo.co.uk>
- Cc: HTML-tidy@w3.org
Hi Matthew,

You don't say if your tags are HTML or your own. If they are HTML, the answer is probably yes. But I gather these are custom tags, because otherwise you would have just run Tidy.

You may be able to get some benefit by declaring these tags:

http://tidy.sourceforge.net/docs/quickref.html#new-blocklevel-tags
http://tidy.sourceforge.net/docs/quickref.html#new-empty-tags
http://tidy.sourceforge.net/docs/quickref.html#new-inline-tags
http://tidy.sourceforge.net/docs/quickref.html#new-pre-tags

But I doubt it will be able to do significant cleanup for you. Inferring the end of a tag with a mixed or block content model is tough. See http://tidy.sf.net/bug/443678 and http://tidy.sf.net/bug/630990

Basically, when a new tag is encountered, you need to figure out if it is OK in the current context. If not, end the current element and start a new one. With nested content, you need to kick the new node "up" until it finds a home - if any. This requires knowledge of what is or isn't a valid child node (either tag or text) of any given element.

If you want to hack Tidy to do this, you'll need to add some entries to the array of tag definitions in tags.c (also to the TidyTagId enum in tidyenum.h).

<vent>
After much exposure, I don't think the SGML/XML notion of nested tags is a good one. It's just too clumsy and wordy. Simple brackets - of whatever kind you like - are much better.

    HTML {
      HEAD {
        SCRIPT src = "foo.js" type = "text/javascript";
      }
    }

I think that this would take a day to get used to and would result in a kinder, gentler internet. Anyway, that's an idea I have sometimes after extended bouts of wrestling with pointy brackets.
</vent>

hth,
Charlie

At 02:09 AM 12/15/2002 +0000, Matthew Redden wrote:
>Dear Readers,
>
>I have had a quick look through the documentation of
>HTML tidy. However, I am not sure if HTML tidy can do
>what I require. I have an XML-like document indented
>according to node position. There are nested groups
>of nodes in the document. However, there are no
>closing tags in this document. Can anyone tell me if
>there is a way that HTML tidy can place correctly
>positioned closing tags into this document.
>
>Also, I tried to install HTML Tidy onto a Linux
>platform but the unpacked package did not respond. I
>used the gunzip -zxvf command for unpacking the .tgz
>file. Maybe this is not correct.
>
>Thanks in advance for your time.
>
>Matt
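The end-tag inference described in the reply (check whether a new tag is valid in the current context, otherwise close elements until it finds a home) can be illustrated with a small stack-based sketch. This is not Tidy's code. The tag names, the ALLOWED_CHILDREN table, and the infer_end_tags function are all hypothetical, chosen only to show the idea under the assumption that a content model is known for every element.

```python
# Hypothetical content model: which child tags each element may contain.
# A real tool would need a table like this for every custom tag.
ALLOWED_CHILDREN = {
    "doc": {"section"},
    "section": {"para"},
    "para": {"em"},
    "em": set(),
}


def infer_end_tags(tags, allowed=ALLOWED_CHILDREN):
    """Given start tags in document order, return start and end tags
    with the implied end tags inserted."""
    out = []
    stack = []  # currently open elements, outermost first
    for tag in tags:
        # Kick the new node "up": close open elements until one of them
        # accepts the new tag as a child. If none does, the stack empties
        # and the tag is emitted at the top level.
        while stack and tag not in allowed.get(stack[-1], set()):
            out.append(f"</{stack.pop()}>")
        stack.append(tag)
        out.append(f"<{tag}>")
    # Close whatever is still open at the end of the input.
    while stack:
        out.append(f"</{stack.pop()}>")
    return out


if __name__ == "__main__":
    print("".join(infer_end_tags(["doc", "section", "para", "em", "para", "section"])))
```

Note that this only works when the content model is unambiguous. If an element may contain itself (a section inside a section, say), the table alone cannot tell a nested element from a sibling, which is part of why this kind of inference is hard in general.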
Received on Monday, 16 December 2002 11:59:24 UTC