Re: XML closing tags -Automatic creation?

From: Charles Reitzel <creitzel@rcn.com>
Date: Mon, 16 Dec 2002 12:03:36 -0500
Message-Id: <4.3.2.7.2.20021216114728.02d650f8@pop.rcn.com>
To: Matthew Redden <matthewredden@yahoo.co.uk>
Cc: HTML-tidy@w3.org

Hi Matthew,

You don't say whether your tags are HTML or your own.  If they are HTML, the
answer is probably yes.  But I gather these are custom tags; otherwise you
would have just run Tidy.  You may be able to get some benefit by declaring
these tags:

http://tidy.sourceforge.net/docs/quickref.html#new-blocklevel-tags
http://tidy.sourceforge.net/docs/quickref.html#new-empty-tags
http://tidy.sourceforge.net/docs/quickref.html#new-inline-tags
http://tidy.sourceforge.net/docs/quickref.html#new-pre-tags
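
As a rough illustration, a config file using those options might look like
the fragment below.  The tag names (section, para, note, pagebreak, listing)
are made up for the example; substitute your own.

```
// tidy.conf - tag names here are hypothetical placeholders
new-blocklevel-tags: section, para
new-inline-tags: note, pagebreak
// empty tags should also be declared as inline or blocklevel
new-empty-tags: pagebreak
new-pre-tags: listing
```

You would then run something like "tidy -config tidy.conf yourfile.html".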

But I doubt Tidy will be able to do significant cleanup for you.  Inferring
the end of an element with a mixed or block content model is tough.  See
http://tidy.sf.net/bug/443678 and http://tidy.sf.net/bug/630990

Basically, when a new tag is encountered, you need to figure out if it is
OK in the current context.  If not, end the current element and start the
new one.  With nested content, you need to kick the new node "up" until it
finds a home - if any.  This requires knowledge of what is or isn't a
valid child node (either tag or text) of any given element.
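
To make the idea concrete, here is a minimal sketch in Python.  It is not
how Tidy does it; the content model table (ALLOWED_CHILDREN), the tag names
and the close_tags function are all invented for illustration.

```python
# Hypothetical content model: which children each element may contain.
# "#text" stands for character data. None of this comes from Tidy itself.
ALLOWED_CHILDREN = {
    "doc": {"section"},
    "section": {"section", "para"},
    "para": {"#text"},
}


def close_tags(tokens, allowed=ALLOWED_CHILDREN):
    """Insert inferred end tags into a stream of start tags and text.

    tokens: iterable of ("start", name) or ("text", data) pairs.
    Returns a list of markup strings including the inferred end tags.
    """
    out = []
    stack = []  # names of the currently open elements, innermost last
    for kind, value in tokens:
        child = value if kind == "start" else "#text"
        # Kick the new node "up": close open elements until one accepts it.
        while stack and child not in allowed.get(stack[-1], set()):
            out.append(f"</{stack.pop()}>")
        if kind == "start":
            # If nothing accepted it, the stack is empty and it becomes a new root.
            out.append(f"<{value}>")
            stack.append(value)
        elif stack:
            out.append(value)
        # Text that no open element accepts is silently dropped in this sketch.
    # Close whatever is still open at the end of the input.
    while stack:
        out.append(f"</{stack.pop()}>")
    return out


if __name__ == "__main__":
    sample = [
        ("start", "doc"), ("start", "section"),
        ("start", "para"), ("text", "one"),
        ("start", "para"), ("text", "two"),
    ]
    print("".join(close_tags(sample)))
```

Note the weak spot: when an element may contain itself (section inside
section here), the content model alone cannot tell a nested section from a
sibling, and this sketch always nests.  In your case the indentation could
presumably settle that, but Tidy does not look at indentation.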

If you want to hack Tidy to do this, you'll need to add some entries to the
array of tag definitions in tags.c (and to the TidyTagId enum in tidyenum.h).

<vent>
After much exposure, I don't think the SGML/XML notion of nested tags is a 
good one.  It's just too clumsy and wordy.  Simple brackets - of whatever 
kind you like - are much better.

HTML
{
   HEAD
   {
     SCRIPT src = "foo.js" type = "text/javascript";
   }
}

I think that this would take a day to get used to and would result in a 
kinder, gentler internet.  Anyway, that's an idea I have sometimes after 
extended bouts of wrestling with pointy brackets.
</vent>

hth,
Charlie



At 02:09 AM 12/15/2002 +0000, Matthew Redden wrote:

>Dear Readers,
>
>I have had a quick look through the documentation of
>HTML tidy.  However, I am not sure if HTML tidy can do
>what I require.  I have an XML like document indented
>according to node position.  There are nested groups
>of nodes in the document.  However, there are no
>closing tags in this document.  Can anyone tell me if
>there is a way that HTML tidy can place correctly
>positioned closing tags into this document.
>
>Also.  I tried to install HTML Tidy onto a Linux
>platform but the unpacked package did not respond.  I
>used the gunzip -zxvf command for unpacking the .tgz
>file.  Maybe this is not correct.
>
>Thanks in advance for your time.
>
>Matt
Received on Monday, 16 December 2002 11:59:24 UTC