W3C home > Mailing lists > Public > html-tidy@w3.org > April to June 2003

Re: Tidy kept paragraphs but lost links.

From: Richard A. O'Keefe <ok@cs.otago.ac.nz>
Date: Thu, 5 Jun 2003 15:08:38 +1200 (NZST)
Message-Id: <200306050308.h5538cxY261773@atlas.otago.ac.nz>
To: giovanni.gaio@bcb.gov.br, html-tidy@w3.org

"Giovanni Luiz Gaio" <giovanni.gaio@bcb.gov.br> wrote:
	Tidy is really wonderfull, but this time it isn't doing right.

WRONG.

The example involves
    <A ...><P>...</P></A>

But that is not legal HTML and never ever has been.

To a first approximation, HTML elements are either "block" elements
or "inline" elements.  Blocks are things like paragraphs, lists, tables,
headings.  Inline elements are things like images, text styles, and
cross references.  One important reason for being aware of the
distinction is that INLINE ELEMENTS MAY ONLY CONTAIN OTHER INLINE
ELEMENTS, NEVER BLOCKS.

"A" elements are inline elements.
"P" elements are block elements.
"A" elements MAY NOT contain "P" elements.
Ever.
Never could.

In my view, the best thing for Tidy to do would be to convert
<A x><P>y</P></A> into <P><A x>y</A></P>,
but <A x></A><P>y</P> is pretty good.
	
	***As you see, Tidy takes the paragraph out of the anchor.

Yes.  It *MUST*, if the result is to have even the slightest hope
of being even approximately legal HTML.

	It should be kept inside to preserve the link!***
	
That's why simply turning it inside out would be better.

But you can't expect Tidy to worry too much about "preserving"
what never was a legal link in the first place.  Did Tidy tell
you about the problem it was fixing?  You should expect _some_
manual cleanup after Tidy has finished, unless the source page
was almost reasonable to begin with.
Received on Wednesday, 4 June 2003 23:08:58 GMT

This archive was generated by hypermail 2.2.0+W3C-0.50 : Tuesday, 3 April 2012 06:13:54 GMT