RE: Tidy's handling of <noscript>

This is a known feature and problem of Tidy.  This basic technique of
inserting parent end tags when an illegal child begin tag is encountered
solves the bulk of markup problems in an efficient and elegant manner.

It doesn't always do what you'd like, however.  In particular, it doesn't
handle overlapping inline tags the way browsers do:

E.g.
The <i>quick <b>brown</i> fox</b> jumped ...

becomes:
The <i>quick</i> <b>brown fox</b> jumped ...

rather than (what most browsers do, in effect):
The <i>quick <b>brown</b></i> <b>fox</b> jumped ...

Thanks to Björn and Terry, the problem was solved for headers <h1><h2>...
and anchor tags <a>, both of which get special treatment.  But it still pops
up in unexpected places like this one.

Unfortunately, the solution requires knowledge about HTML content models,
which is pretty good but not comprehensive within Tidy - and has been the
subject of several discussions on the development list.

take it easy,
Charlie

-----Original Message-----
From: Valen [mailto:valen@nic.com]
Sent: Monday, August 20, 2001 9:45 AM
To: html-tidy@w3.org
Cc: Fred Bone
Subject: Re: Tidy's handling of <noscript>


Hi All,

Thanks for the several replies.

It is not my intent to disparage Tidy; it is truly a swell piece of work.  I
am simply inquiring about a particular behavior.  I guess I had in mind an
alternate approach.  It's appropriate that <noscript> or any other element
that ought not occur in <head> be placed in <body>. But why should the
head-licit <script> tag suffer the same fate? To put it differently, why not
process all <head> tags, moving to <body> those that are illicit while
leaving in place those that are licit?

Cordially,

Paul

-- Original Message --
From: Fred Bone <Fred.Bone@dial.pipex.com>
To: html-tidy@w3.org
Send: 2001-08-19
Subject: Re: Tidy's handling of <noscript>

On 18 Aug 2001, at 19:23, Paul wrote: 

> My HTML contains a sequence like this:
> 
> <html>
> <head>
> 
> <script>
> <!--
>   some script statements here
> // stop hiding -->
> </script>
> 
> <noscript>
> show this stuff when scripting not enabled
> </noscript>
> 
> </head>
> 
> <body> ....
> 
> Tidy's output moves the <noscript> block to within <body>;
specifically as
> the very first thing after <body>.

Seems the best place for it. Where were you expecting it to go? 

What it's actually doing is: 1. finding <noscript> outside of <body>

2. inserting </head><body> so <noscript> is in the right place

3. deleting the now-superfluous <body> you supplied 


> When I move the <noscript> block before the <script> block and
tidy again,
> the tidy output now has BOTH
> the <noscript> and the <script> block within <body>;
neither is in <head>
> any longer.
> 
> Can anyone shed some light on this behavior? -v shows 30th April,
2000

The explanation above covers this case too: Tidy isn't "moving" 
anything, it's inserting the missing <body> prior to <noscript>.









___________________________________
NOCC, http://nocc.sourceforge.net

Received on Monday, 20 August 2001 14:24:34 UTC