W3C home > Mailing lists > Public > html-tidy@w3.org > January to March 2000

Re: extra tags in output

From: P. T. Rourke <ptrourke@mediaone.net>
Date: Fri, 24 Mar 2000 11:45:36 -0600
To: "Peter Levine" <plevine@intraware.com>, <html-tidy@w3.org>
Message-ID: <OFD0B7F859.633A283C-ON8625686C.00649231@rfdinc.com>

It is, after all, called "HTML-TIDY." My guess is that the program assumes
when you ask for XML that you want a *web page* in XML rather than a db for
example; and most web browsers expect the <html>, <head>, <title> and
<body>
tags (and their closing equivalents), so it puts them in.

I could be wrong, of course . . .

I'd use TIDY as my FIRST cleanup step.

PTRourke

> Hi,
>
> When I set output-xml: yes why does the output include <html>, <head>,
> <title> and <body> tags when my original file doesn't include these
> tags?
>
> I'm using tidy as a last cleanup step after stripping those tags from an
> HTML file. The idea is to get my 'almost' XML' file cleaned up by tidy
> before presenting it to an  XML parser.
>
> TIA,
> Pete
>
> Peter Levine
> Senior Software Engineer
> plevine@intraware.com   http://www.intraware.com
> phone: (925) 253-6658   fax: (925) 253-4599
>
> Intraware...Control Your Technology
>
>
Received on Friday, 24 March 2000 13:14:26 GMT

This archive was generated by hypermail 2.2.0+W3C-0.50 : Tuesday, 3 April 2012 06:13:43 GMT