Re: [Moderator Action] Tidy from Dave Raggett on 2000-03-24 (html-tidy@w3.org from January to March 2000)

From: Dave Raggett <dsr@w3.org>
Date: Fri, 24 Mar 2000 11:45:37 -0600
To: David Goudie <david.goudie@bluewin.ch>
Cc: html-tidy@w3.org
Message-ID: <OF70CAEBDA.7B03755F-ON8625686D.005836E0@rfdinc.com>

On Tue, 18 Jan 2000, David Goudie wrote:

> I have been attempting to 'decode' a large microsoft doc file
> and fortunately have been able to duplicate the problem in a
> small test file.
>
> While I use xml and xsl as the basis for my work writing
> requirements specifications I do not have a strong insight into
> all aspects of these concepts. However, it seems to me that the
> <o:p> - which usually appears in <o:p></o:p> pairs - should be
> passed by Tidy. As always with Microsoft data files the
> percentage of 'junk' is large!

I just tried your test file and discovered that when you save
the file as "Web page" (not using the HTML export filter), all
of the <o:p> elements are empty. I am therefore curious as to why
you want these preserved. For instance, at the end of the file
there is:

<p class=MsoNormal><b><span lang=EN-AU><![if<![endif]><o:p></o:p></span></b></p>

Which is essentially an empty paragraph and a pretty poor way to
add vertical whitespace.
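
As a rough illustration of the point about empty <o:p> elements, here is a
minimal Python sketch. It is not how Tidy itself works; the helper names and
the regular expression are assumptions made up for this example, and the
sample string is a simplified version of the markup quoted above. It removes
<o:p></o:p> pairs that contain nothing but whitespace and leaves any <o:p>
element with real content untouched.

```python
import re

# Hypothetical helper, not Tidy's implementation: matches an <o:p>
# element whose content is empty or whitespace only.
EMPTY_OP = re.compile(r"<o:p>\s*</o:p>", re.IGNORECASE)


def strip_empty_op(markup: str) -> str:
    """Return markup with empty <o:p></o:p> pairs removed."""
    return EMPTY_OP.sub("", markup)


def count_nonempty_op(markup: str) -> int:
    """Count the <o:p> start tags that remain after empty pairs are removed."""
    return len(re.findall(r"<o:p>", strip_empty_op(markup), re.IGNORECASE))


# Simplified sample based on the Word output discussed in this message.
sample = "<p class=MsoNormal><b><span lang=EN-AU><o:p></o:p></span></b></p>"
print(strip_empty_op(sample))
```

If count_nonempty_op returns 0 for a whole document, every <o:p> element in
it was empty, which is the situation described for the "Web page" export.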

Regards,

-- Dave Raggett <dsr@w3.org> http://www.w3.org/People/Raggett
tel/fax: +44 122 578 3011 (or 2521) +44 385 320 444 (mobile)
World Wide Web Consortium (on assignment from HP Labs)

Received on Friday, 24 March 2000 13:14:41 UTC