Re: [Moderator Action] Tidy from Dave Raggett on 2000-01-21 (html-tidy@w3.org from January to March 2000)

From: Dave Raggett <dsr@w3.org>
Date: Fri, 21 Jan 2000 15:56:11 +0000 (GMT Standard Time)
To: David Goudie <david.goudie@bluewin.ch>
cc: html-tidy@w3.org
Message-ID: <Pine.WNT.4.10.10001211548580.300-100000@OEMCOMPUTER>

On Tue, 18 Jan 2000, David Goudie wrote:

> I have been attempting to 'decode' a large Microsoft doc file
> and fortunately have been able to duplicate the problem in a
> small test file.
> 
> While I use XML and XSL as the basis for my work writing
> requirements specifications, I do not have a strong insight into
> all aspects of these concepts. However, it seems to me that the
> <o:p> - which usually appears in <o:p></o:p> pairs - should be
> passed through by Tidy. As always with Microsoft data files, the
> percentage of 'junk' is large!

I just tried your test file and found that when you save
the file as "Web page" (not using the HTML export filter), all
of the <o:p> elements are empty. I am therefore curious as to why
you want these preserved. For instance, at the end of the file
there is:

<p class=MsoNormal><b><span lang=EN-AU><![if
!supportEmptyParas]>&nbsp;<![endif]><o:p></o:p></span></b></p>

This is essentially an empty paragraph, and a pretty poor way to
add vertical whitespace.
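
As a rough illustration only (this is not how Tidy itself handles
these elements), a few lines of Python can check whether every <o:p>
element in a saved file is empty and drop the empty pairs. The
function name strip_empty_op and both regular expressions are
assumptions made up for this sketch:

```python
import re

# An <o:p> element that contains nothing but optional whitespace.
EMPTY_OP = re.compile(r"<o:p>\s*</o:p>", re.IGNORECASE)
# Any <o:p> start tag, used to count elements that were NOT empty.
ANY_OP = re.compile(r"<o:p\b[^>]*>", re.IGNORECASE)


def strip_empty_op(html):
    """Remove empty <o:p></o:p> pairs (hypothetical helper).

    Returns (cleaned_html, removed_count, remaining_count), where
    remaining_count is the number of <o:p> start tags left behind,
    i.e. elements that actually carried content.
    """
    cleaned, removed = EMPTY_OP.subn("", html)
    remaining = len(ANY_OP.findall(cleaned))
    return cleaned, removed, remaining


# The fragment quoted above from the end of the test file.
sample = (
    "<p class=MsoNormal><b><span lang=EN-AU><![if\n"
    "!supportEmptyParas]>&nbsp;<![endif]><o:p></o:p></span></b></p>"
)

cleaned, removed, remaining = strip_empty_op(sample)
print(removed, remaining)  # prints "1 0"
```

If remaining comes back as zero for the whole file, nothing is lost
by discarding the <o:p> elements. A regular expression is a blunt
tool for markup, so treat this as a quick check rather than a parser.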

Regards,

-- Dave Raggett <dsr@w3.org> http://www.w3.org/People/Raggett
tel/fax: +44 122 578 3011 (or 2521) +44 385 320 444 (mobile)
World Wide Web Consortium (on assignment from HP Labs)

Received on Friday, 21 January 2000 10:56:15 UTC