Re: [Moderator Action] Tidy from Dave Raggett on 2000-03-24 (html-tidy@w3.org from January to March 2000)

From: Dave Raggett <dsr@w3.org>
Date: Fri, 24 Mar 2000 11:45:37 -0600
To: David Goudie <david.goudie@bluewin.ch>
Cc: html-tidy@w3.org
Message-ID: <OF25519CBA.B0869FB7-ON8625686D.0056F10A@rfdinc.com>

On Tue, 18 Jan 2000, David Goudie wrote:

> I have been attempting to 'decode' a large Microsoft doc file
> and fortunately have been able to duplicate the problem in a
> small test file.
>
> While I use XML and XSL as the basis for my work writing
> requirements specifications, I do not have a strong insight into
> all aspects of these concepts. However, it seems to me that the
> <o:p> - which usually appears in <o:p></o:p> pairs - should be
> passed by Tidy. As always with Microsoft data files, the
> percentage of 'junk' is large!
>
> I would consider the Microsoft files legitimate candidates for
> your program - they display correctly in IE5 - and the detection
> of errors by Tidy seems to be context sensitive. Pragmatically, I
> think that Tidy should be able to detect and 'fix' these errors.

You may want to try the newer version of Microsoft's HTML export
filter which can be downloaded from the following address:

http://officeupdate.microsoft.com/2000/downloadDetails/Msohtmf2.htm

I will look into adding support for <o:p></o:p>, but this may take
me a while due to the pressure of other work. Thanks for sending
me the test file.

Regards,

-- Dave Raggett <dsr@w3.org> http://www.w3.org/People/Raggett
tel/fax: +44 122 578 3011 (or 2521) +44 385 320 444 (mobile)
World Wide Web Consortium (on assignment from HP Labs)

Received on Friday, 24 March 2000 13:14:43 UTC