Best options for converting Word 2000 => X/HTML => XML?

From: Stuart Hungerford <stuart.hungerford@webone.com.au>
Date: Thu, 17 Feb 2000 17:00:45 +1100
Message-ID: <38AB8E8D.9BCE816D@webone.com.au>
To: html-tidy@w3.org

Hi all,

I've been experimenting with the output of Word 2000, when using the
"export to compact HTML" and "save as web page" features.

What I'd like is to end up with well-formed XML, but the tidy options
I've been using don't always give me what I'd expect.

Tidy makes a heroic effort on the giant mess Word produces, but I need
all attributes to be quoted and no repeated attributes.  For example,
the output seems to contain a lot of:

        <p class=foo1 ... class=foo2> ... </p>

Which I need as:

        <p class="foo1" class2="foo2"> ... </p>
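
One way to sketch that rewrite as a post-processing pass, assuming
Python's standard html.parser module is acceptable for the job. This is
not a Tidy feature; the names AttributeFixer and fix_attributes are
hypothetical, and the sketch drops comments, doctypes and processing
instructions and does not check whether a renamed attribute (class2)
collides with one that already exists:

```python
from html.parser import HTMLParser
from xml.sax.saxutils import quoteattr


class AttributeFixer(HTMLParser):
    """Re-emit markup with every attribute quoted and repeats renamed."""

    def __init__(self):
        # Keep entity and character references as written in text content.
        super().__init__(convert_charrefs=False)
        self.out = []

    def _attrs(self, attrs):
        seen = {}
        parts = []
        for name, value in attrs:
            seen[name] = seen.get(name, 0) + 1
            # Second and later occurrences become name2, name3, ...
            unique = name if seen[name] == 1 else f"{name}{seen[name]}"
            # A valueless attribute (value is None) becomes name="name".
            text = name if value is None else value
            parts.append(f" {unique}={quoteattr(text)}")
        return "".join(parts)

    def handle_starttag(self, tag, attrs):
        self.out.append(f"<{tag}{self._attrs(attrs)}>")

    def handle_startendtag(self, tag, attrs):
        self.out.append(f"<{tag}{self._attrs(attrs)}/>")

    def handle_endtag(self, tag):
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append(f"&{name};")

    def handle_charref(self, name):
        self.out.append(f"&#{name};")


def fix_attributes(html):
    fixer = AttributeFixer()
    fixer.feed(html)
    fixer.close()
    return "".join(fixer.out)


print(fix_attributes("<p class=foo1 align=left class=foo2>text</p>"))
```

Renaming the repeats keeps both values, which is what the example above
asks for; dropping all but the first or last occurrence would be the
other obvious choice.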

Has anybody else had any experiences they could share?

Received on Thursday, 17 February 2000 00:59:42 UTC
