
Best options for converting Word 2000 => X/HTML => XML?

From: Stuart Hungerford <stuart.hungerford@webone.com.au>
Date: Thu, 17 Feb 2000 17:00:45 +1100
Message-ID: <38AB8E8D.9BCE816D@webone.com.au>
To: html-tidy@w3.org

Hi all,

I've been experimenting with the output of Word 2000, when using the
"export to compact HTML" and "save as web page" features.

What I'd like is to end up with well-formed XML, but the tidy options
I've been using don't always give me what I'd expect.

Tidy makes a heroic effort on the giant mess Word produces, but I need
all attributes to be quoted and no repeated attributes.  For example,
Word seems to produce a lot of:

        <p class=foo1 ... class=foo2> ... </p>

Which I need as:

        <p class="foo1" class2="foo2"> ... </p>
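To make the transformation above concrete, here is a minimal sketch in Python rather than a tidy option. It uses the standard library's `html.parser` to re-emit tags with every attribute quoted and repeated attribute names renamed. The names `AttributeFixer` and `fix_attributes` are hypothetical, and the numeric-suffix renaming rule (class, class2, class3, ...) is only an assumption inferred from the example; it does not check for a clash with an attribute that is already called class2, and it does not make the rest of the document well-formed.

```python
from html import escape
from html.parser import HTMLParser


class AttributeFixer(HTMLParser):
    """Re-emit markup with quoted attributes and no repeated attribute names."""

    def __init__(self):
        # convert_charrefs=False so entity references reach the handlers
        # below and can be passed through unchanged.
        super().__init__(convert_charrefs=False)
        self.out = []

    def _emit_start(self, tag, attrs, closing):
        seen = {}
        parts = [tag]
        for name, value in attrs:
            seen[name] = seen.get(name, 0) + 1
            # Assumed renaming rule: the 2nd "class" becomes "class2", etc.
            if seen[name] > 1:
                name = f"{name}{seen[name]}"
            # Attributes with no value (e.g. "checked") reuse their own name.
            if value is None:
                value = name
            # The parser hands over unescaped values, so escape them again.
            parts.append(f'{name}="{escape(value, quote=True)}"')
        self.out.append("<" + " ".join(parts) + closing)

    def handle_starttag(self, tag, attrs):
        self._emit_start(tag, attrs, ">")

    def handle_startendtag(self, tag, attrs):
        self._emit_start(tag, attrs, " />")

    def handle_endtag(self, tag):
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append(f"&{name};")

    def handle_charref(self, name):
        self.out.append(f"&#{name};")


def fix_attributes(markup):
    """Return markup with attributes quoted and duplicate names renamed."""
    fixer = AttributeFixer()
    fixer.feed(markup)
    fixer.close()
    return "".join(fixer.out)


if __name__ == "__main__":
    print(fix_attributes("<p class=foo1 class=foo2>text</p>"))
```

Comments, doctype declarations and processing instructions are dropped by this sketch, so it would need more handlers before being pointed at real Word output.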

Has anybody else had any experiences they could share?

Stu
Received on Thursday, 17 February 2000 00:59:42 GMT
