Best options for converting Word 2000 => X/HTML => XML?

From: Stuart Hungerford <stuart.hungerford@webone.com.au>
Date: Fri, 24 Mar 2000 11:45:55 -0600
To: html-tidy@w3.org
Message-ID: <OF5B90A32B.165CCC60-ON86256888.0020F756@rfdinc.com>

Hi all,

I've been experimenting with the output of Word 2000, when using the
"export to compact HTML" and "save as web page" features.

What I'd like is to end up with well-formed XML, but the tidy options
I've been using don't always give me what I'd expect.

Tidy makes a heroic effort on the giant mess Word produces, but I need
all attributes to be quoted and no repeated attributes.  For example,
Word seems to produce a lot of:

        <p class=foo1 ... class=foo2> ... </p>

Which I need as:

        <p class="foo1" class2="foo2"> ... </p>
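
To make the transformation concrete, here is a rough sketch in Python of
the rewrite I have in mind. It is not something tidy does; it uses the
standard library's html.parser, and the names AttributeFixer and
fix_attributes, as well as the "append a counter" renaming rule, are my
own assumptions for illustration. Note that it lowercases names and drops
comments, doctypes and processing instructions, so it is a starting point
rather than a complete Word cleaner.

```python
from html import escape
from html.parser import HTMLParser


class AttributeFixer(HTMLParser):
    """Re-emit markup with quoted attributes and no repeated names."""

    def __init__(self):
        # Keep entity and character references as written in the input text.
        super().__init__(convert_charrefs=False)
        self.out = []

    def _format_attrs(self, attrs):
        seen = {}
        parts = []
        for name, value in attrs:
            seen[name] = seen.get(name, 0) + 1
            # Assumed renaming rule: class, class2, class3, ...
            # (this could still collide with a real attribute named class2).
            unique = name if seen[name] == 1 else f"{name}{seen[name]}"
            # A valueless attribute (value is None) gets its name as value.
            text = name if value is None else value
            parts.append(f' {unique}="{escape(text, quote=True)}"')
        return "".join(parts)

    def handle_starttag(self, tag, attrs):
        self.out.append(f"<{tag}{self._format_attrs(attrs)}>")

    def handle_startendtag(self, tag, attrs):
        self.out.append(f"<{tag}{self._format_attrs(attrs)}/>")

    def handle_endtag(self, tag):
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append(f"&{name};")

    def handle_charref(self, name):
        self.out.append(f"&#{name};")


def fix_attributes(markup):
    """Return markup with every attribute quoted and duplicates renamed."""
    fixer = AttributeFixer()
    fixer.feed(markup)
    fixer.close()
    return "".join(fixer.out)
```

Calling fix_attributes on the Word-style paragraph above should give the
quoted, de-duplicated form. It does not close unclosed elements, so the
result is only well-formed XML if the element nesting already was.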

Has anybody else had any experiences they could share?

Stu
Received on Friday, 24 March 2000 12:47:34 GMT
