
The Evilness that is WordXP

From: Jason Manaigre <jmanaigre@iisd.ca>
Date: Tue, 1 Apr 2003 15:02:24 -0600
Message-ID: <7028701EB59F57489FBE7FD1519D97C0D72859@electron.iisd.ca>
To: <html-tidy@w3.org>


Hi guys, I know this has been covered a few times here, but now
I'm in a jam...

I normally get very messy Word files to convert into web pages...

Now the process I use is as follows:

-Receive the Word file

-Save the Word file as a web page (HTML, filtered)

-I load up http://www.textism.com/resources/cleanwordhtml/ (this site
has been a godsend) and I clean my file of Word's horrid XML or smart
tags, whatever it slaps in there...

-I then copy and paste the cleaned file into HomeSite and run Tidy on it.

This is a serious time saver, and up to about a week ago worked
flawlessly...

Now it would seem I'm losing the ' in my text, so 'Jay's
wild design' is coming out as 'Jays wild design'

This is creating havoc! Now it would seem I can't trust my system to make
accurate web pages from the doc files.
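
One thing that may be worth ruling out (this is an assumption, not a confirmed diagnosis of what changed): Word normally writes the apostrophe as a curly quote (U+2019) rather than a plain ', and a character like that can get dropped if some step in the chain mishandles the encoding. A minimal Python sketch of a pre-pass that turns Word's typographic punctuation into numeric character references before the file goes anywhere near the cleaner or Tidy. The helper name protect_word_punctuation and the table are made up for illustration; they are not part of Tidy, HomeSite, or the textism cleaner.

```python
# Hypothetical helper, not part of Tidy, HomeSite, or the textism cleaner.
# Assumption: the input has already been decoded to a Python str correctly.

# Typographic punctuation Word commonly inserts, mapped to numeric
# character references that survive as plain ASCII.
WORD_PUNCTUATION = {
    "\u2018": "&#8216;",  # left single quotation mark
    "\u2019": "&#8217;",  # right single quotation mark (Word's apostrophe)
    "\u201c": "&#8220;",  # left double quotation mark
    "\u201d": "&#8221;",  # right double quotation mark
    "\u2013": "&#8211;",  # en dash
    "\u2014": "&#8212;",  # em dash
    "\u2026": "&#8230;",  # horizontal ellipsis
}


def protect_word_punctuation(html: str) -> str:
    """Replace Word's typographic punctuation with numeric character references."""
    return "".join(WORD_PUNCTUATION.get(ch, ch) for ch in html)


if __name__ == "__main__":
    # The failing phrase from above, with Word's curly apostrophe.
    print(protect_word_punctuation("Jay\u2019s wild design"))
```

If the apostrophes survive after a pass like this, the loss is probably an encoding problem somewhere between Word and Tidy; if they still vanish, the cause is elsewhere.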

I noticed it's also truncating paragraphs with a > on the end...

So I'd get 'This is a sentence in a websi>' as opposed to the whole
paragraph of 'This is a sentence in a website about wildness'



So after blabbering on about how I do my stuff here, my question is: is
there another way to get Word XP files converted to web pages accurately
and quickly?

What is everyone else using? If I had to do these pages by hand, I'd never
have enough time... Preserving formatting like bolds and bullets and
links is a serious time saver.

Any help appreciated guys...
Received on Tuesday, 1 April 2003 16:41:24 GMT
