- From: Barry McMullin <mcmullin@eeng.dcu.ie>
- Date: Wed, 28 May 2003 09:34:50 +0100 (IST)
- To: w3c-wai-ig@w3.org
- cc: mcmullin@eeng.dcu.ie
On Tue, 27 May 2003, Matthew Smith wrote:

> Is anyone aware of a tool (preferably something that will run under a
> Unix-ish operating system) that can take the HTML created by Microsoft
> Word and turn it into clean, Accessible XHTML?

I generally ignore the native MS-HTML, but use wvWare on Linux to work directly on the original doc format file:

http://www.wvware.com/

I believe its specific translation behaviour is highly configurable; however, I usually also run the output through tidy and a Perl script to remove anything I really don't want (generally pure presentational markup). tidy can also yield XHTML, of course.

This chain works OK on simple documents, and may well work in your application. Of course, it'll work best if authors use appropriate Word style markup (headings etc.).

A local company here in Dublin doing nice work in this area is XML workshop (no, this is not a paid announcement!); they build their own tools for this purpose, but also maintain a list of tools available elsewhere:

http://www.xmlw.ie/aboutxml/word2xml.htm

They also have a recent discussion of the XML support in Word 2003:

http://www.xmlw.ie/aboutxml/word2003.htm

(Mind you, even though the company offers "accessibility" consultancy, I would not suggest that their own site is a model of best practice; it certainly appears to use a fixed-width multi-column format, at least as viewed in Opera 7...)

Best,

- Barry.

-- 
Barry McMullin http://www.eeng.dcu.ie/~mcmullin/
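
For illustration only, the last step of the chain above (removing presentational markup after wvWare and tidy have run) could be sketched roughly as follows. This is not the Perl script mentioned in the message; it is a minimal Python stand-in, and the names `PRESENTATIONAL_TAGS`, `PRESENTATIONAL_ATTRS`, `PresentationStripper` and `strip_presentation` are hypothetical, as is the choice of which tags and attributes count as presentational.

```python
from html import escape
from html.parser import HTMLParser

# Assumption: an illustrative choice of what counts as "pure presentational
# markup". Adjust both sets to suit the documents being converted.
PRESENTATIONAL_TAGS = {"font", "center", "basefont", "u"}
PRESENTATIONAL_ATTRS = {"style", "align", "valign", "bgcolor"}


class PresentationStripper(HTMLParser):
    """Re-emit the markup it is fed, minus presentational tags/attributes."""

    def __init__(self):
        # Keep entity and character references as written in the input.
        super().__init__(convert_charrefs=False)
        self.out = []

    def _format_attrs(self, attrs):
        kept = []
        for name, value in attrs:
            if name in PRESENTATIONAL_ATTRS:
                continue
            if value is None:
                kept.append(f" {name}")
            else:
                # The parser unescapes attribute values, so escape them again.
                kept.append(f' {name}="{escape(value, quote=True)}"')
        return "".join(kept)

    def handle_starttag(self, tag, attrs):
        if tag not in PRESENTATIONAL_TAGS:
            self.out.append(f"<{tag}{self._format_attrs(attrs)}>")

    def handle_startendtag(self, tag, attrs):
        if tag not in PRESENTATIONAL_TAGS:
            self.out.append(f"<{tag}{self._format_attrs(attrs)} />")

    def handle_endtag(self, tag):
        if tag not in PRESENTATIONAL_TAGS:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append(f"&{name};")

    def handle_charref(self, name):
        self.out.append(f"&#{name};")

    def handle_comment(self, data):
        self.out.append(f"<!--{data}-->")

    def handle_decl(self, decl):
        self.out.append(f"<!{decl}>")

    def handle_pi(self, data):
        self.out.append(f"<?{data}>")


def strip_presentation(html_text):
    """Return html_text with presentational tags and attributes removed."""
    parser = PresentationStripper()
    parser.feed(html_text)
    parser.close()
    return "".join(parser.out)


if __name__ == "__main__":
    sample = '<p align="center"><font face="Arial">Heading text</font></p>'
    print(strip_presentation(sample))
```

The content inside a dropped element is kept; only the wrapping tag goes. A cleanup pass like this would only make sense after tidy has produced well-formed output, since it does no repair of its own.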
Received on Wednesday, 28 May 2003 04:34:53 UTC