Word 2000, images, v: namespace

From: Trotter Hardy <thardy@wm.edu>
Date: Wed, 10 Apr 2002 10:38:21 -0400
To: <html-tidy@w3.org>
Message-ID: <HLEIJDGKEFLBEPJNEOHHIEGCDPAA.thardy@wm.edu>

I must be doing something wrong.

I'm trying to TIDY a Word 2000 file (saved "as web page"), and I need TIDY to
output XML. TIDY appears to get rid of a huge amount of Word 2K junk,
but not references to graphic images. These are left in the XML output file
with a namespace prefix of "v:". But TIDY doesn't put any namespace
declaration in the <html> tag, so the prefix is undeclared and the resulting
file is not well-formed with respect to XML namespaces. I have to go back and
manually change the <html> tag into
<html xmlns:v="urn:schemas-microsoft-com:vml"> and then that works.
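
For reference, the manual fix amounts to something like the Python sketch
below. It is only an illustration of the workaround, not anything TIDY does
itself: the function name add_vml_namespace is hypothetical, and it assumes
the output contains one ordinary <html ...> start tag with no ">" inside its
attribute values.

```python
import re

# Namespace declaration taken from the manual fix described above.
VML_NS = 'xmlns:v="urn:schemas-microsoft-com:vml"'


def add_vml_namespace(xml_text: str) -> str:
    """Declare the v: prefix on the first <html> start tag, if it is missing.

    Hypothetical helper: assumes a single, conventional <html ...> start tag.
    """
    match = re.search(r"<html\b[^>]*>", xml_text)
    if match is None:
        return xml_text  # no <html> start tag found; leave the text alone
    tag = match.group(0)
    if "xmlns:v=" in tag:
        return xml_text  # prefix already declared; nothing to do
    # Insert the declaration directly after "<html", keeping other attributes.
    fixed_tag = "<html " + VML_NS + tag[len("<html"):]
    return xml_text[:match.start()] + fixed_tag + xml_text[match.end():]


if __name__ == "__main__":
    sample = '<html><body><v:shape id="x"/></body></html>'
    print(add_vml_namespace(sample))
```

Running it a second time on its own output leaves the text unchanged, which
matters if the same file might pass through a batch job more than once.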

As I want to set up some batch processing routines to run automatically, I'd
really like to find some way that TIDY will output valid XML when asked to
output XML from a Word 2K document.

Am I just missing something simple?

Trotter Hardy
Received on Wednesday, 10 April 2002 10:38:32 GMT
