- From: Huw Wyn Jones <huw@pioden.net>
- Date: Mon, 11 Apr 2005 12:44:20 +0100
- To: html-tidy@w3.org
** This is a repost to see if I get a response! :) **

Hi everyone,

I have Tidy set up to 'tidy' HTML input by clients. More often than not, the clients paste in 'HTML' generated by Word, which I'm trying to strip the junk out of. Example below:

<P class=MsoNormal style="MARGIN: 0cm 0cm 0pt; TEXT-INDENT: 36pt"><B style="mso-bidi-font-weight: normal"><SPAN lang=EN-GB style="FONT-SIZE: 11pt; COLOR: #006600; FONT-FAMILY: 'Trebuchet MS'; mso-bidi-font-size: 10.0pt">MENTER IAITH ABERTAWE<?xml:namespace prefix = o ns = "urn:schemas-microsoft-com:office:office" /><o:p></o:p></SPAN></B></P>
<P class=MsoNormal style="MARGIN: 0cm 0cm 0pt; TEXT-INDENT: 36pt"><SPAN lang=EN-GB style="FONT-SIZE: 11pt; FONT-FAMILY: 'Trebuchet MS'; mso-bidi-font-weight: bold; mso-bidi-font-size: 10.0pt">SIWAN THOMAS, Field Officer<o:p></o:p></SPAN></P>

My Tidy config file is as follows:

bare: yes
clean: yes
doctype: omit
drop-empty-paras: yes
drop-proprietary-attributes: yes
enclose-text: yes
escape-cdata: yes
fix-backslash: yes
join-styles: yes
logical-emphasis: yes
lower-literals: yes
output-xhtml: yes
show-body-only: yes
word-2000: yes
indent: yes
output-encoding: utf8
force-output: yes
quiet: yes
write-back: yes

Is there anything else I can add to strip out the Word cr*p? I thought that Tidy would have a greater impact than it's having now :(

TIA

Huw

--
===============================
Huw Wyn Jones
Technical Director
Pioden Rhyngweithiol
106-108 Stryd Fawr
Bangor
Gwynedd
LL57 1NS
Phone: 01248 364970 or 01248 354626
E-mail: huw@pioden.net
WWW: http://www.pioden.net
===============================
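To make the "junk" in the example concrete: the Word-only pieces are the <?xml:namespace ...> instruction, the <o:p> tags, the class=Mso... attributes, and the mso-* style declarations. One possible approach, sketched here purely as an assumption and not as a Tidy feature, is a small regex pre-filter run on the pasted markup before Tidy sees it. The function name strip_word_markup and every pattern below are hypothetical, and regexes on HTML are fragile (they could also hit body text that happens to contain "mso-...:"), so treat this as an illustration only.

```python
import re

# Hypothetical pre-filter (not part of Tidy): removes Word-only markup
# of the kinds seen in the example above. All patterns are assumptions.
XML_PI = re.compile(r"<\?xml[^>]*>", re.IGNORECASE)            # <?xml:namespace ... />
OFFICE_TAG = re.compile(r"</?[a-z]+:[^>]*>", re.IGNORECASE)    # <o:p>, </o:p>
MSO_CLASS = re.compile(r"\s+class=(\"?)Mso\w*\1", re.IGNORECASE)
MSO_DECL = re.compile(r"\s*mso-[\w-]+\s*:[^;\"']*;?", re.IGNORECASE)
EMPTY_STYLE = re.compile(r"\s+style=([\"'])\s*\1", re.IGNORECASE)


def strip_word_markup(html: str) -> str:
    """Return html with common Word-specific markup removed."""
    html = XML_PI.sub("", html)       # drop the namespace instruction
    html = OFFICE_TAG.sub("", html)   # drop namespaced tags such as <o:p>
    html = MSO_CLASS.sub("", html)    # drop class=MsoNormal and similar
    html = MSO_DECL.sub("", html)     # drop mso-* declarations in style values
    html = EMPTY_STYLE.sub("", html)  # drop style attributes left empty
    return html


if __name__ == "__main__":
    sample = (
        '<P class=MsoNormal style="MARGIN: 0cm 0cm 0pt">'
        '<B style="mso-bidi-font-weight: normal">'
        '<SPAN lang=EN-GB style="FONT-SIZE: 11pt; mso-bidi-font-size: 10.0pt">'
        'SIWAN THOMAS, Field Officer<o:p></o:p></SPAN></B></P>'
    )
    print(strip_word_markup(sample))
```

The ordinary CSS (MARGIN, FONT-SIZE and so on) is deliberately left in place, so that Tidy's own clean and join-styles handling can still deal with it afterwards.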
Received on Monday, 11 April 2005 11:44:58 UTC