- From: Steinar Kjærnsrød <steinar@manamind.com>
- Date: Fri, 28 Dec 2001 15:43:09 -0500 (EST)
- To: "Sherman Mohler" <smohler@ciscolearning.org>
- Cc: <html-tidy@w3.org>
----- Original Message -----
From: "Sherman Mohler" <smohler@ciscolearning.org>
To: "Bjoern Hoehrmann" <derhoermi@gmx.net>
Cc: <html-tidy@w3.org>
Sent: Friday, December 28, 2001 7:56 PM
Subject: Re: images removed from Word-200 documents?

| ......
| 1) Allow the user to "mark up" their documents with some instructions to the
| delivery engine (ala a simple meta-language)
| 2) User outputs to HTML from Word or Powerpoint
| 3) We convert the meta-language to special HTML markup tags (for later use
| with tidy's "new tag" capability)
| 4) Use tidy to scrub the HTML output, along with the "new tags".
| 5) Grab chunks of HTML, using the "new tags" as markers.
| 6) Load the chunks of HTML into the appropriate XML for input into the
| delivery engine
|
| Six easy steps to success! Right? :-0
|

Just a few comments on steps #1 to #3 in your suggested workflow:

If your users are willing to use some constructed tag-set to mark up their
Word documents, you might instead consider this alternative approach:

- Create a set of Word paragraph styles which represent the structure of
  your documents
- Save the styles to a Word .dot file which you tell your users to install
  in their Office installation
- Tell them how to use your styleset and Word's own embedded styles to
  mark up the structure of their documents
- Tell them to save the documents to RTF, NOT to HTML!
- Get a copy of an excellent rtf2html tool (for instance
  http://www.logictran.com/) and configure it to output the HTML you want
- Proceed with your step 4 if you want to clean the HTML code any further

Some benefits of this approach:
+ the RTF creation inside Word is pretty stable and good across versions
  and platforms, HTML creation _is not_
+ the HTML creation is done centrally by you, not by the users
+ you get far better control over all parts of the documents, including
  the images

Possible downsides:
+ users must install your styles
+ Word paragraph styles are simple, and not capable of replacing/mapping
  advanced tag-sets

|
| Again, many thanks, and kudos for a great tool!
|
| -- Sherman Mohler, E-Learning Systems Architect
| Cisco Learning Institute
|
| Bjoern Hoehrmann wrote:
|
| > * Sherman Mohler wrote:
| > >This is probably a "newbie" question, but I discovered tidy while trying
| > >to figure out how to clean up HTML output from Word-200. Tidy does a
| > >great job, except for the fact that the links to images are completely
| > >removed. Am I doing something wrong, or is there an option I haven't set
| > >properly?
| >
| > Using what version of HTML Tidy? Could you please send sample code?
| > --
| > Björn Höhrmann { mailto:bjoern@hoehrmann.de } http://www.bjoernsworld.de
| > am Badedeich 7 } Telefon: +49(0)4667/981028 http://bjoern.hoehrmann.de
| > 25899 Dagebüll { PGP Pub. KeyID: 0xA4357E78 } http://www.learn.to/quote/
|
| --
|
| Sherman Mohler, E-Learning Systems Architect
| Cisco Learning Institute
| 2375 East Camelback Road, Suite 220
| Phoenix, Az 85016-3417
|
| "I see children cry, and I watch them grow
| they'll learn more than I'll ever know
| and I think to myself, what a wonderful world..."
| - Louis Armstrong
|

--
Steinar Kjærnsrød <steinar@manamind.com>
Manamind AS
http://priv.infostream.no/~steinar/
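
For what it is worth, steps 5 and 6 of the quoted workflow (grab chunks of HTML between marker tags, then load them into XML) might look roughly like the Python sketch below. It is only an illustration under stated assumptions: the marker element `chunk`, its `name` attribute, and the `lesson`/`item` wrapper elements are all made-up names, not anything defined by tidy or by the delivery engine. It also assumes the marker tag has been declared to tidy (for instance through its `new-blocklevel-tags` configuration option) so that tidy leaves it in place, and that markers are not nested.

```python
import re
from xml.sax.saxutils import escape, quoteattr

# Hypothetical marker element: <chunk name="...">...</chunk>.
# Both the tag name and the attribute are assumptions for illustration.
CHUNK_RE = re.compile(
    r'<chunk\s+name="(?P<name>[^"]*)"\s*>(?P<body>.*?)</chunk\s*>',
    re.DOTALL | re.IGNORECASE,
)


def extract_chunks(html):
    """Return (name, inner_html) pairs for each marker element, in document order.

    Nested markers are not supported; the non-greedy match stops at the
    first closing tag it finds.
    """
    return [
        (m.group("name"), m.group("body").strip())
        for m in CHUNK_RE.finditer(html)
    ]


def chunks_to_xml(chunks):
    """Wrap each chunk in a hypothetical <item> element inside <lesson>.

    The HTML is carried as escaped text so the result stays well-formed
    XML even if the HTML fragment itself is not.
    """
    items = "".join(
        "<item name=%s>%s</item>" % (quoteattr(name), escape(body))
        for name, body in chunks
    )
    return "<lesson>%s</lesson>" % items


SAMPLE = """
<body>
<chunk name="intro"><p>Hello</p></chunk>
<p>Text outside any marker is ignored.</p>
<chunk name="quiz">
  <p>Question 1</p>
</chunk>
</body>
"""

if __name__ == "__main__":
    print(chunks_to_xml(extract_chunks(SAMPLE)))
```

A regular expression is enough here only because the input has already been cleaned by tidy and the markers are flat; for nested or attribute-heavy markers a real HTML or XML parser would be the safer choice.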
Received on Monday, 31 December 2001 16:16:49 UTC