Re: Tidy and Tidy's page

From: Terry Teague <terry_teague@users.sourceforge.net>
Date: Sat, 15 Sep 2001 12:18:18 -0700
Message-Id: <l03130300b7c956fc693b@[]>
To: thecroll@mail.ru
Cc: html-tidy@w3.org

At 8:36 PM +0400 9/10/01, TheCroll [mail.ru] wrote:
>Dear Mr. Raggett!

I'm not Dave, but he forwarded your email to the html-tidy mailing list,
so I will respond here.

>There's a sentence saying:
>"Microsoft has developed its own optional filter for exporting to HTML,
>and the 2.0 version is much improved. You can download the filter free
>from the Microsoft Office Update site."
>And the link pointing to "MS Office Update site" points to
>http://officeupdate.microsoft.com/2000/downloadDetails/Msohtmf2.htm, but
>when you get there the server says the page has moved to
>So why don't you change the link to the appropriate one?

Like many other things, the Tidy web page hasn't been updated in over a
year, so I'm not surprised some links are out of date. Since the MS page
provides a redirect, I don't see that as a big problem at the moment. But
thanks for pointing it out; I will endeavour to have the link to the MS
page updated in future.

>Also, I would like to say that Tidy is a pretty good program, but IMHO you
>don't need to include the large support for slide-creating, because Tidy
>is intended to tidy HTML, not to generate slides. There are a lot of
>programs for people who want slides. But that's just my opinion.

Actually the slide support is only a very small part of the program, and
there have been numerous requests for additional features. But I don't
disagree with your opinion.

>And the last. I have discovered that tidy changes "—" sign (&mdash; or
>&#8212; - em dash, U+2014 ISOpub) to simple "-" (&minus; or &#8722; -
>minus sign, U+2212 ISOtech). I cannot change "—" to &mdash; in the source
>code, because of the problems in popular browsers, but I still need the
>long dash. Could you fix this?

This was deliberate; as you say, the entities are not well supported in
some browser versions. In the 04 Aug 00 version of Tidy you had no choice
in the matter, other than not using the "-clean" option.

But with the current version of Tidy (soon to be released), you can control
this behaviour with the config option "--ascii-chars no".
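
To make the effect of that option concrete, here is a minimal Python
sketch of the behaviour described above. It is not Tidy's source code:
the function name and the `clean` and `ascii_chars` parameters are
hypothetical stand-ins for the "-clean" option and the "ascii-chars"
setting, and it assumes the downgrade applies only when both are on.

```python
# Illustrative sketch only; this is not Tidy's source code.
EM_DASH = "\u2014"  # U+2014, the character behind &mdash; / &#8212;


def downgrade_dashes(text: str, clean: bool, ascii_chars: bool) -> str:
    """Return text with em dashes replaced by "-" when both flags are on.

    `clean` stands in for the "-clean" option and `ascii_chars` for the
    "ascii-chars" config option; both parameter names are assumptions.
    """
    if clean and ascii_chars:
        return text.replace(EM_DASH, "-")
    return text


if __name__ == "__main__":
    sample = "long" + EM_DASH + "dash"
    # Downgraded: the em dash becomes a plain hyphen.
    print(downgrade_dashes(sample, clean=True, ascii_chars=True))
    # Preserved: the em dash is kept (printed as an escape for clarity).
    print(ascii(downgrade_dashes(sample, clean=True, ascii_chars=False)))
```

With `ascii_chars=False` the sketch leaves the long dash alone, which is
the outcome you are after.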

>And will you please add the support for codepage-1251 (cyrillic windoze)?
>I suffer from this because I need to publish my docs in windows-1251, but
>tidy accepts utf-8, so I need to convert my docs in order to get
>appropriate results. Of course I would migrate to UTF-8 if my hosting will
>ever let me.
>Can I help with this anyhow?

At this stage, we don't plan to add support for additional encodings
(although we did add explicit support for Windows-1252 this time round),
but a future version of Tidy is likely to have a more general (plugin)
mechanism for specifying input and output encodings.
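
In the meantime, the conversion step you describe can be scripted outside
Tidy. The following is a rough Python sketch, not part of Tidy; the
function names are made up for illustration, and it assumes your documents
are valid windows-1251 (Python's "cp1251" codec) on the way in.

```python
# Illustrative sketch only; not part of Tidy.

def cp1251_to_utf8(data: bytes) -> bytes:
    """Decode windows-1251 bytes and re-encode them as UTF-8."""
    return data.decode("cp1251").encode("utf-8")


def utf8_to_cp1251(data: bytes) -> bytes:
    """Decode UTF-8 bytes and re-encode them as windows-1251.

    Raises UnicodeEncodeError if a character has no windows-1251 form.
    """
    return data.decode("utf-8").encode("cp1251")


if __name__ == "__main__":
    original = "\u041f\u0440\u0438\u0432\u0435\u0442".encode("cp1251")
    as_utf8 = cp1251_to_utf8(original)       # feed this to Tidy
    round_trip = utf8_to_cp1251(as_utf8)     # convert Tidy's output back
    print(round_trip == original)
```

Converting to UTF-8 before running Tidy and back to windows-1251
afterwards keeps the published files in the encoding your hosting
expects.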

If you wish to help with Tidy, I suggest you visit the following site, and
if you are able to, get involved with the development/testing/documentation
of the next and future versions of Tidy.


Hope this helps.

Regards, Terry Teague
