Re: output-xhtml bug

From: Martin-Éric Racine <q-funk@pp.fishpool.fi>
Date: Sun, 15 Jul 2001 07:40:08 -0400 (EDT)
To: Bjoern Hoehrmann <derhoermi@gmx.net>
Cc: html-tidy@w3.org
Message-ID: <Pine.LNX.4.21.0107151351190.28458-400000@hal.pp.fishpool.fi>
On Sun, 15 Jul 2001, Bjoern Hoehrmann wrote:

> * Martin-Éric Racine wrote:
> >Whenever "output-xhtml" is on, every document is stamped as
> >"XHTML Transitional" regardless of what the source was. This
> >means that documents created as XHTML "Strict" or "Frameset"
> >systematically get stamped as "Transitional".
> Could you please provide us with a test case? The Bug entry is at
> https://sourceforge.net/tracker/index.php?func=detail&aid=435769&group_id=27659&atid=390963
> We weren't able to reproduce it.

I believe somebody later provided the answer to this problem,
after looking at the sources, and was also able to reproduce it.

I just tried again, with the sample documents I enclosed. What I
get when .tidyrc has "output-xhtml: yes":

On a frame document:
Tidy (vers 4th August 2000) Parsing "frame.html"

frame.html: Doctype given is "-//W3C//DTD XHTML 1.0 Transitional//EN"
frame.html: Document content looks like XHTML 1.0 Frameset
no warnings or errors were found

As you will see for the enclosed sample, the original document
indeed starts off as a Frameset.
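
For anyone who wants to check their own files for the same symptom,
here is a minimal sketch in Python. It is not part of Tidy; the
function names are hypothetical, and it assumes the document uses a
PUBLIC doctype with one of the four identifiers listed. It only
compares the declared doctype before and after a Tidy run.

```python
import re

# Public identifiers for the XHTML doctypes discussed in this thread.
XHTML_FPIS = {
    "-//W3C//DTD XHTML 1.0 Strict//EN": "Strict",
    "-//W3C//DTD XHTML 1.0 Transitional//EN": "Transitional",
    "-//W3C//DTD XHTML 1.0 Frameset//EN": "Frameset",
    "-//W3C//DTD XHTML Basic 1.0//EN": "Basic",
}

# Matches <!DOCTYPE html PUBLIC "..."> and captures the public identifier.
DOCTYPE_RE = re.compile(r'<!DOCTYPE\s+html\s+PUBLIC\s+"([^"]+)"', re.IGNORECASE)


def declared_flavour(markup):
    """Return the XHTML flavour named in the DOCTYPE, or None if unrecognised."""
    match = DOCTYPE_RE.search(markup)
    if match is None:
        return None
    return XHTML_FPIS.get(match.group(1))


def doctype_preserved(before, after):
    """True when both documents declare the same recognised flavour."""
    flavour = declared_flavour(before)
    return flavour is not None and flavour == declared_flavour(after)
```

Feeding it the source of frame.html as "before" and Tidy's output as
"after" would flag the case above, where a Frameset document comes
back stamped as Transitional.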

Strict documents are correctly guessed as Strict and remain so,
but Tidy and the W3C validator seem to disagree on a few things,
such as whether CITE and P belong inside BLOCKQUOTE. As I recall,
that is the reason Tidy sees Transitional even when the source
really is a Strict-compliant document.

Also, Tidy does not yet know about XHTML Basic; it guesses that
doctype as Strict and overwrites the DTD accordingly. No real
harm done, except that when I try to demonstrate that doctype and
Tidy recursively turns everything into Strict, it defeats the
point of having that new subset in the first place.


Looking back, the bug appears to systematically affect Frameset
and Basic doctypes, while the overall validity interpretation bugs
are most probably caused by disagreements about what is an inline
element and what is a block element.

(On a side note, I never cease to be amazed at the discrepancies
between what Tidy, the W3C validator and the WDG validator consider
to be valid content for block elements.)

I hope the above will be of help.

Martin-Éric Racine,
Helsinki, Finland.

Received on Monday, 16 July 2001 23:45:24 UTC