RE: how could Chinese, etc. "look like 3.2?"

Hi Dan,

Point taken about the character encoding and the HTML version.  In principle,
the Tidy config should have no impact here, but in practice, it may well.

I would assume that Tidy is _not_ applying this logic (treating Chinese, etc.
characters as ruling out anything before HTML 4) and is downgrading the
HTML version because of some other feature of the document.  Are there other
warnings?  I also assume that this build has Asian encodings enabled
(#define SUPPORT_ASIAN_ENCODINGS 1).  Or are you using -raw?

Btw, can we use this document as a test case for Big5 encoding?
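
Before it goes in as a test case, it may be worth confirming that the file
really is Big5.  A minimal sketch, assuming Python is at hand;
looks_like_big5 is a hypothetical helper of mine, not anything in Tidy, and
a clean decode only suggests Big5, it does not prove it:

```python
def looks_like_big5(data: bytes) -> bool:
    """Return True if data contains non-ASCII bytes and decodes as Big5.

    Hypothetical helper for illustration only; it is not part of Tidy.
    """
    # Pure ASCII input: the encoding question does not arise.
    if all(byte < 0x80 for byte in data):
        return False
    try:
        # Python's built-in "big5" codec raises on invalid byte sequences.
        data.decode("big5")
    except UnicodeDecodeError:
        return False
    return True


if __name__ == "__main__":
    import sys

    # Usage (assumed file name): python check_big5.py foreigner.html
    with open(sys.argv[1], "rb") as handle:
        print(looks_like_big5(handle.read()))
```

Other double-byte encodings can also decode without error, so treat a True
result as a sanity check rather than a guarantee.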

take it easy,
Charlie

-----Original Message-----
From: Dan Jacobson [mailto:jidanni@yam.com.tw]
Sent: Monday, December 03, 2001 1:08 AM
To: html-tidy@w3.org
Subject: how could Chinese, etc. "look like 3.2?"


What I'm saying is that before HTML 4, any file with Chinese,
etc. characters in it is illegal and will not pass the validator,
e.g. http://www.htmlhelp.com/tools/validator , so how could tidy say
"looks like 3.2"?

For example take http://www.geocities.com/jidanni/foreigner.html
$ tidy01nov01 foreigner.html > /dev/null 

HTML Tidy for Linux/x86 (vers 1st November 2001; built on Nov  9 2001, at 00:06:22)
Parsing "foreigner.html"

foreigner.html: Doctype given is "-//W3C//DTD HTML 4.01//EN"
foreigner.html: Document content looks like HTML 3.2

This has nothing to do with ~/.html-tidy.

>>>>> "Bjoern" == Bjoern Hoehrmann <derhoermi@gmx.net> writes:

Bjoern> * Dan Jacobson wrote:
>> stdin: Doctype given is "-//W3C//DTD HTML 4.01//EN"
>> stdin: Document content looks like HTML 3.2
>> 
>> Erm, does a file crammed with big5 Chinese qualify as "looks like HTML
>> 3.2"?

Bjoern> I am sure you understand we cannot help you with this issue without
Bjoern> information on the config options, the HTML file and the Tidy version
Bjoern> you used to produce this.
-- 
http://www.geocities.com/jidanni/ Tel+886-4-25854780

Received on Monday, 3 December 2001 12:28:36 UTC