RE: tidy problems on www.altavista.com

you are, of course, right. however, i need to parse files as the regular
browser parses them, that is, as the writer of the page intended on them to
be parsed. more than that, i think tidy is wrong here. td is an inline tag,
while form is a block, so <td> followed by <form> should be converted to
<td></td><form>. the same thing goes to the /form.

ittay

-----Original Message-----
From: Alexander Biron [mailto:biron@ifh.de]
Sent: Thursday, June 22, 2000 11:11 AM
To: Ittay Freiman
Cc: 'html-tidy@w3.org'
Subject: Re: tidy problems on www.altavista.com


Hi Ittay,

On Thu, 22 Jun 2000, Ittay Freiman wrote:

> i'm having trouble parsing this page, and it seems that so does tidy.
>  
[...]
>  
> well, the trouble here that this output, while legal, isn't a functional
> page (you can't search with it).
>  
> so, what do you think?

Commercial sites have other intentions than tidy or most other parsers:

Commercial sites want a compromise between maximum usability and maximum
desired features. They want to gain as much out of the present browser 
situation as possible by finetuning syntaxes etc. If some syntax is not
standards-compliant, they don't care as long as it does not reduce the
number of their users (customers). 
I.e. the browsers that virtually all their users browse 
with (NN4+, IE4+) have to understand the syntax as intended. (The
number of customers that boykott pages with illegel HTML syntax is
negligible). Some pages (e.g. some Arts pages) set a pretty high
priority on "nice" features, others (e.g. Yahoo) set a pretty high
priority on maximum usability. I.e. the compromise looks different for
each site. All in all, one might say they want their page's syntax to be
compliant to a "maximum revenue HTML" standard.

Tidy on the other hand wants a consistent standard. So it wants the
syntax to be compliant to HTML standards set by the W3C. 


Another point to have in mind is the following: Large commercial sites
like altavista are managed very differently than your good old private
homepage. While you may have your html files sitting on some server,
commercial sites tend to have databases. For them some script/program
rebuilds HTML pages either on demand or in regular intervals. They only
need to adjust the few scripts to comply to some different HTML syntax
and that's it. So altavista simly tells a few of it's
programmers: "Please rearange the ordering of <table> and <form> tags to
our new standard" and they are done with fractions of their resources in
a few hours. An individual webmaster using tidy to help him cope with
his files would have to spend much more time of his full resources
to readjust his pages' HTML likewise. He therefore might prefer stable
solutions where he can say "Netscape 6.0 is out - no problem".

-- 
Cheers alex          Alexander Biron

Support the ban of Dihydrogen Monoxide: http://www.dhmo.org/

work:	http://www.ifh.de/~biron/	private:
	Tel (+49)33762-77-483   	Tel(+49)30-4948857
	mailto:biron@ifh.de    		mailto:biron@frohnau-flamingos.de

Received on Thursday, 22 June 2000 04:15:09 UTC