
RE: Flagging & in URL in HTML 4.01 transitional type.

From: Peter Foti (PeterF) <PeterF@SystolicNetworks.com>
Date: Fri, 8 Jun 2001 16:01:08 -0400
Message-ID: <A10A983C9DFBD4119F0300104B2EA6B7085CE5@ZIPPY>
To: "'www-validator@w3.org'" <www-validator@w3.org>
> > > my opinion is that
> > > this is an artificially created tempest in a teapot, created by the
> > > failure of the validation suite writer to provide a "pedantic" mode.
> > > Or the failure of the specification writers to create an exception
> > > for this in the transitional type.
> > 
> > Your opinion is flawed.
> You make that statement without supporting it.

I supported it by explaining why the & must be escaped.  There should
not be any exception to this, otherwise it is invalid (per the DTD).
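
To make the escaping rule concrete, here is a minimal Python sketch of
building an HREF attribute whose query string separator is written as
&amp;.  The helper name href_attribute and the example.com URL are
made up for illustration; html.escape and urllib.parse.urlencode are
standard library functions.

```python
from html import escape
from urllib.parse import urlencode


def href_attribute(base_url, params):
    """Return an href="..." attribute with the URL escaped for HTML.

    Hypothetical helper for illustration only.
    """
    # urlencode joins the parameters with a raw "&", which is correct
    # for the URL itself but not for HTML attribute text.
    url = base_url + "?" + urlencode(params)
    # escape() turns "&" into "&amp;" (and quotes into entities), so the
    # attribute value no longer contains a bare ampersand.
    return 'href="' + escape(url, quote=True) + '"'


print(href_attribute("http://example.com/search", {"q": "html", "page": "2"}))
# prints: href="http://example.com/search?q=html&amp;page=2"
```

The browser decodes &amp; back to & before requesting the URL, so the
server still sees the same query string.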

> > > If browsers didn't accept this construct, 98% of the web would
> > > break. A significant portion of the web would break for the
> > > foreseeable future as well, so it is not a simple question of
> > > coalescing support to move in the direction of compliance.
> > 
> > 98% eh?  That's quite a bit of invalid code floating around then,
> > isn't it?
> Yes, which is why I would say that it is not really invalid, just
> not compliant with the strict HTML 4 specification.

Bad choice of wording on my part.  To clarify, "that's quite a bit of
invalid HTML code based on the DTD."  An HTML document does not need to
follow the rules unless you want it to validate.  While it SHOULD
conform to the set recommendations, there is nothing that says I can't
create HTML pages with my own tags for that matter.  However, not every
browser need support my tags, just as not every browser need support
invalid HTML (like unescaped ampersands).  The only thing I can suggest
to you would be to create your own DTD in which the ampersand was not
invalid as a standalone item.  But like it or not, it is invalid in
existing DTDs.
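
For anyone who wants to find the offending ampersands before running a
page through the validator, the following is a rough Python sketch.  It
is a regex heuristic, not a DTD-based check: it flags any & that is not
followed by something shaped like a complete character or entity
reference, and it does not know about comments or SCRIPT content.  The
names BARE_AMP and find_bare_ampersands are made up for illustration.

```python
import re

# An "&" is accepted only when it starts something shaped like a
# reference: &name;  &#123;  or  &#x1F;  (heuristic, assumption).
BARE_AMP = re.compile(
    r"&(?!(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#[xX][0-9A-Fa-f]+);)"
)


def find_bare_ampersands(markup):
    """Return the string offsets of ampersands that look unescaped."""
    return [match.start() for match in BARE_AMP.finditer(markup)]


sample = '<a href="page.cgi?a=1&b=2&amp;c=3">link</a>'
for offset in find_bare_ampersands(sample):
    # Show a little context around each suspicious ampersand.
    print(offset, sample[offset:offset + 10])
```

In the sample, only the & before b=2 is reported; the &amp; before c=3
is already escaped and is left alone.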

> > I think your guess is extremely high (and wrong) and that you have
> > no data to support your theory.
> A recent study cited by CNET.com stated that over 50% of web traffic
> was concentrated on 4 sites (Yahoo, AOL, Microsoft, and CNET).  All but
> Microsoft have unescaped & characters in their HTML parameters directly
> on their home page; all have links on the home page that lead to pages
> that have the unescaped character. My extensive experience with
> thousands of other web sites on a programming level suggests that the
> vast, vast majority is no different.

Note also that those sites do not claim to use valid HTML.  There is no
DOCTYPE declaration in the pages on these sites, suggesting that they do
not care about creating valid HTML documents.  Sad but true.  Some of
these sites also use attributes like TOPMARGIN="0" LEFTMARGIN="0"
MARGINWIDTH="0" MARGINHEIGHT="0".  These are also invalid and widely
used.  Would you then argue that these should be allowed as well, just
because so many people use them?

> > > In my opinion that validation is pedantic, and should certainly not
> > > be flagged in the HTML 4.01 transitional type.
> > 
> > In my opinion, you should maybe do some more homework on the topic.
> > 
> I have done plenty of homework. You, on the other hand, have not shown
> me much more than a "Mary, Mary quite contrary" imitation.

Well, sorry you feel that way.  I've explained to you why it's invalid
(which you claim to already have understood).  If you can't grasp that
concept, then there is nothing more I can offer.  It's invalid... accept
it and move on!


> -- 
> Red Hat, Inc., 3005 Nichols Rd., Hamilton, OH  45013
> phone +1.513.523.7621      <mheins@redhat.com>
> Research is what I'm doing when I don't know what I'm doing.
> -- Wernher Von Braun
Received on Friday, 8 June 2001 15:56:15 UTC
