- From: Mike Heins <mheins@redhat.com>
- Date: Fri, 8 Jun 2001 15:26:07 -0400 (EDT)
- To: www-validator@w3.org
- Cc: "Peter Foti (PeterF)" <PeterF@SystolicNetworks.com>
Quoting Peter Foti (PeterF) (PeterF@SystolicNetworks.com):
> > Since every browser in the world must tolerate &,
>
> No, actually most do tolerate the & without encoding, however, it is
> foolish to say that every browser MUST tolerate it.  There is a reason
> why the & must be escaped, you just aren't bothering to ask why.

From a pedantic technical perspective, I *understand* why. Every browser
*must* tolerate it if it is to have a chance of being usable for the
next 10 years, OK. If the browser has no intention of being usable for
anything but HTML4-compliant sites, a percentage that will struggle to
achieve double digits in the next five years, OK.

> Consider this: certain characters (like < or > ) must be escaped so
> that the browser knows that it is not part of the HTML code.  For
> example, if I want my page to display:
>
> 0 < 1 & 2 > 1

I know all of the technical reasons.

> then the browser needs to know that this is not part of an HTML tag.  So
> the special characters need to be escaped.  So < becomes &lt; and >
> becomes &gt;
>
> But now we have created a new special character that the browser has to
> look for... the & signifies the beginning of an escaped sequence now.
> So therefore, whenever a browser sees an & it needs to see if there's an
> entity that represents a special character.  Therefore, to display an &,
> we escape it with &amp;
>
> > my opinion is that
> > this is an artificially created tempest in a teapot, created by the
> > failure of the validation suite writer to provide a "pedantic" mode.
> > Or the failure of the specification writers to create an exception
> > for this in the transitional type.
>
> Your opinion is flawed.  You make that statement without supporting it.
>
> > If browsers didn't accept this construct, 98% of the web would break.
> > A significant portion of the web would break for the foreseeable future
> > as well, so it is not a simple question of coalescing support to move
> > in the direction of compliance.
>
> 98% eh?  That's quite a bit of invalid code floating around then, isn't
> it?

Yes, which is why I would say that it is not really invalid, just not
compliant with the strict HTML 4 specification.

> I think your guess is extremely high (and wrong) and that you have
> no data to support your theory.

A recent study cited by CNET.com stated that over 50% of web traffic
was concentrated on 4 sites (Yahoo, AOL, Microsoft, and CNET). All but
Microsoft have unescaped & characters in their HTML parameters directly
on their home page; all have links on the home page that lead to pages
that have the unescaped character. My extensive experience with
thousands of other web sites on a programming level suggests that the
vast, vast majority is no different.

> However, the fix for this would of
> course be for lazy web page designers to do it right the first time and
> use &amp; instead of &.  Fortunately (or unfortunately, if you want more
> standards based, clean code to be developed) most browsers will simply
> understand that a standalone & without any known escaped character
> sequence following it, is just an ampersand, and they will display it as
> such.
>
> > In my opinion that validation is pedantic, and should certainly not
> > be flagged in the HTML 4.01 transitional type.
>
> In my opinion, you should maybe do some more homework on the topic.

I have done plenty of homework. You, on the other hand, have not shown
me much more than a "Mary, Mary quite contrary" imitation.

--
Red Hat, Inc., 3005 Nichols Rd., Hamilton, OH 45013
phone +1.513.523.7621      <mheins@redhat.com>

Research is what I'm doing when I don't know what I'm doing.
-- Wernher Von Braun
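
For reference, the escaping rule argued over in the quoted text can be
sketched in a few lines of Python. This is only an illustration under
stated assumptions, not anything posted in the thread: the example.com
URL and the variable names are hypothetical, and html.escape / html.unescape
are the Python standard-library functions, not part of the validator.

```python
from html import escape, unescape

# The text from the quoted example that should be displayed literally.
text = "0 < 1 & 2 > 1"
escaped_text = escape(text, quote=False)
print(escaped_text)

# Hypothetical URL with two query parameters joined by a bare ampersand.
url = "http://example.com/search?q=html&page=2"

# Inside an HTML attribute the ampersand is written as &amp; so that a
# parser does not try to read "&page" as the start of an entity.
href = escape(url, quote=True)
print(f'<a href="{href}">search</a>')

# Decoding the attribute value gives back the original URL, which is
# what a conforming browser would actually request.
print(unescape(href) == url)
```

The unescaped form is the one at issue in this thread: most browsers
accept it, but the validator flags it.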
Received on Monday, 11 June 2001 03:29:52 UTC