- From: Christian Smith <csmith@barebones.com>
- Date: Thu, 13 Apr 2000 23:41:19 -0400
- To: TimP <tim@paneris.co.uk>
- cc: W3C Validator <www-validator@w3.org>
On Friday, April 14, 2000 at 01:49, tim@paneris.co.uk (TimP) wrote:

> Thankyou, I know that that is a proposed solution, but I think it is
> very ugly.
>
> I am trying to get an answer to a deeper point.
>
> The definition of CDATA within SGML depends upon whether it is used in
> the context of an element content definition or an attribute definition.
> This 'asymetry'[1] has let us into the position where we have to encode
> VALID urls within HTML.
>
> I want to understand why this asymetry exists and why it is tolerated. I
> really like SGML, and have used it successfully in a few projects,
> (though I would not claim to know it in detail), but I cannot persuade
> my collegues of its benefits whilst it forces the requirement to encode
> URLs upon them.

I don't know where you picked up this "asymetry" idea or exactly what you mean by it, but let me try to cover some points here.

This is a valid URI:

    http://www.company.com/cgi-bin/search?foo&bar

Now, the definition of the HREF attribute of the A element states that it is CDATA. The definition of CDATA has a number of requirements, and one of them is that an & MUST be encoded as &amp;amp; (or its numeric equivalent, &amp;#38;). The HTML spec also notes in a comment that an HREF takes a URI as its value. But because the value of an HREF must be CDATA, you must HTML entity-encode certain characters if they appear in the URI.

Let's look at some examples.

Example 1:

    bad  - <a href="http://www.company.com/search?foo/bar">

In the above example the content of the href is CDATA, but it is NOT a valid URI, because we have a / which is not being used in its reserved location and which therefore needs to be URI encoded.

    good - <a href="http://www.company.com/search?foo%2Fbar">

Example 2:

    bad  - <a href="http://www.company.com/search?foo&bar">

In the above example the content of the href is a valid URI, but it is NOT CDATA, because we have an & which is not HTML encoded.

    good - <a href="http://www.company.com/search?foo&amp;amp;bar">

Perhaps this sheds some light on your confusion. I hope so.

-- 
Christian Smith | csmith@barebones.com | http://web.barebones.com
PGP Fingerprint - 60E5 2216 97D2 1D1A B923 F036 00A9 CEC0 D411 FA89
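
A minimal Python sketch of the two encoding layers in the examples above. The helper name `build_href` and its parameters are hypothetical, chosen only for illustration. It assumes each query part is plain data that must be URI-escaped first, and that the finished URI is then escaped for use as an attribute value.

```python
from html import escape, unescape
from urllib.parse import quote


def build_href(base, query_parts):
    """Return (uri, attribute_value) for a link to base?part&part..."""
    # Layer 1 (URI): percent-encode reserved characters used as data
    # inside each query part, e.g. "/" becomes "%2F".
    query = "&".join(quote(part, safe="") for part in query_parts)
    uri = f"{base}?{query}"
    # Layer 2 (HTML): escape the finished URI for use as an attribute
    # value, so "&" becomes "&amp;".
    return uri, escape(uri, quote=True)


uri, attr = build_href("http://www.company.com/search", ["foo/bar", "baz"])
print(uri)   # http://www.company.com/search?foo%2Fbar&baz
print(attr)  # http://www.company.com/search?foo%2Fbar&amp;baz
# Undoing layer 2, as an HTML parser would, gives back the original URI.
assert unescape(attr) == uri
```

The order matters in this sketch: the URI is completed first and the HTML escaping is applied last, so the two layers never interfere with each other.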
Received on Thursday, 13 April 2000 23:40:55 UTC