- From: Etan Wexler <ewexler@stickdog.com>
- Date: Sun, 26 May 2002 01:37:12 -0700
- To: Bjoern Hoehrmann <derhoermi@gmx.net>, www-style@w3.org
- Cc: www-html@w3.org
Bjoern Hoehrmann wrote:

> No, please see http://www.w3.org/TR/html4/struct/links.html#h-12.2.1
>
> [...]
> * String matching: Comparisons between fragment identifiers and
> anchor names must be done by exact (case-sensitive) match.
> [...]

I read that section of the specification and could find nothing that legitimately contradicts my point. Perhaps the intent of the authors was otherwise, but taking the prose at face value and respecting the normative status of the reference to ISO 8879 (the SGML specification), I stand by my case.

A possibly confusing but crucial point in this discussion is the relationship between attribute values and their attribute value specifications. An attribute value specification, which is what appears in a start tag, may require some processing to derive the attribute value, which exists only in memory in the application. This is roughly comparable to the way a CSS processor derives computed property values from specified property values. For example, character references and entity references in an attribute value specification need replacement.

Another form of processing that an SGML parser must do to attribute value specifications for attributes with declared values of 'ID' is the character substitution as specified in the SGML declaration. HTML4's SGML declaration has chosen to enable this character substitution and to replace lower-case English letters with their upper-case equivalents. Thus, HTML4 'ID' attribute value specifications may be in any case or case combination while the resultant values are always and completely in upper case.

I quote from HTML4 section 12.2.1:

    An anchor name is the value of either the name or id attribute
    when used in the context of anchors.

The anchor name is not the attribute value specification, but the attribute value. It is against this string that a fragment identifier must match.

The situation is further confused by the 'NAME' attribute. With a declared value of 'CDATA', the attribute value specification of the 'NAME' attribute undergoes no character substitution and may have lower-case letters in its value.

If I have an HTML4 element with the start tag <A ID="AnElement">, the correct fragment identifier is "#ANELEMENT". If I have an HTML4 element with the start tag <A NAME="AnElement">, the correct fragment identifier is "#AnElement". One of the impacts of this situation is that the start tag <A ID="AnElement" NAME="AnElement"> is illegal in HTML4 as it produces two anchor names differing only in case.

> Maybe you like to refer to
>
> http://www.w3.org/2002/02/mid/20010706174702X.mimasa@w3.mag.keio.ac.jp
>
> for additional information on this contradiction in the specification.

So far as I can tell, there is no contradiction. Rather, there are some tricky and implicit requirements and results that elude not only beginners but also most of us immersed in HTML. This regrettable state of affairs is probably inevitable when one tries to reconcile a lengthy and rigorous ISO standard with the loose practices of tag soup and the software which eats it.

In any case, I reiterate my suggestion that the Working Group either switch to XHTML or give attribute value specifications in upper case. Taking either path will eliminate the ambiguity.

--
Etan Wexler <mailto:ewexler@stickdog.com>
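
Editorial illustration: the distinction the message draws between an attribute value specification and the resulting attribute value can be sketched in Python. This is only a model of the reading argued above, not the behavior of any real SGML parser or browser. The function names `anchor_name` and `fragment_matches` are hypothetical, and character and entity reference replacement is left out for brevity.

```python
import string

# Assumption (per the message's reading of HTML4's SGML declaration):
# only the lower-case English letters a-z are replaced by A-Z.
_FOLD = str.maketrans(string.ascii_lowercase, string.ascii_uppercase)


def anchor_name(attribute: str, specification: str) -> str:
    """Derive the anchor name from an attribute value specification."""
    kind = attribute.upper()
    if kind == "ID":
        # Declared value 'ID': character substitution applies.
        return specification.translate(_FOLD)
    if kind == "NAME":
        # Declared value 'CDATA': no character substitution.
        return specification
    raise ValueError(f"not an anchor attribute: {attribute!r}")


def fragment_matches(fragment: str, name: str) -> bool:
    """Exact, case-sensitive match, as HTML4 section 12.2.1 requires."""
    if fragment.startswith("#"):
        fragment = fragment[1:]
    return fragment == name


print(anchor_name("ID", "AnElement"))    # ANELEMENT
print(anchor_name("NAME", "AnElement"))  # AnElement
print(fragment_matches("#AnElement", anchor_name("ID", "AnElement")))  # False
```

Under this model, "#AnElement" reaches the anchor declared with NAME but not the one declared with ID, which is the asymmetry the message describes.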
Received on Sunday, 26 May 2002 04:40:34 UTC