- From: Shane P. McCarron <ahby@themacs.com>
- Date: Tue, 18 May 1999 07:37:22 -0500
- To: Bill Smith <bill.smith@sun.com>
- CC: shane@themacs.com, Tim Bray <tbray@textuality.com>, Steven Pemberton <Steven.Pemberton@cwi.nl>, w3c-xml-cg@w3.org, w3c-html-wg@w3.org, www-html-editor@w3.org, w3c-xml-linking-wg@w3.org
Bill Smith wrote:

> It may be that I misunderstand how this technology works but I fail to see
> how HTML tidy, when run over a single document instance, will cause all
> referring URLs (from other documents) to be properly updated. A simple
> example:
>
> In HTML 4.0 the following is legal:
> <A NAME="bill's-address">
>
> In XHTML 1.0 this becomes (with the help of a tool like HTML tidy)
> <A NAME="bill's-address" ID="bill-0039s--address">

Okay - Good example. I get it now. And I see how this might be an issue
if implementors and users use bad fragment identifiers. However, we
cannot define behavioral requirements for incorrect implementations or
incorrect usage.

Let me be clear: HTML 4.0 requires that the NAME and ID attributes share
a namespace. It further defines the ID attribute as type ID. While the
DTD does not explicitly define the type ID, I infer it to mean the
SGML/XML definition of that type. This is backed up by the fact that the
HEADERS attribute of the TD element and the FOR attribute of the LABEL
element are both of type IDREF.

Therefore, the allowable set of identifiers in HTML 4.0 for the ID
attribute should, necessarily, be the same as the allowable set of
identifiers for the NAME attribute in HTML 4.0. Even though NAME is
declared as CDATA, its set needs to be constrained by the set for ID, or
fragment references from LABEL and TD would not work as expected.

So, even though your example was likely legal in HTML 3.2, and might
even work in some implementations, I do not believe it is legal usage in
HTML 4.0. In any event, I am happy to assert that it is NOT legal usage
through prose to that effect, or through a redeclaration of the NAME
attribute as some other type that makes it clear it is a restricted
portable character set (if such a type can be found).

> I've used "-" as an escape character in this example. It's a valid
> character in attribute values of type ID and should allow us to manually
> translate CDATA NAME attribute values to ID ID attribute values. If I've
> thought about this correctly, I now have a document instance that can be
> served as HTML or XHTML.

Sort of. HTML and XHTML require that the NAME and ID namespaces be
shared. To me this means (also) that the NAME and ID attributes of a
single element must necessarily be the same. However, this is not
explicit in HTML 4.0. We could certainly make it explicit in XHTML 1.0
if that would assist with translation.

Anyway, I don't think that your example meets this requirement. It is
not a Strictly Conforming XHTML 1.0 Document, as we have it defined.
Therefore, the behavior of user agents that process it is unspecified.

> But all of the documents that refer to this document instance will still
> have fragment IDs of the form "bill's-address" and these fragment IDs will
> fail when the resource retrieved is of type XML - unless some form of
> fallback to HTML 4.0 behavior is specified for all XML.
>
> Basically, the transition (for fragment IDs) from an attribute of type
> CDATA to one of type ID will be problematic unless XML-generic processing
> of these fragment IDs follows HTML 4.0 specific semantics. Webs of
> documents will possibly cease to function properly when the conversion
> occurs or at some point in the future after everyone working on the
> conversion has moved on.

Well - there may be that danger. However, I don't at this time see an
easy way around it other than making it clear that the set of characters
that can be used in NAME is restricted to the set that can be used in
an ID. I believe this is what was meant by HTML 4.0, and I know this is
what we mean in XHTML 1.0.

Would making such a restriction explicit help assuage your concerns?

--
Shane P. McCarron                  phone: +1 612 434-4431
Testing Research Manager             fax: +1 612 434-4318
                                  mobile: +1 612 799-6942
                                  e-mail: shane@themacs.com

OSF/1, Motif, UNIX and the "X" device are registered trademarks in
the US and other countries, and IT DialTone and The Open Group are
trademarks of The Open Group.
Received on Tuesday, 18 May 1999 08:37:18 UTC