- From: Bill Smith <bill.smith@sun.com>
- Date: Thu, 13 May 1999 11:50:37 -0700
- To: Steven Pemberton <Steven.Pemberton@cwi.nl>, w3c-xml-cg@w3.org
- Cc: w3c-html-wg@w3.org, www-html-editor@w3.org, w3c-xml-linking-wg@w3.org
Comments below are my personal views and may not be shared by other members of the XML Linking WG. This message has been copied to the XML Linking WG where I will encourage further discussion since it appears the HTML WG's decision to not adopt our proposal may have significant impact on XML linking specifically and the XML activity in general.

At 10:30 PM 5/12/99 +0200, Steven Pemberton wrote:
>HTML WG Comments on "Liaison statement on fragment identifiers from
>Linking WG" Tue, 20 April 1999
>
>http://lists.w3.org/Archives/Member/w3c-xml-cg/1999Apr/0061.html
>
>The HTML WG appreciates the input from the Linking WG, and discussed
>it fully at our recent FtF. We entered the meeting believing we would
>implement the suggested changes.
>
>However after some discussion we came to the realisation that the
>suggestions were based on a misunderstanding of the role of XHTML 1.0.
>
>To summarise:
>
>   The intention of XHTML 1.0 is that it is primarily an XML
>   application, and should be served as an XML application.
>
>   However, we believe it is desirable that in the short term to
>   have a transition period, to ease the transition from HTML to
>   XHTML, and to leverage on the preponderance of old-HTML user
>   agents being used in the world. Thanks to a quirk in most old
>   user agents, if you follow a small number of guidelines you
>   can serve the XML version of HTML as (old) HTML to non XML
>   user agents.

Based on the statements above, I doubt that there was any misunderstanding on the part of the XML Linking WG as to the HTML WG's intended role of XHTML. We are well-aware that XHTML 1.0 is intended to be an XML application. Further, we are aware that it is desirable to ease the transition from HTML to XHTML specifically or to XML generally. We are further aware that other users of XML (RDF in particular) have adopted syntax/semantics for XML fragment identifiers that are in conflict with XHTML's intended use.

Several members of the XML Linking WG and IG participated in the original formulation of XML and the lengthy discussions on acceptance, transition, and market acceptance. If memory serves me, all linking WG/IG members that participated in those discussions, at one time or another argued strongly in favor of making accommodations in XML that would facilitate its acceptance by HTML users. We continue to endeavor to ease transition/acceptance but have an equal responsibility to ensure that XML and its related standards are general purpose and well-suited to broad application usage scenarios.

The simple, but hard fact is that HTML 4.0 is not an XML application. In particular, HTML's reliance on the CDATA type NAME attribute as the referent of a fragment identifier makes the transition to XML difficult at best. As was described in our liaison document, XPointer intends to apply the semantic that "naked" fragment identifiers refer to elements that have a type ID attribute whose value matches that specified in the fragment identifier. I do not believe it possible to have structured fragment identifiers for XML if the referent attribute is of type CDATA.

>Since XHTML is principally an XML application, we want to act as far
>as possible like a normal XML application. (This is the reason we
>thought we would be adopting the Linking WG's suggestions). This
>includes using attributes of type ID as targets of fragment
>identifiers.

It is highly desirable if not imperative that XHTML behave like a normal XML application. HTML is arguably the most widely used markup language and its use is growing rapidly. If a transition from HTML to XHTML occurs, and I see no reason that it should not, XHTML could become the most widely used markup language based on XML. This would be an important step in the broad adoption of XML and would provide the basis for the development of a broad class of web-based applications.

However, if XHTML is in some way "abnormal" with regard to XML, the HTML WG will have done a disservice to those responsible for the development of XML and the larger web community. XML is general purpose and we are endeavoring to develop similarly general purpose companion standards like XPointer and XLink to enable the broadest possible class of applications while allowing users the widest choice of language/vocabulary.

I quote two paragraphs from our liaison document below:

_____________

The Strawman 1 proposal requires less work, requiring only an adjustment to a single Working Draft (XPointer). However, it represents a sacrifice of simplicity and consistency in the general case in order to address a very specific problem. This is serious, since XML is simple and flexible enough that there is a real possibility of it serving as the basis for distributed hypertext for a very long time indeed.

Strawman 2 requires co-ordinated updates to XHTML and to XML 1.0, and increases the conversion cost for those who want to serve legacy HTML as XML. However, it preserves the simplicity of the general case, and better meets the goal of "leading the Web to its full potential."

_____________

In brief, our goal is to develop standards and recommendations for use that have broad applicability well into the future. We are well-aware of backwards compatibility issues and the need to provide transition strategies. However, mechanisms and recommendations that rely on quirks or application-specific semantics should be avoided.

>Now, XHTML 1.0 is just an XMLised version of HTML 4.0, as close as
>possible to HTML 4.0 taking into account the requirements of XML.

As I mentioned above, HTML 4.0 is not an XML application. Further the application specific syntactic and semantic mechanisms employed in HTML 4.0 are at times at odds with the general purpose syntactic and semantic mechanisms being developed as companions to XML.

As a consequence, the transition to XHTML 1.0 (as XML) may not be as simple as some predicted. This dissonance is quite normal and I've seen it many times before whenever I try to generalize an otherwise specific solution/application. Those things that were natural, easy, or obvious in the specific case become unnatural, difficult, or convoluted in the general case. Frequently, it is necessary to abandon the application specific shortcuts in favor of the general purpose mechanisms.

>HTML 4.0 allows the use of both NAME (on A elements) and ID (on any
>element) as the target for fragment identifiers. Here are some
>extracts from the HTML 4 recommendation (http://www.w3.org/TR/REC-html40/):
>
>   7.5.2 Element identifiers: the id and class attributes
>
>   The id attribute has several roles in HTML:
>
>     As a style sheet selector.
>     As a target anchor for hypertext links.
>     ...
>
>   12.2 The A element
>
>   name = cdata [CS]
>     This attribute names the current anchor so that it may be the
>     destination of another link. The value of this attribute must be
>     a unique anchor name. The scope of this name is the current
>     document. Note that this attribute shares the same name space as
>     the id attribute.
>
>   12.2.1 Syntax of anchor names
>
>   An anchor name is the value of either the name or id attribute when
>   used in the context of anchors. Anchor names must observe the
>   following rules: ...
>
>   12.2.3 Anchors with the id attribute
>
>   The id attribute may be used to create an anchor at the start tag of
>   any element (including the A element).
>
>In many ways this is a hack (and probably introduced to allow a
>transition to using ID). We regard the NAME attribute of A elements as
>a historical anomaly that should be phased out.

One possibility is to phase it out with the transition to XML. It is possible (if not likely) that other things will break with this transition and now might be the best time to incur the pain of change.

Another possibility would be to adopt Strawman 2 from the liaison document and accept the offer to help coordinate the requisite changes to various recommendations.

>Since HTML documents have to be rewritten (or transformed) to put them
>in XHTML form anyway, we decided to adhere to XML Linking suggestions
>of using ID as targets, and not NAME anymore. However, to aid the
>transition process mentioned above, we left NAME in the DTD for
>documents that need to be served to old user agents.

While I have significant concern for new documents served to old user agents, I have an equal concern for new documents served to new user agents. In particular, I am quite concerned that a new class of (XML) user agents will be developed that "understand" XHTML but these new user agents will support the syntax and semantics of HTML 4.0 in order to "aid the transition process". This "quirk" will have long-lasting and far-reaching impact.

As a consequence, the semantics of a "naked" XML fragment identifier will be to refer to an element that has either a type ID attribute that matches the fragment identifier or to an element that has a type CDATA attribute NAME whose value matches the fragment identifier. This effectively precludes the use of the attribute name NAME in any XML vocabulary unless the user intention is for the specific semantic specified in HTML 4.0. I can imagine many other intended uses for an attribute named NAME and it would be unfortunate to limit XML's otherwise open vocabulary policy.

>Our chief reasons therefore for not adopting the Linking WG's
>suggestions are:
>
>   We want to phase NAME out in the longer term. We think that
>   fragment identifiers should refer to the ID attribute.
>
>   If something is called ID, it should be of type ID.
>
>   Making NAME to be of type ID, and ID to be of another type
>   would break the use of CSS on XHTML documents, and create a
>   semantic difference in how CSS works on a document served as
>   HTML and XHTML.
>
>This is also the reason we provide a guideline to use both NAME= and
>ID= on A elements when you wish to serve XHTML as HTML, to ensure
>consistent behaviour across user agents.

Deprecation is always best when done over the long-term as opposed to the short-term. However, this is not simply deprecating an attribute that is a historical anomaly. Moving the syntax/semantics associated with the HTML 4.0 attribute NAME into general XML usage (whether explicit or implied) will have long-lasting and far-reaching consequences. It is my belief that the guideline/admonition will be generally ignored by users and as a consequence, application developers will ascribe the HTML 4.0 semantic to the NAME attribute for all XML documents. This would be most unfortunate and I urge the HTML WG to reconsider this decision. An option might be to register XHTML as its own MIME type rather than as generic XML.

>
>Steven Pemberton
>May 1999
>
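
To make the concern about "naked" fragment identifiers concrete, here is a minimal sketch of the two resolution rules discussed above: one that matches only attributes of type ID, and one that also falls back to a CDATA attribute called NAME. It is an illustration only, not taken from any WG draft. The function names, the sample vocabulary, and the ID_ATTRS table (a stand-in for the attribute types a DTD would declare, since this parser does not read them) are all assumptions.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for DTD declarations: for each element type,
# the name of the attribute declared with type ID.
ID_ATTRS = {"section": "id", "part": "partno"}

# A made-up non-HTML vocabulary that uses "name" for its own purposes.
DOC = """<catalog>
  <part name="intro" partno="p1"/>
  <section id="s1"/>
</catalog>"""


def resolve_id_only(root, fragment, id_attrs):
    """Return the element whose ID-typed attribute equals the fragment."""
    for elem in root.iter():
        attr = id_attrs.get(elem.tag)
        if attr is not None and elem.get(attr) == fragment:
            return elem
    return None


def resolve_id_or_name(root, fragment, id_attrs):
    """HTML 4.0 style rule: try ID-typed attributes, then any 'name' attribute."""
    hit = resolve_id_only(root, fragment, id_attrs)
    if hit is not None:
        return hit
    for elem in root.iter():
        if elem.get("name") == fragment:
            return elem
    return None


root = ET.fromstring(DOC)
id_only = resolve_id_only(root, "intro", ID_ATTRS)
id_or_name = resolve_id_or_name(root, "intro", ID_ATTRS)
```

Under the ID-only rule the fragment "intro" matches nothing in this vocabulary, while the fallback rule selects the part element merely because its author happened to use an attribute called name. That is the kind of unintended HTML 4.0 semantic described above.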
Received on Thursday, 13 May 1999 14:52:29 UTC