Copyright © 2001 W3C® (MIT, INRIA, Keio), All Rights Reserved. W3C liability, trademark, document use, and software licensing rules apply.
XPointer [XPointer] is intended to be the definition of the fragment identifier language for the four main XML-related Internet media types [IETF RFC 3023]. As such, it has to be parsed whenever there is a fragment identifier on a URI that refers to an XML resource. Because fragment identifier processing is such a basic part of XML on the Web and needs to be efficiently supportable on a wide range of devices and applications, this proposal suggests a small subset of XPointer functionality, called FIXptr, for this purpose.
This document has no official status at this time. It is a proposal offered for consideration by some of the members of the XML Linking Working Group. We request that the Working Group consider reducing XPointer to FIXptr size and issuing a third Last Call Working Draft in order to attract implementors and vendors, the goal being to ensure a W3C Recommendation whose deployment as a fragment identifier language is successful. Additional features could be part of a higher level of conformance (for non-user-agent purposes) or a follow-on version if this more modest feature set is successful. B Rationale describes the reasoning behind the proposed design.
1 Introduction
2 Terminology
3 Conformance
4 FIXptr Language
4.1 Syntax
4.2 Informal Semantics
4.3 Semantics in Terms of the Infoset
4.3.1 Name
4.3.2 Initial Selector
4.3.3 Child Sequence
4.3.4 Character Offset
A References
A.1 Normative References
A.2 Informative References
B Rationale (Non-Normative)
B.1 The Role of Fragment Identifiers
B.2 Data Model
B.3 Feature Set
B.4 Doing Ranges with FIXptr
C Scenarios for Pointing into XML (Non-Normative)
C.1 HTML Hyperlinking into an XML Document
C.2 Compiling Issues into an Issue List
C.3 Annotating a Document with Corrections
D Questions and Answers (Non-Normative)
E Production Notes (Non-Normative)
XPointer [XPointer] is intended to be the definition of the fragment identifier language for the four main XML-related Internet media types [IETF RFC 3023]. As such, it has to be parsed whenever there is a fragment identifier on a URI that refers to an XML resource. Because fragment identifier processing is such a basic part of XML on the Web and needs to be efficiently supportable on a wide range of devices and applications, this proposal suggests a small subset of XPointer functionality, called FIXptr, for this purpose.
This proposal specifies the addressing of any single element or character in the resource. Addressing is done with three constructs:
an optional initial ID,
a "child sequence" or "tumbler" that gives a path that descends the element tree,
and an optional terminal character offset.
The language defined in this proposal offers a subset of the features that XPointer offers. To aid in clear description, it is given a different name: Fragment Identifier for XML (FIXptr). FIXptr is specified in terms of the XML Information Set [Infoset].
B Rationale describes the reasoning behind the proposed design. C Scenarios for Pointing into XML provides some examples of how it would work.
[Definition: The key words must, must not, required, shall, shall not, should, should not, recommended, may, and optional in this specification are to be interpreted as described in [IETF RFC 2119].]
infoset: The information set of a document as defined by [Infoset].
information item: An item of information in a document's infoset.
FIXptr: A string, expected to be used as a fragment identifier, that satisfies the FIXptr language described in this proposal.
FIXptr processor: A software component that takes as input an infoset and a FIXptr and produces as output an identification of an information item in that infoset.
application: A software component that incorporates or uses a FIXptr processor because it needs to operate on URI references that are likely to refer to XML resources. For example, a Web browser or an XInclude [XInclude] processor might be such an application.
error: A violation of the rules of this specification; results are undefined.
FIXptr processing normatively depends on [IETF RFC 2396] (as updated by [IETF RFC 2732]) processing, including character escaping as defined in these RFCs.
FIXptr processing normatively expects as input infosets that have at least the following information items and properties:
[document element] property
[attributes] property
[children] property
[attribute type] property
[normalized value] property
It is an error if a FIXptr does not adhere to the syntactic requirements described in this specification.
A fragment identifier conforms to this specification if it does not satisfy any of the conditions for being in error as described in this specification.
Conforming FIXptr processors should report errors back to the application. Applications may terminate or recover from FIXptr errors in any way they choose.
This section describes the FIXptr language.
All FIXptrs match the following production (where Name is as defined in the XML Recommendation [XML]):
[1] fixptr        ::= (Name | init-selector) child-seq? char-offset?
[2] init-selector ::= '/1'
[3] child-seq     ::= ('/' [1-9] [0-9]*)+
[4] char-offset   ::= '(' [1-9] [0-9]* ')'
In other words, the name, child sequence, and character offset are all optional, but if the name is omitted, there must be a non-null child sequence and it must start with /1.
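The productions above can be checked mechanically. The following is a minimal sketch in Python, not a normative implementation; the function name parse_fixptr and its return shape are assumptions made for illustration, and Name is approximated with an ASCII-only pattern, whereas the XML Name production admits many more characters.

```python
import re

# ASSUMPTION: ASCII-only approximation of the XML Name production.
NAME = r"[A-Za-z_:][A-Za-z0-9_:.\-]*"
STEP = r"/[1-9][0-9]*"

# Production [1]: (Name | init-selector) child-seq? char-offset?
FIXPTR = re.compile(
    rf"(?:(?P<name>{NAME})|(?P<init>/1))"
    rf"(?P<child_seq>(?:{STEP})+)?"
    rf"(?:\((?P<offset>[1-9][0-9]*)\))?"
)


def parse_fixptr(text):
    """Split a FIXptr into (name, steps, offset).

    name is None when the pointer starts with the init-selector '/1';
    steps holds the child-seq integers that follow the starting point;
    offset is None when no char-offset is present.
    """
    match = FIXPTR.fullmatch(text)
    if match is None:
        raise ValueError(f"not a syntactically valid FIXptr: {text!r}")
    child_seq = match.group("child_seq") or ""
    steps = [int(step) for step in child_seq.split("/") if step]
    offset = match.group("offset")
    return match.group("name"), steps, int(offset) if offset else None
```

Under these assumptions, parse_fixptr("/1/2") yields (None, [2], None), while a string such as "/2/1" is rejected because a pointer without a name has to begin with /1.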
The optional initial Name locates the element in the document that has an ID-typed attribute whose value is the given name. The following child sequence locates an element by stepwise navigation using a sequence of integers separated by slashes (/); each integer n locates the nth child element of the previously located element. The final optional parenthesized integer gives a character offset; an offset n locates the nth child character node of the previously located element.
For example, the FIXptr intro/3/1/4(6) first locates the element with an ID attribute that has the value "intro", then locates the third child element of the "intro" element, then locates that element's first child element, then that element's fourth child element, then that element's sixth character child. Note that a FIXptr consisting of just a name provides, for resources with XML media types, an analog of the HTML fragment identifier behavior.
A FIXptr provides a way to specify a reference into a Web resource of Internet media type text/xml or application/xml using a URI reference. A URI reference containing a FIXptr as its fragment identifier specifies both a resource and a FIXptr specification. It is an error to put a FIXptr fragment identifier in a URI reference that specifies a resource that does not have an XML infoset. This specification, therefore, is defined to address into an infoset. More specifically, all FIXptrs address either a single element information item or character information item in the infoset of the specified resource. The rest of this section describes just which information item within the specified resource's infoset is located by a given FIXptr specification.
The Name component locates the unique element information item whose [attributes] property (which is an unordered set of attribute information items) contains an attribute information item whose [attribute type] property has the value “ID” and whose [normalized value] property matches the given name. It is an error if no such element information item exists or if more than one such element information item exists.
Note:
Since the Name component could be any string that satisfies XML's Name production, it might need to contain characters that cannot be represented directly in the current document's encoding and/or that may not be represented in unescaped form in a URI reference. Therefore, though a string does satisfy the syntax requirements for FIXptr, it might not be usable in its unescaped form as a fragment identifier in some circumstances. Appropriate escaping methods are described in [IETF RFC 2396] (as updated by [IETF RFC 2732]) . The W3C Character Model specification [CharMod] is also informative in this area.
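As an illustration of the escaping concern in this note, the sketch below uses Python's standard urllib.parse module. The helper names are hypothetical, and the sketch assumes that non-ASCII characters are encoded as UTF-8 before being %-escaped, which is a common convention rather than something this proposal mandates.

```python
from urllib.parse import quote, unquote, urldefrag


def make_uri_reference(resource, fixptr):
    """Attach a FIXptr to a resource URI, escaping where needed.

    ASSUMPTION: characters outside the URI repertoire are encoded as
    UTF-8 and then %-escaped. The FIXptr delimiters '/', '(' and ')'
    are left unescaped so the pointer stays readable.
    """
    return resource + "#" + quote(fixptr, safe="/()")


def split_uri_reference(reference):
    """Return (resource, unescaped FIXptr) for a URI reference."""
    resource, fragment = urldefrag(reference)
    return resource, unquote(fragment)
```

For instance, a hypothetical ID value such as sección cannot appear unescaped in a URI reference; make_uri_reference("footspec.xml", "sección/2(3)") escapes the non-ASCII character, and split_uri_reference recovers the original FIXptr before it is handed to a FIXptr processor.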
If the Name component is omitted, then a FIXptr must start with the init-selector component, which locates the unique element information item in the [document element] property of the infoset's document information item. It is an error if no such element information item exists.
The child-seq component locates an element information item by stepwise navigation using a sequence of unsigned positive integers separated by slashes. For each integer n in the child-seq, the nth element information item in the [children] property of the currently located element information item is located. For the first integer of the child-seq, the currently located element information item is that one located by the Name or init-selector component. It is an error if, for any integer n in the child-seq, there is no nth element information item in the [children] property of the currently located element information item.
Note:
In locating the nth element information item in the [children] property of an element information item, all other information items within the [children] property are ignored and do not affect the counting.
The char-offset component consists of a parenthesis-enclosed unsigned positive integer. For an integer n, the nth character information item in the [children] property of the currently located element information item is located. It is an error if no such character information item exists.
Note:
In locating the nth character information item in the [children] property of an element information item, all other information items within the [children] property are ignored and do not affect the counting.
Note:
The W3C Character Model specification [CharMod] recommends early character normalization. Given that character normalization has occurred prior to or during infoset generation, the process of locating the nth character information item in the [children] property of an element information item is unambiguously defined.
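The semantics of the Name, init-selector, child-seq, and char-offset components can be sketched with Python's xml.etree.ElementTree, used here as a stand-in for an infoset. This is only an approximation under stated assumptions: ElementTree does not expose the [attribute type] property, so the sketch treats any attribute literally named id as ID-typed, and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET


def child_characters(elem):
    """Character children of elem in document order.

    ElementTree keeps them in elem.text and in the .tail of each child
    element; characters nested inside child elements are not counted.
    """
    chars = list(elem.text or "")
    for child in elem:
        chars.extend(child.tail or "")
    return chars


def resolve(root, name, steps, offset=None):
    """Locate the element or character addressed by a parsed FIXptr.

    root is the document element; name is None for the init-selector.
    ASSUMPTION: an attribute named 'id' stands in for an ID-typed one.
    """
    if name is None:
        current = root
    else:
        matches = [e for e in root.iter() if e.get("id") == name]
        if len(matches) != 1:
            raise LookupError(f"expected exactly one element with ID {name!r}")
        current = matches[0]
    for n in steps:
        children = list(current)  # child elements only
        if n > len(children):
            raise LookupError(f"no child element number {n}")
        current = children[n - 1]
    if offset is None:
        return current
    chars = child_characters(current)
    if offset > len(chars):
        raise LookupError(f"no child character number {offset}")
    return chars[offset - 1]
```

Because only child elements are counted in the steps and only child characters in the offset, other kinds of information items do not affect the numbering, as the notes above require.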
This section provides reasons supporting the choices of language scope and design made in this proposal.
The fragment identifier mechanism defined in [IETF RFC 2396] is specifically targeted at the client side of Internet data handling, that is, user agents. A client-side application is handed a whole resource, within which it needs to find the appropriate fragment (subresource) in order to continue its processing. (See C Scenarios for Pointing into XML for more information on this.)
It is likely that many XML-based applications in this position will be lightweight, especially given the trend towards Web-enabled PDAs and mobile phones. A fragment identifier language for XML data, just as for any Internet media type, is likelier to be widely deployed and interoperable if it requires a more modest investment of resources.
Thus, this proposal offers an XML fragment identifier design that can be:
Easily and quickly implemented
Tested easily for conformance
Efficiently implemented on small devices
Implemented using streaming technology
Neatly integrated as a small module into a variety of higher-level applications of varying sophistication
Specifically, this proposal takes the view that any features of a fragment identifier language that can be characterized as general-purpose "querying" for information rather than "pointing" to known information place too much of a burden on applications.
The design offered in this proposal is based on the [Infoset], rather than the XPath data model, for the following reasons:
The Infoset is explicitly designed for the purpose of providing an unambiguous description of XML data, which makes it easy to express precisely what subresource is being pointed to. The choice of the Infoset lets both the specification in this proposal and the higher-level specifications that might refer to it use concise, formal language in describing the desired behavior.
The Infoset natively allows character-by-character access. Locating text content is particularly important for hypertext and annotation, two important motivations for an XML pointer language.
If a document is governed by a schema whose validation process populates the [attribute type] property of the Infoset, element IDs can be recognized even when DTDs are not used.
Every XML document has an infoset, regardless of the document's use of external parsed entities to break up its physical structure. Thus, the Infoset provides a bridge between the world of Internet resources and Internet entity bodies, on the one hand, and the world of XML documents and XML entities, on the other.
Because the Infoset does not define infosets for external parsed entities all by themselves, this proposal accounts only for addressing into Internet media types text/xml and application/xml, not text/xml-external-parsed-entity and application/xml-external-parsed-entity. The relationship between physical Internet resources and logical XML documents needs more study before it can be completely harmonized; until that time, other mechanisms can be used to handle arbitrary external parsed entities.
This proposal takes the view that a very minimal fragment identifier language supports the bulk of pointing needs (with higher-level applications picking up the burden of actually doing something with the XML information so identified). We make the following observations:
Most linking applications, such as Fujitsu's [Fujitsu], use [XPointer] only to point to elements. No real-world examples have been found of XPointers that address attributes, comments, processing instructions, or namespaces.
Linking applications have typically implemented only bare names and (in some cases) child sequences, and not the more sophisticated ways of pointing to elements. SVG [SVG], which borrows from XPointer to define its own fragment identifier language, borrows only two aspects from XPointer: bare names and the equivalent XPath-compatible id function.
XPointers are typically generated, not hand-coded, and generated XPointers tend not to take advantage of built-in knowledge of the schema being used in order to avoid future breakage. The generation algorithm tends to be: use the element's ID; if none, go up to the nearest ancestor, use its ID, then walk down with tumblers; walk down all the way from the root if necessary. (In fact, this algorithm is recommended in the Synthesizing XPointers section of the XLink-to-RDF note [XLink2RDF] for purposes of standardizing on a way to create consistent RDF statements from XLink links.)
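The generation algorithm described in the last observation can be sketched as follows. This is an illustrative Python sketch, not the algorithm text from [XLink2RDF]; the function name is hypothetical, and an attribute named id is assumed to stand in for an ID-typed attribute because ElementTree does not report attribute types.

```python
import xml.etree.ElementTree as ET


def synthesize_fixptr(root, target, id_attr="id"):
    """Build a FIXptr for target: nearest ID at or above it, then tumblers.

    ASSUMPTION: id_attr names an attribute that is ID-typed.
    """
    # ElementTree has no parent pointers, so build a child-to-parent map.
    parent_of = {child: parent for parent in root.iter() for child in parent}
    steps = []
    node = target
    while True:
        ident = node.get(id_attr)
        if ident is not None:
            prefix = ident  # start from the nearest element with an ID
            break
        parent = parent_of.get(node)
        if parent is None:
            if node is not root:
                raise ValueError("target is not inside the given document")
            prefix = "/1"  # walked all the way up to the document element
            break
        # 1-based position among the parent's child elements
        steps.append(list(parent).index(node) + 1)
        node = parent
    return prefix + "".join(f"/{n}" for n in reversed(steps))
```

The result uses only the bare name and child sequence forms, so it needs no knowledge of the schema in use.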
These observations suggest strongly that the main requirement is to address elements, which is not surprising since elements are the primary objects in XML documents. This proposal keeps the two most well-supported ways to point to elements in XPointer, using syntax identical to XPointer's bare name and child sequence short forms.
The next most important class of objects in XML documents is the text contained inside elements. The need for pointing to pieces of text can be demonstrated by the following observations:
Most HTML links have, as their starting resource (using XLink terminology), a word or phrase that is undistinguished except for the <A HREF=> markup itself.
XLink provides an important facility, third-party links, that allows such words and phrases to be identified remotely, even in the absence of surrounding markup.
This proposal provides a way to point to individual characters using the simple mechanism of counting characters that are children of the current element, analogous to counting elements in child sequences.
See B.4 Doing Ranges with FIXptr for information on "pointing" to ranges of characters.
Designing an XML fragment identifier language to point to elements and characters seems the right 80/20 point given the goals stated in B.1 The Role of Fragment Identifiers. Putting the burden of potentially unbalanced or disjoint ranges, multiple targets, and more complex "query-like" capabilities onto higher-level applications seems the better architectural solution.
While the design proposed here does not natively support ranges, a pair of pointers could describe any arbitrary range, and higher-level applications could interpret these pairs and interpolate range content as appropriate.
For example, the following hypothetical schema fragment for a modified XLink would allow pointers (in URI references) to occur in pairs in an xlink:href attribute:
<xsd:simpleType name="listOfURIRefs">
  <xsd:list itemType="uriReference" />
</xsd:simpleType>
...
<xsd:simpleType name="XLinkHref">
  <xsd:annotation>
    <xsd:documentation xml:lang="en">
      All character info items between the first and last,
      inclusive, when a pair is provided
    </xsd:documentation>
  </xsd:annotation>
  <xsd:restriction base="listOfURIRefs">
    <xsd:minLength value="1" />
    <xsd:maxLength value="2" />
  </xsd:restriction>
</xsd:simpleType>
...
<xsd:attribute name="href" type="XLinkHref" />
...
For this example, assume the following target document, doc.xml:
<?xml version="1.0"?>
<doc>
<p>Click here to go elsewhere.</p>
</doc>
An XLink extended link could make the word "here" a starting resource as follows:
<extended-link
  xmlns:xlink="http://www.w3.org/1999/xlink"
  xmlns:ixlink="http://www.w3.org/imaginary/xlink"
  xlink:type="extended"
  xml:base="http://www.example.com/">
  <starting-resource
    xlink:type="locator"
    ixlink:href="doc.xml#/1/1(7) doc.xml#/1/1(10)"
    xlink:label="start" />
  <ending-resource
    xlink:type="locator"
    ixlink:href="http://elsewhere.xml"
    xlink:label="end" />
  <arc
    xlink:type="arc"
    xlink:from="start"
    xlink:to="end"
    xlink:show="replace"
    xlink:actuate="user" />
</extended-link>
In this case, each pointer in the pair identifies a different character inside the same element. However, pairs of pointers could also be used to describe arbitrary ranges, both balanced and unbalanced. Higher-level applications might very well want to impose different rules for each pointer in the pair. For example, XInclude might want to insist that the pointers point to elements only (not characters), and that the elements be siblings of each other. XLink might not want to impose any constraints.
Higher-level applications that choose to allow pairs of pointers will also need to specify normatively which information items "between" and/or "around" the ones identified are to be included in processing. For example, since XInclude performs an infoset-to-infoset transformation, it might want to select all information items of all types between the two identified elements, inclusive. If XLink allows unbalanced pointer pairs, it will need to specify a somewhat complex algorithm for identifying the selected information items, or alternatively it could specify a covering range-style algorithm.
Pushing this calculation off to higher-level applications allows FIXptr to remain simple and allows the individual applications to specify exactly the constraints and semantics they desire.
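To make the division of labor concrete, here is a rough Python sketch of how a higher-level application might interpret a pair of character pointers that fall in the same element, as in the doc.xml example. The function names are hypothetical, only pointers of the form /1.../n(k) are handled, and the rule "all characters between the two, inclusive" is this sketch's assumption about what such an application would specify.

```python
import re
import xml.etree.ElementTree as ET

# Only init-selector pointers that end in a char-offset are handled here.
POINTER = re.compile(r"/1((?:/[1-9][0-9]*)*)\(([1-9][0-9]*)\)")


def locate(root, pointer):
    """Return (element, 1-based character offset) for a pointer."""
    match = POINTER.fullmatch(pointer)
    if match is None:
        raise ValueError(f"unsupported pointer: {pointer!r}")
    elem = root
    for step in match.group(1).split("/")[1:]:
        elem = list(elem)[int(step) - 1]  # IndexError if no such child
    return elem, int(match.group(2))


def child_characters(elem):
    """Character children of elem (elem.text plus child tails)."""
    chars = list(elem.text or "")
    for child in elem:
        chars.extend(child.tail or "")
    return chars


def characters_between(root, start, end):
    """Characters from start to end, inclusive, within one element."""
    start_elem, first = locate(root, start)
    end_elem, last = locate(root, end)
    if start_elem is not end_elem:
        # Unbalanced pairs need application-specific rules; not sketched.
        raise NotImplementedError("pointers are in different elements")
    chars = child_characters(start_elem)
    if first > last or last > len(chars):
        raise LookupError("character range does not exist")
    return "".join(chars[first - 1:last])
```

Applied to doc.xml, the pair /1/1(7) and /1/1(10) selects the four characters of the word "here". Pairs that span different elements are exactly the case where each application would need to state its own rules.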
As defined in [IETF RFC 2396], an Internet media type identifies a certain class of resources, and resources of that type can have a fragment identifier language associated with them. HTML is the only widely supported Internet media type that has a fragment identifier language, the familiar #name construction. Since HTML's primary associated application is the browser, pointers into HTML fragments tend to appear mostly in HTML documents, and they tend to be used mostly for the hypertext scenario: You point into a particular location in an HTML file because you want the browser to scroll to that point before displaying the page.
However, it is possible to separate the hyperlinking and pointing functions, as XLink and XPointer have demonstrated. This is important because XML processing is likely to include many applications that do not do hyperlinking but nonetheless want to do pointing (such as XInclude).
Following are some examples of ways in which various applications might interact with FIXptr processors. They all assume the following target XML document, in a file called footspec.xml:
<?xml version="1.0"?>
<!DOCTYPE spec [
<!ATTLIST issue id ID #REQUIRED>
]>
<spec>
<title>Specification for the Footwear Manufacturers' Markup Language</title>
<div1><title>Introduction</title>
<p>In this introudction, we list the scope of FMML:</p>
<ulist>
<li><p>Footwear sizes</p></li>
<li><p>Footwear prices</p></li>
<li><p>Footwear colors</p></li>
</ulist>
<issue id="scope-update">check this list against the charter!</issue>
</div1>
...
</spec>
An HTML document contains a link to the introduction of the FMML specification that looks like this:
<P>The <A HREF="footspec.xml#/1/2">introduction to FMML</A> describes the scope of that language.</P>
The pointing is accomplished with a child sequence, because the section in question doesn't have an ID on it. If the link were into an HTML document, there would have had to be an <A NAME=> available.
Also notable is that a pointer into an equivalent HTML document wouldn't have been able to point to the desired content, only a position near the content of interest, in order to get user agents to scroll to that position. In this case, the actual content (the element containing the whole section) is identified.
To date, HTML hyperlinking/pointing has tended to be implemented as a monolithic system, rather than as a modular part of Internet architecture. Thus, browsers are probably not prepared to handle HTML hyperlinks containing URIs-plus-fragment identifiers to any media types other than HTML itself. In the absence of any HTML specification that describes proper behavior for hyperlinking into XML, browsers that support both HTML and XML could implement some kind of default behavior on encountering an HTML link into an XML document, which has the option of being a bit more sophisticated than just scrolling -- for example, highlighting or typographically bracketing the section.
Note that it is not the FIXptr processor that does the highlighting or scrolling; this would be done by the browser application, which has been directed to the element information item of interest by the FIXptr processor.
XInclude is used to populate an FMML "issue list" document with all the issue elements that appear in the actual specification.
<?xml version="1.0"?>
<issues-list xmlns:xinclude="http://www.w3.org/1999/XML/xinclude">
<title>FMML Issues</title>
<p>The following are open issues. Please be prepared to discuss
them at the next FMML meeting.</p>
<xinclude:xinclude href="footspec.xml#scope-update" />
...
</issues-list>
Because issue elements can be counted on to have IDs, the corresponding FIXptrs can use them, and the issues list will typically need to be updated only when issues are added or deleted.
Here, it is the XInclude application that provides the transformation capability. The FIXptr processor is handed the infoset for the whole specification document each time an XInclude instruction is encountered, and it merely locates each of the desired elements in turn.
A specialized annotation language is used to record a reviewer's corrections to the specification (using a fictitious Review Tool), so that the editor can use the corresponding fictitious Update Tool application to accept or reject them. The annotations look like this:
<annotations>
  <annotation>
    <problem-loc>
      <loc-start>footspec.xml#/1/2/2(9)</loc-start>
      <loc-end>footspec.xml#/1/2/2(20)</loc-end>
    </problem-loc>
    <replace-with>introduction</replace-with>
  </annotation>
  ...
</annotations>
Here, the URI references in which the FIXptrs appear are in element content because that's how the Update Tool and its markup language happen to work.
The Update Tool might require that both FIXptrs in the pair specify character information items and not element information items, because reviewers are allowed to suggest only textual changes and not markup changes. However, the Update Tool semantics might still allow unbalanced pairs, with an interpretation of ranges that corresponds to all elements "touched," plus all the characters to the "right" of the first FIXptr in the pair and to the "left" of the second. (In practice, a much more precise accounting of these semantics and the Update Tool interface would be needed!)
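As a rough illustration of what the fictitious Update Tool might do once the FIXptr processor has located the paragraph, the Python sketch below replaces a run of character children in place. The function name is hypothetical, and the sketch covers only the simple balanced case in which both offsets fall in the same uninterrupted run of text; it says nothing about how unbalanced pairs would be handled.

```python
import xml.etree.ElementTree as ET


def replace_characters(elem, first, last, replacement):
    """Replace character children first..last (1-based, inclusive).

    ASSUMPTION: the whole range lies in one text run, that is, no child
    element starts or ends between the two offsets.
    """
    if first > last:
        raise ValueError("start offset follows end offset")
    # ElementTree stores character children in elem.text and in the
    # .tail of each child element.
    runs = [(elem, "text")] + [(child, "tail") for child in elem]
    start = 0  # number of character children before the current run
    for holder, attr in runs:
        text = getattr(holder, attr) or ""
        end = start + len(text)
        if start < first and last <= end:
            i, j = first - 1 - start, last - start
            setattr(holder, attr, text[:i] + replacement + text[j:])
            return
        start = end
    raise LookupError("range is missing or spans a child element")
```

With the paragraph located by /1/2/2 in footspec.xml, replace_characters(p, 9, 20, "introduction") swaps the twelve misspelled characters for the reviewer's correction.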
There is only one implementation of XPointer that is known to be complete, and no browser vendors have adopted it. Most implementations are partial: either bare names (sometimes with child sequences), or plain XPath without the XPointer data model extensions.
Since new technologies tend to be deployed to user agents over a period of several years, a modest design seems much more likely to be adopted widely and implemented correctly than an ambitious one. A reduced XPointer may finally attract vendors who want to claim conformance.
To allow for element naming (so that you could, for example, locate the fifth p element, not just the fifth element overall) would require quite a lot more infrastructure to accommodate namespace-qualified elements, without providing equivalent benefit. For example, without specialized schema knowledge, naming elements is no more likely to protect machine-generated addresses against pointer breakage. Thus, it seems to fall beyond the 80/20 point.
Ranges are an interesting and important scenario, as demonstrated in the examples above, and they are useful for many different applications that operate on URI references.
However, building ranges directly into a fragment identifier language makes quite a processing demand on user agents because an interpolation must happen in order to pick up the content between the starting and ending points of the range; not every piece of content is literally pointed to. It is better to push this processing off to higher layers if possible.
Furthermore, different applications might desire different behavior when start/end points of a range are specified. For example:
<p>A <em>big</em> tree.</p>
The word "A" is at character information item position 1, and the letter "t" is at character information item position 4. If you specify (using any syntax) a range between characters 1 and 4, you face various choices as to how to handle the subelement em and its content, and the "right" answer may differ depending on the type of application and the kinds of information items it cares about. It's certainly difficult to pick a single interpretation that will be intuitive to everybody! Since (as explained in B.4 Doing Ranges with FIXptr and shown in C Scenarios for Pointing into XML) it is possible for any application to add its own customized range support and define it unambiguously in terms of the Infoset, it seems better to hold off on adding this feature to the fragment identifier language. If the Infoset ever finds it useful to define some common notion of "range," that would probably be the time to include it in a pointer language.
Does XLink's href attribute have to change to allow for pairs of pointers, as shown above, if XPointer's scope is reduced?
No. If XLink did not change to allow for pairs of pointers, it could not point to any ranges, but it could still point to elements, and it could point to single characters, which is enough (barely) for an application to attach third-party linking behavior to. (Of course, XLink simple-type and resource-type elements can still be used to surround any range of XML content desired.)
XPointer's current lack of deployment means that XLink applications can't rely on full XPointer support right now anyway. And most known XLink applications (which tend to be enterprise-wide, not Web-wide) have chosen to implement the XPointer subset corresponding to the FIXptr feature set or even slightly less, so they would not be missing the additional features.
If reducing XPointer's feature set could entice vendors to support it, then XLink, as a Web-wide XML-based hypertext language, would in practical terms be ahead of where it is now, even if it did not change to accommodate pairs. This is generally true of every existing application that can make interesting use of XML subresources, not just XLink.
This proposal was encoded in the XMLspec DTD [XMLspec]. The HTML version was produced with the corresponding XSLT stylesheet.