Fragment Identifier for XML (FIXptr)

Proposal 25 April 2001

This version:
Paul Grosso, Arbortext
Eve Maler, Sun Microsystems
Norman Walsh, Sun Microsystems


XPointer [XPointer] is intended to be the definition of the fragment identifier language for the four main XML-related Internet media types [IETF RFC 3023]. As such, it has to be parsed whenever there is a fragment identifier on a URI that refers to an XML resource. Because fragment identifier processing is such a basic part of XML on the Web and needs to be efficiently supportable on a wide range of devices and applications, this proposal suggests a small subset of XPointer functionality, called FIXptr, for this purpose.

Status of this Document

This document has no official status at this time. It is a proposal offered for consideration as a replacement or minimal conformance level for XPointer. (Note that the XML Linking Working Group took up this question on 19 April 2001 and voted down the proposal.)

B Rationale describes the reasoning behind the proposed design.

Table of Contents

1 Introduction
2 Terminology
3 Conformance
4 FIXptr Language and Processing
    4.1 Syntax
    4.2 Informal Semantics
    4.3 Semantics in Terms of the Infoset
        4.3.1 Name Component
        4.3.2 Initial Child Component
        4.3.3 Child Component
        4.3.4 Character Offset Component
        4.3.5 Pointer Pair


A References
    A.1 Normative References
    A.2 Informative References
B Rationale (Non-Normative)
    B.1 The Role of Fragment Identifiers
    B.2 Data Model
    B.3 Feature Set
    B.4 Doing Ranges with FIXptr
C Scenarios for Pointing into XML (Non-Normative)
    C.1 HTML Hyperlinking into an XML Document
    C.2 Compiling Issues into an Issue List
    C.3 Annotating a Document with Corrections
D Production Notes (Non-Normative)

1 Introduction

XPointer [XPointer] is intended to be the definition of the fragment identifier language for the four main XML-related Internet media types [IETF RFC 3023]. As such, it has to be parsed whenever there is a fragment identifier on a URI that refers to an XML resource. Because fragment identifier processing is such a basic part of XML on the Web and needs to be efficiently supportable on a wide range of devices and applications, this proposal suggests a small subset of XPointer functionality, called FIXptr, for this purpose.

This proposal specifies the addressing of any single element or character in the resource, or a pair of elements and/or characters. Addressing is done with three constructs:

  1. an optional initial ID,

  2. a "child sequence" or "tumbler" that gives a path that descends the element tree,

  3. and an optional terminal character offset.

The language defined in this proposal offers a subset of the features that XPointer offers. To aid in clear description, this language is given a different name: Fragment Identifier for XML (FIXptr). FIXptr is specified in terms of the XML Information Set [Infoset].

B Rationale describes the reasoning behind the proposed design. C Scenarios for Pointing into XML provides some examples of how it would work.

2 Terminology

[Definition: The key words must, must not, required, shall, shall not, should, should not, recommended, may, and optional in this specification are to be interpreted as described in [IETF RFC 2119].]

[Definition: pointer]

A FIXptr string that locates a single information item. This proposal normatively defines the syntax of pointers.

[Definition: pointer pair]

A FIXptr-conforming string containing two pointers that each locate a single information item. This proposal normatively defines the syntax of pointer pairs.

[Definition: FIXptr processor]

A software component that takes as input an infoset and a FIXptr pointer or pointer pair and produces as output an identification of exactly one or two information items in that infoset. This proposal normatively defines the behavior of a FIXptr processor.

[Definition: application]

A software component that incorporates or uses a FIXptr processor because it needs to access XML resources by means of URI references. The occurrence and usage of URI references are governed by the definition of each application's corresponding data format (which could be XML-based or non-XML-based). For example, both HTML Web browsers and XInclude [XInclude] processors are applications.

[Definition: error]

A violation of the rules of this specification; results are undefined.

[Definition: infoset]

The information set of a document as defined by the XML Information Set specification [Infoset].

[Definition: information item]

An item of information in a document's infoset.

[Definition: context item]

The information item that is located by one component of a pointer, and relied on as the context for locating the next (rightward) component of the pointer.

3 Conformance

FIXptr processing normatively depends on [IETF RFC 2396] (as updated by [IETF RFC 2732]) processing, including character escaping as defined in these RFCs.

FIXptr processing normatively expects as input infosets that have at least the following information items and properties:

It is an error if a FIXptr string does not adhere to the syntactic requirements described in this specification.

A fragment identifier conforms to this specification if it does not satisfy any of the conditions for being in error as described in this specification.

Conforming FIXptr processors should report errors back to the application. Applications may terminate or recover from FIXptr errors in any way they choose.

4 FIXptr Language and Processing

This section describes the FIXptr language and the behavior of FIXptr processors.

4.1 Syntax

All FIXptr strings match the following production (where Name is as defined in the XML Recommendation [XML]):

[1]    fixptr    ::=    ptr (',' ptr)?
[2]    ptr    ::=    (Name | init-child) child* char-offset?
[3]    init-child    ::=    '/1'
[4]    child    ::=    '/' [1-9] [0-9]*
[5]    char-offset    ::=    '(' [1-9] [0-9]* ')'

In other words, the name, child sequence, and character offset are all optional, but if the name is omitted, the child sequence must be non-null and must start with /1.
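As a non-normative illustration, the productions above can be checked with a regular expression. The following Python sketch is not part of the proposal: the function name is_fixptr is hypothetical, and the NAME pattern is a deliberately simplified stand-in for the full XML Name production.

```python
import re

# Assumption: a simplified approximation of XML's Name production
# (ASCII letter, '_' or ':' first; then word characters, '.', '-' or ':').
NAME = r"[A-Za-z_:][\w.\-:]*"

# [2] ptr ::= (Name | init-child) child* char-offset?
PTR = rf"(?:{NAME}|/1)(?:/[1-9][0-9]*)*(?:\([1-9][0-9]*\))?"

# [1] fixptr ::= ptr (',' ptr)?
FIXPTR = re.compile(rf"{PTR}(?:,{PTR})?")


def is_fixptr(candidate: str) -> bool:
    """Return True if the whole string matches the fixptr production."""
    return FIXPTR.fullmatch(candidate) is not None


if __name__ == "__main__":
    for text in ["intro/3/1/4(6)", "/1/5/12,/1/5/18", "/2/3", "(3)"]:
        print(text, is_fixptr(text))
```

Under these assumptions, a string such as /2/3 is rejected because a pointer without a name has to begin with /1, and (3) is rejected because a character offset cannot stand alone.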

4.2 Informal Semantics

When the FIXptr string contains a single pointer, the optional initial Name locates the element in the document that has an ID-typed attribute whose value is the given name. The following child sequence locates an element by stepwise navigation using a sequence of integers separated by slashes (/); each integer n locates the nth child element of the previously located element. The final optional parenthesized integer gives a character offset; an offset n locates the nth child character node of the previously located element.

For example, the pointer intro/3/1/4(6) identifies a character as follows: It first locates the element with an ID attribute that has the value "intro", then locates the third child element of the "intro" element, then locates that element's first child element, then that element's fourth child element, then that element's sixth character child. Note that a pointer consisting of just a name provides, for resources with XML media types, an analog of HTML fragment identifier behavior.

When the string contains a pointer pair, each pointer identifies a single element or character as described above. For example, the pointer pair /1/5/12,/1/5/18 identifies the twelfth and eighteenth child elements inside the fifth child element inside the root element of the document. Applications may use the located content as anchors for deriving a larger selection of content (such as a "range" of seven elements encompassing the twelfth through the eighteenth inclusive) on which they then operate.
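The range derivation mentioned above is an application-level choice, not something FIXptr defines. As one possible reading, the Python sketch below derives an inclusive run of siblings from a pointer pair, assuming both pointers start with /1 and locate children of the same parent. The helper names locate and sibling_range, and the sample document built at the end, are hypothetical.

```python
import xml.etree.ElementTree as ET


def locate(root, pointer):
    """Resolve a pointer of the form /1/n/m... against the document element."""
    steps = [int(s) for s in pointer.split("/")[1:]]
    if not steps or steps[0] != 1:
        raise ValueError("this sketch handles only pointers starting with /1")
    elem = root
    for n in steps[1:]:
        children = list(elem)
        if not 1 <= n <= len(children):
            raise LookupError(f"no child element number {n}")
        elem = children[n - 1]
    return elem


def sibling_range(root, pair):
    """Return the siblings from the first located element through the second."""
    first_ptr, last_ptr = pair.split(",")
    locate(root, first_ptr)  # both ends must exist before a range is derived
    locate(root, last_ptr)
    first_parent, _, i = first_ptr.rpartition("/")
    last_parent, _, j = last_ptr.rpartition("/")
    if first_parent != last_parent:
        raise ValueError("this sketch derives ranges only between siblings")
    return list(locate(root, first_parent))[int(i) - 1:int(j)]


# Hypothetical document: five sections, the fifth holding twenty items.
root = ET.Element("doc")
for _ in range(5):
    sec = ET.SubElement(root, "sec")
for k in range(1, 21):
    ET.SubElement(sec, "item", n=str(k))

selected = sibling_range(root, "/1/5/12,/1/5/18")
print(len(selected), [e.get("n") for e in selected])
```

A different application could just as legitimately treat the same pair as two independent anchors and derive no range at all.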

4.3 Semantics in Terms of the Infoset

A FIXptr string containing a pointer or a pointer pair specifies, respectively, one or two references into the content of a Web resource of Internet media type text/xml or application/xml using a URI reference [IETF RFC 2396], by serving as a fragment identifier for such a resource.

A FIXptr processor takes as input the infoset of the identified resource and a FIXptr string. It is an error for a FIXptr fragment identifier to appear in a URI reference that specifies a resource that does not have an XML infoset.

A FIXptr processor produces as output an identification of exactly one or two element information items and/or character information items in that infoset. For each pointer supplied as input, the information item identified is the item located by the final (rightmost) component of the pointer.

The rest of this section describes just which information items are identified by a given FIXptr string.


An application may use a data model that is richer than or different from the Infoset. For example, an editing application might interpret a pointer as corresponding to a cursor position between two elements, or a pointer pair as corresponding to a "user selection." In this situation, it is recommended that the normative documentation for an application specify the relationship between information items and the constructs of interest.

4.3.1 Name Component

The Name component locates the unique element information item in the input infoset whose [attributes] property (which is an unordered set of attribute information items) contains an attribute information item whose [attribute type] property has the value "ID" and whose [normalized value] property matches the given Name. It is an error if no such element information item exists or if more than one such element information item exists.
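The uniqueness rule can be illustrated with a short, non-normative Python sketch. Python's xml.etree.ElementTree does not expose the [attribute type] property, so the sketch assumes the caller already knows which attribute names are of type ID; the function name locate_by_id and the id_attributes parameter are hypothetical, and values are compared as the parser reports them.

```python
import xml.etree.ElementTree as ET


def locate_by_id(root, name, id_attributes=("id",)):
    """Return the single element whose ID-typed attribute equals name.

    Assumption: id_attributes lists the attribute names known to be of
    type ID, since ElementTree does not report attribute types.
    """
    matches = [
        elem for elem in root.iter()
        if any(elem.get(attr) == name for attr in id_attributes)
    ]
    if len(matches) != 1:
        # Zero matches and multiple matches are both errors.
        raise LookupError(f"{len(matches)} elements have the ID {name!r}")
    return matches[0]


root = ET.fromstring(
    '<spec><section id="intro"/><section id="scope"/><note id="scope"/></spec>'
)
print(locate_by_id(root, "intro").tag)
```

In this sample, "intro" resolves to one element, while "scope" is an error because two elements carry that value.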


Since the Name component could be any string that satisfies XML's Name production, it might need to contain characters that cannot be represented directly in the current document's encoding and/or that are not allowed to be represented in unescaped form in a URI reference. Therefore, though a string does satisfy the syntax requirements for FIXptr, it might not be usable in its unescaped form as a fragment identifier in some circumstances. URI reference escaping methods are described in [IETF RFC 2396] (as updated by [IETF RFC 2732]). The W3C Character Model specification [CharMod] is also informative in this area.
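For illustration only, the following Python sketch escapes a FIXptr string for use as a fragment identifier and reverses the escaping. It assumes that non-ASCII characters are converted to UTF-8 before being percent-escaped, which is a choice of this sketch rather than a requirement stated in this proposal; the function name escape_fragment is hypothetical.

```python
from urllib.parse import quote, unquote


def escape_fragment(fixptr: str) -> str:
    """Percent-escape a FIXptr string for use after '#' in a URI reference.

    Assumption: characters are encoded as UTF-8 before escaping. The
    FIXptr delimiters '/', '(', ')' and ',' are left unescaped.
    """
    return quote(fixptr, safe="/(),")


escaped = escape_fragment("r\u00e9sum\u00e9/2(3)")
print(escaped)           # the escaped form used in the URI reference
print(unquote(escaped))  # the FIXptr string handed to the processor
```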

4.3.2 Initial Child Component

If the Name component is omitted, then a pointer must start with the init-child component, which locates the unique element information item in the [document element] property of the infoset's document information item. It is an error if no such element information item exists.

4.3.3 Child Component

Each child component locates an element information item by stepwise navigation using an unsigned positive integer preceded by a slash. For each integer n appearing in a child sequence component, the nth element information item in the [children] property of the currently located element information item (the context item) is located. For the integer in the first child component, the context item is the one located by the Name or init-child component. It is an error if, for any integer n in the sequence, there is no nth element information item in the [children] property of the context item.


In locating the nth element information item in the [children] property of an element information item, all other information items within the [children] property are ignored and do not affect the counting.
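The counting rule can be sketched as follows. This is a non-normative Python illustration using xml.etree.ElementTree; the function name walk_children and the sample document are hypothetical. Text, comments, and processing instructions among the children do not affect the count.

```python
import xml.etree.ElementTree as ET


def walk_children(context, steps):
    """Follow a child sequence, counting only element children at each step."""
    for n in steps:
        # Keep elements only; comment and PI nodes (if a parser retains
        # them) have a non-string tag and are skipped.
        elements = [child for child in context if isinstance(child.tag, str)]
        if not 1 <= n <= len(elements):
            raise LookupError(f"no child element number {n}")
        context = elements[n - 1]
    return context


root = ET.fromstring(
    "<spec><!-- note --><title>T</title>text"
    "<section><?pi data?><p>one</p><p>two</p></section></spec>"
)
# Equivalent to the child components /2/2 applied to the document element.
print(walk_children(root, [2, 2]).text)
```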

4.3.4 Character Offset Component

The char-offset component consists of an unsigned positive integer enclosed in parentheses. For an integer n, the nth character information item in the [children] property of the context item is located. It is an error if no such character information item exists.


In locating the nth character information item in the [children] property of an element information item, all other information items within the [children] property are ignored and do not affect the counting.


The W3C Character Model specification [CharMod] recommends early character normalization. Given that character normalization has occurred prior to or during infoset generation, the process of locating the nth character information item in the [children] property of an element information item is unambiguously defined.
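As a non-normative sketch of character counting, the Python code below treats an element's character children as its leading text plus the text that follows each child element, which is how xml.etree.ElementTree exposes them (as the text and tail attributes). The function names are hypothetical, and the sketch assumes that one Python string character corresponds to one character information item.

```python
import xml.etree.ElementTree as ET


def character_children(elem):
    """Characters that are direct children of elem, in document order."""
    chars = list(elem.text or "")
    for child in elem:
        # Text inside the child element is not counted; only the text
        # that follows it (its tail) belongs to elem.
        chars.extend(child.tail or "")
    return chars


def locate_char(elem, n):
    """Return the nth character child of elem (counting from 1)."""
    chars = character_children(elem)
    if not 1 <= n <= len(chars):
        raise LookupError(f"no character child number {n}")
    return chars[n - 1]


p = ET.fromstring("<p>ab<b>XY</b>cd</p>")
print(locate_char(p, 3))
```

In this sample the third character child of p is "c", because "X" and "Y" are children of the b element, not of p.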

4.3.5 Pointer Pair

In the case of a pointer pair, the FIXptr string identifies two information items, one for each pointer.


An application may define pointer pair constraints (application-level errors) beyond the syntax imposed by this proposal. An application may also derive from the pointer pair a set of one or more information items or other constructs (such as a "range" based on the pointer pair as "starting" and "ending" points) on which the application will operate. It is recommended that the normative documentation for an application or its corresponding data format specify the constraints it imposes and its intended derivation.

A References

A.1 Normative References

[XML] Tim Bray, Jean Paoli, C.M. Sperberg-McQueen, and Eve Maler, editors. Extensible Markup Language (XML) 1.0 (Second Edition). World Wide Web Consortium, 2000.
[Infoset] John Cowan and Richard Tobin, editors. XML Information Set. World Wide Web Consortium, 2001.
[IETF RFC 2119] RFC 2119: Key words for use in RFCs to Indicate Requirement Levels. Internet Engineering Task Force, 1997.
[IETF RFC 2396] RFC 2396: Uniform Resource Identifiers. Internet Engineering Task Force, 1998.
[IETF RFC 2732] RFC 2732: Format for Literal IPv6 Addresses in URL's. Internet Engineering Task Force, 1999.

A.2 Informative References

[CharMod] Martin J. Duerst et al., editors. Character Model for the World Wide Web 1.0. World Wide Web Consortium, 2001.
[Fujitsu] Fujitsu XLink Processor description, 2000.
[IETF RFC 3023] RFC 3023: XML Media Types. Internet Engineering Task Force, 2001.
[SVG] Jon Ferraiolo, editor. Scalable Vector Graphics (SVG) 1.0 Specification. World Wide Web Consortium, 2000. The Linking chapter has information on SVG's fragment identifier language.
[XInclude] Jonathan Marsh and David Orchard, editors. XML Inclusions (XInclude) Version 1.0. World Wide Web Consortium, 2000.
[XLink2RDF] Ron Daniel, editor. Harvesting RDF Statements from XLinks. World Wide Web Consortium, 2000.
[XMLspec] Eve Maler and Norm Walsh, maintainers. XMLspec DTD and stylesheets. World Wide Web Consortium, 2001.
[XPointer] Steve DeRose, Eve Maler, and Ron Daniel, editors. XML Pointer Language (XPointer) Version 1.0. World Wide Web Consortium, 2001.

B Rationale (Non-Normative)

This section provides reasons supporting the choices of language scope and design made in this proposal.

B.1 The Role of Fragment Identifiers

The fragment identifier mechanism defined in [IETF RFC 2396] is specifically targeted at the client side of Internet data handling, that is, user agents. A client-side application is handed a whole resource, within which it needs to find the appropriate fragment (subresource) in order to continue its processing. (See C Scenarios for Pointing into XML for more information on scenarios where this action is desirable.)

It is likely that many XML-based applications in this position will be lightweight, especially given the trend towards Web-enabled PDAs and mobile phones. A fragment identifier language for XML data, just as for any Internet media type, is likelier to be widely deployed and interoperable if it requires a more modest investment of resources.

Thus, this proposal offers an XML fragment identifier design that can be:

  • Easily and quickly implemented

  • Tested easily for conformance

  • Efficiently implemented on small devices

  • Implemented using streaming technology

  • Neatly integrated as a small module into a variety of higher-level applications of varying sophistication

Specifically, this proposal takes the view that any features of a fragment identifier language that can be characterized as general-purpose "querying" for information rather than "pointing" to known information place too much of a burden on FIXptr processors and applications that depend on them.

It is worth noting that there is only one implementation of XPointer that is known to be complete, and no vendors have adopted and deployed it. Most implementations are partial: either bare names (sometimes with child sequences), or plain XPath without the XPointer data model extensions. No vendor today claims XPointer conformance, four years into XPointer's development. Since new technologies tend to be deployed to user agents over a period of several years, a modest design seems much more likely to be widely adopted and correctly implemented than an ambitious one.

B.2 Data Model

The design offered in this proposal is based on the [Infoset], rather than the XPath data model, for the following reasons:

  • The Infoset is explicitly designed for the purpose of providing an unambiguous description of XML data, which makes it easy to express precisely what subresource is being pointed to. The choice of the Infoset lets both this proposal and higher-level application specifications use concise, formal language in describing the desired behavior.

  • The Infoset natively allows character-by-character access. Locating text content is particularly important for hypertext and annotation, two important motivations for an XML pointer language.

  • If a document is governed by a schema whose validation process populates the [attribute type] property of the Infoset, element IDs can be recognized even when DTDs are not used.

  • Every XML document has an infoset, regardless of the document's use of external parsed entities to break up its physical structure. Thus, the Infoset provides a bridge between the world of Internet resources and Internet entity bodies, on the one hand, and the world of XML documents and XML entities, on the other.

Because the Infoset does not define infosets for external parsed entities all by themselves, this proposal accounts only for addressing into Internet media types text/xml and application/xml, not text/xml-external-parsed-entity and application/xml-external-parsed-entity. The relationship between physical Internet resources and logical XML documents needs more study before it can be completely harmonized; until that time, other mechanisms can be used to handle arbitrary external parsed entities.

B.3 Feature Set

This proposal takes the view that a very minimal fragment identifier language supports the bulk of pointing needs. We make the following observations:

  • Most linking applications, such as Fujitsu's [Fujitsu], use [XPointer] only to point to elements. No real-world examples have been found of XPointers that address attributes, comments, processing instructions, or namespaces.

  • Linking applications have typically implemented only bare names and (in some cases) child sequences, and not the more sophisticated ways of pointing to elements. SVG [SVG], which borrows from XPointer to define its own fragment identifier language, borrows only two aspects from XPointer: bare names and the equivalent XPath-compatible id function.

  • XPointers are typically generated, not hand-coded, and generated XPointers tend not to take advantage of built-in knowledge of the schema being used in order to avoid future breakage. The generation algorithm tends to be: use the element's ID; if none, go up to the nearest ancestor, use its ID, then walk down with tumblers; walk down all the way from the root if necessary. (In fact, this algorithm is recommended in the Synthesizing XPointers section of the XLink-to-RDF note [XLink2RDF] for purposes of standardizing on a way to create consistent RDF statements from XLink links.)
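The generation algorithm described in the last observation can be sketched in a few lines. The following Python code is a non-normative illustration that emits FIXptr-style pointers; the function name synthesize_pointer is hypothetical, and the sketch assumes that an attribute named id is of type ID and that its values are unique in the document.

```python
import xml.etree.ElementTree as ET


def synthesize_pointer(root, target, id_attr="id"):
    """Build a pointer to target: nearest ID (own or ancestor's), then tumblers.

    Assumption: id_attr names an ID-typed attribute with unique values.
    """
    # ElementTree has no parent links, so build a child-to-parent map.
    parents = {child: parent for parent in root.iter() for child in parent}
    steps = []
    node = target
    while True:
        ident = node.get(id_attr)
        if ident is not None:
            prefix = ident            # start from the nearest ID
            break
        parent = parents.get(node)
        if parent is None:
            prefix = "/1"             # reached the document element
            break
        steps.append(list(parent).index(node) + 1)
        node = parent
    return prefix + "".join(f"/{n}" for n in reversed(steps))


root = ET.fromstring(
    '<spec><title/><section id="intro"><p/><ul><li/><li/></ul></section>'
    "<section><p/></section></spec>"
)
print(synthesize_pointer(root, root[1][1][1]))  # an li below the "intro" section
print(synthesize_pointer(root, root[2][0]))     # a p with no ID on any ancestor
```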

These observations suggest strongly that the main requirement is to address elements, which is not surprising since elements are the primary objects in XML documents. This proposal keeps the two most well-supported ways to point to elements in XPointer, using syntax identical to XPointer's bare name and child sequence short forms. It has been suggested to allow for element naming (so that you could, for example, locate the fifth p element, not just the fifth element overall). However, this would require quite a lot more infrastructure to accommodate namespace-qualified elements, without providing equivalent benefit. For example, without specialized schema knowledge, naming elements is no more likely to protect machine-generated addresses against pointer breakage. Thus, it seems to fall beyond the 80/20 point.

The next most important class of objects in XML documents is the text contained inside elements. The need for pointing to pieces of text can be demonstrated by the following observations:

  • Most HTML links have, as their starting resource (using XLink terminology), a word or phrase that is undistinguished except for the <A HREF=> markup itself.

  • XLink provides an important facility, third-party links, that allows such words and phrases to be identified remotely, even in the absence of surrounding markup.

This proposal provides a way to point to individual characters using the simple mechanism of counting characters that are children of the current element, analogous to counting elements in child sequences.

See B.4 Doing Ranges with FIXptr for information on "pointing" to ranges of characters.

Designing an XML fragment identifier language to point to elements and characters seems the right 80/20 point assuming the goals stated in B.1 The Role of Fragment Identifiers. Putting the burden of deriving ranges and executing queries onto just the higher-level applications that need these capabilities seems the better architectural solution.

B.4 Doing Ranges with FIXptr

The design originally proposed did not natively support pointer pairs; it was expected that applications would define data formats that (according to their preferences) allowed or did not allow pairs at a higher level, by supplying a pair of whole URI references. However, this had the effects of requiring the URI portion to be supplied once for each member of the pair, with concomitant error conditions if the URI portions did not match, and the potential expense of accessing the same resource twice.

This version of the proposal offers a native syntax for supplying pointer pairs in order to avoid these problems. However, the precise semantics of the pair (such as deriving a range), beyond the mere identification of two information items, is still left to the application level. There are several reasons for this:

  • Range derivation is potentially a complex and difficult task. Pushing complexity to higher layers of processing is desirable in achieving the goals in B.1 The Role of Fragment Identifiers. In addition, merely locating the two information items is useful all by itself, particularly for unsophisticated streaming-oriented applications. For example, a display application could insert a typographical "bracket" indication before the first item and after the second item. This suggests that the FIXptr level of functionality is interesting on its own and yet compact enough to encourage deployment.

  • Different applications might need to derive different ranges from the same pointer pair. For example, while an XLink application might be able to handle an unbalanced range derived from a pointer pair such as /1/2/5,/1/3/17(2), an XInclude application given the same pointer pair certainly would not. (XInclude could choose to make unbalanced pairs an application error, or derive an acceptable set of information items such as a "covering range." In fact, the latter is what XInclude does today with XPointer.)

    As another example, in the following XML document:

    <?xml version="1.0"?>
    <p>A <em>big</em> tree.</p>

    The word "A" is at /1(1) and the letter "t" is at /1(4). Each application will face various choices as to how to handle the subelement em and its content (include the whole subelement, include just its character information items, ignore the entire subelement); none is obviously the "right" answer.

    Interoperability might appear to be a concern in this case. However, while interoperability among XLink applications and, separately, among XInclude applications is surely important (hence this proposal's recommendation about documenting the derivation), equality of range derivation between XLink and XInclude applications seems unnecessary. In practical terms, equality would require a fragment identifier transplanted from an xlink:href= attribute to an xinclude:xinclude href= attribute to have the same derivation. Not only is this probably an unrealistic scenario, but if it were realistic, we would probably expect different behavior for the XLink and XInclude renditions of the fragment identifier.
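To make the three choices for the em subelement concrete, the following non-normative Python sketch derives a range between the characters at /1(1) and /1(4) in the sample document above under each policy. The function names and the policy labels "whole", "text", and "skip" are hypothetical; none of the three results is privileged by this proposal.

```python
import copy
import xml.etree.ElementTree as ET


def children_in_order(elem):
    """Children of elem in document order: single characters and elements."""
    items = list(elem.text or "")
    for child in elem:
        items.append(child)
        items.extend(child.tail or "")
    return items


def span_between(elem, first, last):
    """Children from the first-th through the last-th character, inclusive."""
    items = children_in_order(elem)
    positions = [i for i, item in enumerate(items) if isinstance(item, str)]
    if not 1 <= first <= last <= len(positions):
        raise LookupError("character offset out of range")
    return items[positions[first - 1]:positions[last - 1] + 1]


def render(span, policy):
    """Serialize a span, handling subelements according to policy."""
    out = []
    for item in span:
        if isinstance(item, str):
            out.append(item)
        elif policy == "whole":       # include the whole subelement
            clone = copy.copy(item)
            clone.tail = None         # the tail belongs to the parent
            out.append(ET.tostring(clone, encoding="unicode"))
        elif policy == "text":        # include just its characters
            out.append("".join(item.itertext()))
        # policy == "skip": ignore the entire subelement
    return "".join(out)


p = ET.fromstring("<p>A <em>big</em> tree.</p>")
span = span_between(p, 1, 4)
for policy in ("whole", "text", "skip"):
    print(policy, repr(render(span, policy)))
```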

C Scenarios for Pointing into XML (Non-Normative)

As defined in [IETF RFC 2396], an Internet media type identifies a certain class of resources, and resources of that type can have a fragment identifier language associated with them. HTML is the only widely supported Internet media type that has a fragment identifier language, the familiar #name construction. Since HTML's primary associated application is the browser, pointers into HTML fragments tend to appear mostly in HTML documents, and they tend to be used mostly for the hypertext scenario: You point into a particular location in an HTML file because you want the browser to scroll to that point before displaying the page.

However, it is possible to separate the hyperlinking and pointing functions, as XLink and XPointer have demonstrated. This is important because XML processing is likely to include applications that do not do hyperlinking but nonetheless want to do pointing (such as XInclude).

Following are some examples of ways in which various applications might interact with FIXptr processors. They all assume the following target XML document, in a file called footspec.xml:

<?xml version="1.0"?>
<!DOCTYPE spec [
<title>Specification for the Footwear Manufacturers' Markup Language</title>
<p>In this introudction, we list the scope of FMML:</p>
<li><p>Footwear sizes</p></li>
<li><p>Footwear prices</p></li>
<li><p>Footwear colors</p></li>
<issue id="scope-update">check this list against the charter!</issue>

C.1 HTML Hyperlinking into an XML Document

An HTML document contains a link to the introduction of the FMML specification that looks like this:

<P>The <A HREF="footspec.xml#/1/2">introduction to FMML</A> describes
the scope of that language.</P>

The pointing is accomplished with a child sequence, because the section in question doesn't have an ID on it. If the link were into an HTML document, there would have had to be an <A NAME=> available.

Also notable is that a pointer into an equivalent HTML document wouldn't have been able to point to the desired content, only a position near the content of interest, in order to get user agents to scroll to that position. In this case, the actual content (the element containing the whole section) is identified.

To date, HTML hyperlinking and pointing have tended to be implemented as a monolithic system, rather than as a modular part of Internet architecture. Thus, browsers are probably not prepared to handle HTML hyperlinks containing URIs-plus-fragment identifiers to any media types other than HTML itself. In the absence of any HTML specification that describes proper behavior for hyperlinking into XML, browsers that support both HTML and XML could implement some kind of default behavior on encountering an HTML link into an XML document, which has the option of being a bit more sophisticated than just scrolling -- for example, highlighting or typographically bracketing the section.

Note that it is not the FIXptr processor that does the highlighting or scrolling; this would be done by the browser application, which has been directed to the element information item of interest by the FIXptr processor.
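The first step an application takes in this division of labor can be sketched as follows. This non-normative Python code separates the URI reference from the link above into the part that identifies the resource and the unescaped FIXptr string that would be handed to a FIXptr processor; the function name split_reference is hypothetical.

```python
from urllib.parse import urldefrag, unquote


def split_reference(uri_reference: str):
    """Split a URI reference into (resource part, unescaped fragment)."""
    resource, fragment = urldefrag(uri_reference)
    return resource, unquote(fragment)


resource, fixptr = split_reference("footspec.xml#/1/2")
print(resource)  # retrieved and parsed by the application
print(fixptr)    # passed, with the resulting infoset, to the FIXptr processor
```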

C.2 Compiling Issues into an Issue List

XInclude is used to populate an FMML "issue list" document with all the issue elements that appear in the actual specification.

<?xml version="1.0"?>
<issues-list xmlns:xinclude="">
<title>FMML Issues</title>
<p>The following are open issues.  Please be prepared to discuss them
at the next FMML meeting.
<xinclude:xinclude href="footspec.xml#scope-update" />

Because issue elements can be counted on to have IDs and because the specification document directly identifies which attributes are of type ID, the corresponding FIXptrs can use them, and the issues list will typically need to be updated only when issues are added or deleted.

Here, it is the XInclude application that provides the transformation capability. The FIXptr processor is handed the infoset for the whole specification document each time an XInclude instruction is encountered, and it merely locates each of the desired elements in turn.

If the document did not directly contain the ATTLIST that identifies the attribute with type ID, but rather had a DTD external subset with the ATTLIST, the person or software component responsible for creating the FIXptr string in each XInclude instruction would have to decide whether it is "safe" to use an ID to point to the desired elements; not all infoset-producing XML parsers read external subsets. If using an ID is deemed unsafe, a child sequence would have to be used instead.

C.3 Annotating a Document with Corrections

A specialized annotation language is used to record a reviewer's corrections to the specification (using a fictitious Review Tool), so that the editor can use the corresponding fictitious Update Tool application to accept or reject them. The annotations look like this:


The Update Tool might require that both pointers in the pair specify character information items and not element information items, because reviewers are allowed to suggest only textual changes and not markup changes. However, it might still allow unbalanced character pairs, with an interpretation of ranges that corresponds to all elements "touched," plus all the characters to the "right" of the first FIXptr in the pair and to the "left" of the second. (In practice, a much more precise accounting of these semantics and the Update Tool interface would be needed!)

Here, the Review Tool is a producer of FIXptr strings and the Update Tool is an application that uses a FIXptr processor (which consumes the strings created by the Review Tool). The Review Tool might need to be sensitive to bidirectional data and other character-handling issues, but these concerns are outside the scope of FIXptr itself as long as the Review Tool and the FIXptr processor are both operating on data that has been normalized according to [CharMod].

D Production Notes (Non-Normative)

This proposal was encoded in the XMLspec DTD [XMLspec]. The HTML version was produced with an XSLT stylesheet.