XPointer [XPointer] is intended to be the definition of the fragment identifier language for the four main XML-related Internet media types [IETF RFC 3023]. As such, it has to be parsed whenever there is a fragment identifier on a URI that refers to an XML resource. Because fragment identifier processing is such a basic part of XML on the Web and needs to be efficiently supportable on a wide range of devices and applications, this proposal suggests a small subset of XPointer functionality, called FIXptr, for this purpose.
This document has no official status at this time. It is a proposal offered for consideration as a replacement or minimal conformance level for XPointer. (Note that the XML Linking Working Group took up this question on 19 April 2001 and voted down the proposal.)
B Rationale describes the reasoning behind the proposed design.
1 Introduction
2 Terminology
3 Conformance
4 FIXptr Language and Processing
4.1 Syntax
4.2 Informal Semantics
4.3 Semantics in Terms of the Infoset
4.3.1 Name Component
4.3.2 Initial Child Component
4.3.3 Child Component
4.3.4 Character Offset Component
4.3.5 Pointer Pair
A References
A.1 Normative References
A.2 Informative References
B Rationale (Non-Normative)
B.1 The Role of Fragment Identifiers
B.2 Data Model
B.3 Feature Set
B.4 Doing Ranges with FIXptr
C Scenarios for Pointing into XML (Non-Normative)
C.1 HTML Hyperlinking into an XML Document
C.2 Compiling Issues into an Issue List
C.3 Annotating a Document with Corrections
D Production Notes (Non-Normative)
This proposal specifies the addressing of any single element or character in the resource, or a pair of elements and/or characters. Addressing is done with three constructs:
an optional initial ID,
a "child sequence" or "tumbler" that gives a path that descends the element tree,
and an optional terminal character offset.
The language defined in this proposal offers a subset of the features that XPointer offers. To aid in clear description, this language is given a different name: Fragment Identifier for XML (FIXptr). FIXptr is specified in terms of the XML Information Set [Infoset].
B Rationale describes the reasoning behind the proposed design. C Scenarios for Pointing into XML provides some examples of how it would work.
[Definition: The key words must, must not, required, shall, shall not, should, should not, recommended, may, and optional in this specification are to be interpreted as described in [IETF RFC 2119].]
pointer: A FIXptr string that locates a single information item. This proposal normatively defines the syntax of pointers.
pointer pair: A FIXptr-conforming string containing two pointers that each locate a single information item. This proposal normatively defines the syntax of pointer pairs.
FIXptr processor: A software component that takes as input an infoset and a FIXptr pointer or pointer pair and produces as output an identification of exactly one or two information items in that infoset. This proposal normatively defines the behavior of a FIXptr processor.
application: A software component that incorporates or uses a FIXptr processor because it needs to access XML resources by means of URI references. The occurrence and usage of URI references are governed by the definition of each application's corresponding data format (which could be XML-based or non-XML-based). For example, both HTML Web browsers and XInclude [XInclude] processors are applications.
error: A violation of the rules of this specification; results are undefined.
infoset: The information set of a document as defined by the XML Information Set specification [Infoset].
information item: An item of information in a document's infoset.
context item: The information item that is located by one component of a pointer, and relied on as the context for locating the next (rightward) component of the pointer.
FIXptr processing normatively depends on [IETF RFC 2396] (as updated by [IETF RFC 2732]) processing, including character escaping as defined in these RFCs.
FIXptr processing normatively expects as input infosets that have at least the following information items and properties:
[document element] property
[attributes] property
[children] property
[attribute type] property
[normalized value] property
It is an error if a FIXptr string does not adhere to the syntactic requirements described in this specification.
A fragment identifier conforms to this specification if it does not satisfy any of the conditions for being in error as described in this specification.
Conforming FIXptr processors should report errors back to the application. Applications may terminate or recover from FIXptr errors in any way they choose.
This section describes the FIXptr language and the behavior of FIXptr processors.
All FIXptr strings match the following production (where Name is as defined in the XML Recommendation [XML]):
[1] fixptr      ::= ptr (',' ptr)?
[2] ptr         ::= (Name | init-child) child* char-offset?
[3] init-child  ::= '/1'
[4] child       ::= '/' [1-9] [0-9]*
[5] char-offset ::= '(' [1-9] [0-9]* ')'
In other words, the name, child sequence, and character offset are all optional, but if the name is omitted, the child sequence must be non-null and must start with /1.
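As a non-normative illustration, the productions above can be checked with a regular expression. The following Python sketch is one possible reading, not part of the proposal: the names `Pointer`, `parse_ptr`, and `parse_fixptr` are hypothetical, and the `_NAME` pattern is a simplified ASCII-only stand-in for the XML Name production, which admits many more characters.

```python
import re
from typing import List, NamedTuple, Optional, Tuple

# Assumption: a simplified, ASCII-only stand-in for the XML Name production.
_NAME = r"[A-Za-z_:][A-Za-z0-9_:.\-]*"

# ptr ::= (Name | init-child) child* char-offset?
_PTR = re.compile(
    rf"(?:(?P<name>{_NAME})|(?P<init>/1))"
    r"(?P<children>(?:/[1-9][0-9]*)*)"
    r"(?:\((?P<offset>[1-9][0-9]*)\))?"
)


class Pointer(NamedTuple):
    name: Optional[str]     # initial ID, or None when the pointer starts with /1
    steps: Tuple[int, ...]  # child sequence, including the leading 1 if present
    offset: Optional[int]   # terminal character offset, if any


def parse_ptr(text: str) -> Pointer:
    """Parse one pointer; raise ValueError if it does not match production [2]."""
    match = _PTR.fullmatch(text)
    if match is None:
        raise ValueError(f"not a FIXptr pointer: {text!r}")
    steps = [int(s) for s in match.group("children").split("/") if s]
    if match.group("init"):
        steps.insert(0, 1)  # record the init-child step /1
    offset = match.group("offset")
    return Pointer(match.group("name"), tuple(steps), int(offset) if offset else None)


def parse_fixptr(text: str) -> List[Pointer]:
    """Parse a pointer or a pointer pair (production [1])."""
    parts = text.split(",")
    if len(parts) > 2:
        raise ValueError("a FIXptr string holds at most two pointers")
    return [parse_ptr(part) for part in parts]


print(parse_fixptr("intro/3/1/4(6)"))
print(parse_fixptr("/1/5/12,/1/5/18"))
```

Splitting on the comma first is safe because neither a Name nor any other component can contain a comma. Strings such as /2/1 or /1/0 are rejected, since a pointer without a Name must begin with exactly /1 and integers may not start with zero.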
When the FIXptr string contains a single pointer, the optional initial Name locates the element in the document that has an ID-typed attribute whose value is the given name. The following child sequence locates an element by stepwise navigation using a sequence of integers separated by slashes (/); each integer n locates the nth child element of the previously located element. The final optional parenthesized integer gives a character offset; an offset n locates the nth child character node of the previously located element.
For example, the pointer intro/3/1/4(6) identifies a character as follows: It first locates the element with an ID attribute that has the value "intro", then locates the third child element of the "intro" element, then locates that element's first child element, then that element's fourth child element, then that element's sixth character child. Note that a pointer consisting of just a name provides, for resources with XML media types, an analog of HTML fragment identifier behavior.
When the string contains a pointer pair, each pointer identifies a single element or character as described above. For example, the pointer pair /1/5/12,/1/5/18 identifies the twelfth and eighteenth child elements inside the fifth child element inside the root element of the document. Applications may use the located content as anchors for deriving a larger selection of content (such as a "range" of seven elements encompassing the twelfth through the eighteenth inclusive) on which they then operate.
A FIXptr string containing a pointer or a pointer pair specifies, respectively, one or two references into the content of a Web resource of Internet media type text/xml or application/xml using a URI reference [IETF RFC 2396], by serving as a fragment identifier for such a resource.
A FIXptr processor takes as input the infoset of the identified resource and a FIXptr string. It is an error for a FIXptr fragment identifier to appear in a URI reference that specifies a resource that does not have an XML infoset.
A FIXptr processor produces as output an identification of exactly one or two element information items and/or character information items in that infoset. For each pointer supplied as input, the information item identified is the item located by the final (rightmost) component of the pointer.
The rest of this section describes just which information items are identified by a given FIXptr string.
Note:
An application may use a data model that is richer than or different from the Infoset. For example, an editing application might interpret a pointer as corresponding to a cursor position between two elements, or a pointer pair as corresponding to a "user selection." In this situation, it is recommended that the normative documentation for an application specify the relationship between information items and the constructs of interest.
The Name component locates the unique element information item in the input infoset whose [attributes] property (which is an unordered set of attribute information items) contains an attribute information item whose [attribute type] property has the value "ID" and whose [normalized value] property matches the given Name. It is an error if no such element information item exists or if more than one such element information item exists.
Note:
Since the Name component could be any string that satisfies XML's Name production, it might need to contain characters that cannot be represented directly in the current document's encoding and/or that are not allowed to be represented in unescaped form in a URI reference. Therefore, though a string does satisfy the syntax requirements for FIXptr, it might not be usable in its unescaped form as a fragment identifier in some circumstances. URI reference escaping methods are described in [IETF RFC 2396] (as updated by [IETF RFC 2732]). The W3C Character Model specification [CharMod] is also informative in this area.
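The uniqueness rule for the Name component can be sketched as follows (non-normative). Python's `xml.etree.ElementTree` does not expose the [attribute type] property, so this sketch assumes that any attribute named `id` is ID-typed; the function name `locate_by_name` and the `id_attr` parameter are hypothetical.

```python
import xml.etree.ElementTree as ET


def locate_by_name(root, name, id_attr="id"):
    """Return the single element whose assumed-ID attribute equals name."""
    # Assumption: ElementTree does not expose the [attribute type] property,
    # so any attribute called id_attr is treated as if it were ID-typed.
    matches = [el for el in root.iter() if el.get(id_attr) == name]
    if len(matches) != 1:
        # Both "no such element" and "more than one" are errors.
        raise LookupError(
            f"expected exactly one element with ID {name!r}, found {len(matches)}"
        )
    return matches[0]


doc = ET.fromstring(
    '<spec><div1 id="intro"><p/></div1><issue id="scope-update"/></spec>'
)
print(locate_by_name(doc, "scope-update").tag)  # issue
```

A processor built on an infoset that does carry attribute types would test the [attribute type] property instead of the attribute's name.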
If the Name component is omitted, then a pointer must start with the init-child component, which locates the unique element information item in the [document element] property of the infoset's document information item. It is an error if no such element information item exists.
Each child component locates an element information item by stepwise navigation using an unsigned positive integer preceded by a slash. For each integer n appearing in a child sequence component, the nth element information item in the [children] property of the currently located element information item (the context item) is located. For the integer in the first child, the context item is the one located by the Name or init-child component. It is an error if, for any integer n in the sequence, there is no nth element information item in the [children] property of the context item.
Note:
In locating the nth element information item in the [children] property of an element information item, all other information items within the [children] property are ignored and do not affect the counting.
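A non-normative Python sketch of child-sequence navigation follows. It uses `xml.etree.ElementTree` as a stand-in for an infoset, where text is not stored as a child node and therefore never affects the count. The function names are hypothetical.

```python
import xml.etree.ElementTree as ET


def walk_child_sequence(context, steps):
    """Follow 1-based child steps from context, counting element children only."""
    for n in steps:
        # Comments and processing instructions (present only if the parser was
        # asked to keep them) have a non-string tag and are skipped.
        elements = [child for child in context if isinstance(child.tag, str)]
        if n < 1 or n > len(elements):
            raise LookupError(f"no child element number {n} under <{context.tag}>")
        context = elements[n - 1]
    return context


def locate_from_document_element(root, steps):
    """Resolve a child sequence that begins with the init-child step /1."""
    if not steps or steps[0] != 1:
        raise ValueError("a pointer without a Name must start with /1")
    return walk_child_sequence(root, steps[1:])


doc = ET.fromstring(
    "<spec><title>T</title><div1><title>I</title><p>x</p></div1></spec>"
)
print(locate_from_document_element(doc, [1, 2, 2]).tag)  # p
```

The same `walk_child_sequence` function serves a pointer that starts with a Name: the element located by the Name is passed as the initial context.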
The char-offset component consists of an unsigned positive integer enclosed in parentheses. For an integer n, the nth character information item in the [children] property of the context item is located. It is an error if no such character information item exists.
Note:
In locating the nth character information item in the [children] property of an element information item, all other information items within the [children] property are ignored and do not affect the counting.
Note:
The W3C Character Model specification [CharMod] recommends early character normalization. Given that character normalization has occurred prior to or during infoset generation, the process of locating the nth character information item in the [children] property of an element information item is unambiguously defined.
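The character-offset rule can be illustrated with the following non-normative Python sketch. In `xml.etree.ElementTree`, the characters that are direct children of an element are spread over the element's `text` and the `tail` of each of its children; characters inside child elements are not included. The function names are hypothetical, and the sketch assumes the parser has already performed any character normalization.

```python
import xml.etree.ElementTree as ET


def character_children(context):
    """Return, in document order, the characters that are direct children."""
    chunks = [context.text or ""]
    # The tail of each child holds the text that follows that child inside
    # the context element; the child's own content is deliberately skipped.
    chunks.extend(child.tail or "" for child in context)
    return "".join(chunks)


def nth_character(context, n):
    """Locate the nth (1-based) character child of the context element."""
    chars = character_children(context)
    if n < 1 or n > len(chars):
        raise LookupError(f"no character number {n} under <{context.tag}>")
    return chars[n - 1]


p = ET.fromstring("<p>A <em>big</em> tree.</p>")
print(nth_character(p, 1))  # A
print(nth_character(p, 4))  # t
```

The characters of "big" belong to the em element, so they are not counted: the character children of p are "A", two spaces, and then "tree.".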
In the case of a pointer pair, the FIXptr string identifies two information items, one for each pointer.
Note:
An application may define pointer pair constraints (application-level errors) beyond the syntax imposed by this proposal. An application may also derive from the pointer pair a set of one or more information items or other constructs (such as a "range" based on the pointer pair as "starting" and "ending" points) on which the application will operate. It is recommended that the normative documentation for an application or its corresponding data format specify the constraints it imposes and its intended derivation.
This section provides reasons supporting the choices of language scope and design made in this proposal.
The fragment identifier mechanism defined in [IETF RFC 2396] is specifically targeted at the client side of Internet data handling, that is, user agents. A client-side application is handed a whole resource, within which it needs to find the appropriate fragment (subresource) in order to continue its processing. (See C Scenarios for Pointing into XML for more information on scenarios where this action is desirable.)
It is likely that many XML-based applications in this position will be lightweight, especially given the trend towards Web-enabled PDAs and mobile phones. A fragment identifier language for XML data, just as for any Internet media type, is likelier to be widely deployed and interoperable if it requires a more modest investment of resources.
Thus, this proposal offers an XML fragment identifier design that can be:
Easily and quickly implemented
Tested easily for conformance
Efficiently implemented on small devices
Implemented using streaming technology
Neatly integrated as a small module into a variety of higher-level applications of varying sophistication
Specifically, this proposal takes the view that any features of a fragment identifier language that can be characterized as general-purpose "querying" for information rather than "pointing" to known information place too much of a burden on FIXptr processors and applications that depend on them.
It is worth noting that there is only one implementation of XPointer that is known to be complete, and no vendors have adopted and deployed it. Most implementations are partial: either bare names (sometimes with child sequences), or plain XPath without the XPointer data model extensions. No vendor today claims XPointer conformance, four years into XPointer's development. Since new technologies tend to be deployed to user agents over a period of several years, a modest design seems much more likely to be widely adopted and correctly implemented than an ambitious one.
The design offered in this proposal is based on the [Infoset], rather than the XPath data model, for the following reasons:
The Infoset is explicitly designed for the purpose of providing an unambiguous description of XML data, which makes it easy to express precisely what subresource is being pointed to. The choice of the Infoset lets both this proposal and higher-level application specifications use concise, formal language in describing the desired behavior.
The Infoset natively allows character-by-character access. Locating text content is particularly important for hypertext and annotation, two important motivations for an XML pointer language.
If a document is governed by a schema whose validation process populates the [attribute type] property of the Infoset, element IDs can be recognized even when DTDs are not used.
Every XML document has an infoset, regardless of the document's use of external parsed entities to break up its physical structure. Thus, the Infoset provides a bridge between the world of Internet resources and Internet entity bodies, on the one hand, and the world of XML documents and XML entities, on the other.
Because the Infoset does not define infosets for external parsed entities all by themselves, this proposal accounts only for addressing into Internet media types text/xml and application/xml, not text/xml-external-parsed-entity and application/xml-external-parsed-entity. The relationship between physical Internet resources and logical XML documents needs more study before it can be completely harmonized; until that time, other mechanisms can be used to handle arbitrary external parsed entities.
This proposal takes the view that a very minimal fragment identifier language supports the bulk of pointing needs. We make the following observations:
Most linking applications, such as Fujitsu's [Fujitsu], use [XPointer] only to point to elements. No real-world examples have been found of XPointers that address attributes, comments, processing instructions, or namespaces.
Linking applications have typically implemented only bare names and (in some cases) child sequences, and not the more sophisticated ways of pointing to elements. SVG [SVG], which borrows from XPointer to define its own fragment identifier language, borrows only two aspects from XPointer: bare names and the equivalent XPath-compatible id function.
XPointers are typically generated, not hand-coded, and generated XPointers tend not to take advantage of built-in knowledge of the schema being used in order to avoid future breakage. The generation algorithm tends to be: use the element's ID; if none, go up to the nearest ancestor, use its ID, then walk down with tumblers; walk down all the way from the root if necessary. (In fact, this algorithm is recommended in the Synthesizing XPointers section of the XLink-to-RDF note [XLink2RDF] for purposes of standardizing on a way to create consistent RDF statements from XLink links.)
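The generation algorithm described above can be sketched in Python as follows. This is a non-normative illustration: the function name `synthesize_pointer` is hypothetical, any attribute named `id` is assumed to be ID-typed (ElementTree does not expose attribute types), and the sketch does not verify that ID values are unique.

```python
import xml.etree.ElementTree as ET


def synthesize_pointer(root, target, id_attr="id"):
    """Build a pointer for target: nearest ID, then child steps down to it."""
    # ElementTree has no parent links, so build a child-to-parent map first.
    parents = {child: parent for parent in root.iter() for child in parent}
    steps = []
    node = target
    while True:
        node_id = node.get(id_attr)
        if node_id is not None:
            prefix = node_id  # start from this element's ID
            break
        if node is root:
            prefix = "/1"     # no ID on the way up: start from the root
            break
        parent = parents.get(node)
        if parent is None:
            raise ValueError("target is not in the tree")
        siblings = [c for c in parent if isinstance(c.tag, str)]
        steps.append(siblings.index(node) + 1)  # 1-based element position
        node = parent
    return prefix + "".join(f"/{n}" for n in reversed(steps))


doc = ET.fromstring(
    '<spec><title/><div1 id="intro"><title/><p/>'
    "<ulist><li><p/></li></ulist></div1><div1><p/></div1></spec>"
)
print(synthesize_pointer(doc, doc[1][2][0][0]))  # intro/3/1/1
print(synthesize_pointer(doc, doc[2][0]))        # /1/3/1
```

Anchoring at the nearest ancestor with an ID keeps the child sequence short, so fewer edits elsewhere in the document can invalidate the pointer.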
These observations suggest strongly that the main requirement is to address elements, which is not surprising since elements are the primary objects in XML documents. This proposal keeps the two most well-supported ways to point to elements in XPointer, using syntax identical to XPointer's bare name and child sequence short forms.
It has been suggested to allow for element naming (so that you could, for example, locate the fifth p element, not just the fifth element overall). However, this would require quite a lot more infrastructure to accommodate namespace-qualified elements, without providing equivalent benefit. For example, without specialized schema knowledge, naming elements is no more likely to protect machine-generated addresses against pointer breakage. Thus, it seems to fall beyond the 80/20 point.
The next most important class of objects in XML documents is the text contained inside elements. The need for pointing to pieces of text can be demonstrated by the following observations:
Most HTML links have, as their starting resource (using XLink terminology), a word or phrase that is undistinguished except for the <A HREF=> markup itself.
XLink provides an important facility, third-party links, that allows such words and phrases to be identified remotely, even in the absence of surrounding markup.
This proposal provides a way to point to individual characters using the simple mechanism of counting characters that are children of the current element, analogous to counting elements in child sequences.
See B.4 Doing Ranges with FIXptr for information on "pointing" to ranges of characters.
Designing an XML fragment identifier language to point to elements and characters seems the right 80/20 point assuming the goals stated in B.1 The Role of Fragment Identifiers. Putting the burden of deriving ranges and executing queries onto just the higher-level applications that need these capabilities seems the better architectural solution.
The design originally proposed did not natively support pointer pairs; it was expected that applications would define data formats that (according to their preferences) allowed or did not allow pairs at a higher level, by supplying a pair of whole URI references. However, this had the effects of requiring the URI portion to be supplied once for each member of the pair, with concomitant error conditions if the URI portions did not match, and the potential expense of accessing the same resource twice.
This version of the proposal offers a native syntax for supplying pointer pairs in order to avoid these problems. However, the precise semantics of the pair (such as deriving a range), beyond the mere identification of two information items, is still left to the application level. There are several reasons for this:
Range derivation is potentially a complex and difficult task. Pushing complexity to higher layers of processing is desirable in achieving the goals in B.1 The Role of Fragment Identifiers. In addition, merely locating the two information items is useful all by itself, particularly for unsophisticated streaming-oriented applications. For example, a display application could insert a typographical "bracket" indication before the first item and after the second item. This suggests that the FIXptr level of functionality is interesting on its own and yet compact enough to encourage deployment.
Different applications might need to derive different ranges from the same pointer pair. For example, while an XLink application might be able to handle an unbalanced range derived from a pointer pair such as /1/2/5,/1/3/17(2), an XInclude application given the same pointer pair certainly would not. (XInclude could choose to make unbalanced pairs an application error, or derive an acceptable set of information items such as a "covering range." In fact, the latter is what XInclude does today with XPointer.)
As another example, in the following XML document:
<?xml version="1.0"?>
<p>A <em>big</em> tree.</p>
The word "A" is at /1(1) and the letter "t" is at /1(4). Each application will face various choices as to how to handle the subelement em and its content (include the whole subelement, include just its character information items, ignore the entire subelement); none is obviously the "right" answer.
Interoperability might appear to be a concern in this case. However, while interoperability among XLink applications and, separately, among XInclude applications is surely important (hence this proposal's recommendation about documenting the derivation), equality of range derivation between XLink and XInclude applications seems unnecessary. In practical terms, equality would require a fragment identifier transplanted from an xlink:href= attribute to an xinclude:xinclude href= attribute to have the same derivation. Not only is this probably an unrealistic scenario, but if it were realistic, we would probably expect different behavior for the XLink and XInclude renditions of the fragment identifier.
As defined in [IETF RFC 2396], an Internet media type identifies a certain class of resources, and resources of that type can have a fragment identifier language associated with them. HTML is the only widely supported Internet media type that has a fragment identifier language, the familiar #name construction. Since HTML's primary associated application is the browser, pointers into HTML fragments tend to appear mostly in HTML documents, and they tend to be used mostly for the hypertext scenario: You point into a particular location in an HTML file because you want the browser to scroll to that point before displaying the page.
However, it is possible to separate the hyperlinking and pointing functions, as XLink and XPointer have demonstrated. This is important because XML processing is likely to include applications that do not do hyperlinking but nonetheless want to do pointing (such as XInclude).
Following are some examples of ways in which various applications might interact with FIXptr processors. They all assume the following target XML document, in a file called footspec.xml:
<?xml version="1.0"?>
<!DOCTYPE spec [
<!ATTLIST issue id ID #REQUIRED>
]>
<spec>
<title>Specification for the Footwear Manufacturers' Markup Language</title>
<div1><title>Introduction</title>
<p>In this introudction, we list the scope of FMML:</p>
<ulist>
<li><p>Footwear sizes</p></li>
<li><p>Footwear prices</p></li>
<li><p>Footwear colors</p></li>
</ulist>
<issue id="scope-update">check this list against the charter!</issue>
</div1>
...
</spec>
An HTML document contains a link to the introduction of the FMML specification that looks like this:
<P>The <A HREF="footspec.xml#/1/2">introduction to FMML</A> describes the scope of that language.</P>
The pointing is accomplished with a child sequence, because the section in question doesn't have an ID on it. If the link were into an HTML document, there would have had to be an <A NAME=> available.
Also notable is that a pointer into an equivalent HTML document wouldn't have been able to point to the desired content, only a position near the content of interest, in order to get user agents to scroll to that position. In this case, the actual content (the element containing the whole section) is identified.
To date, HTML hyperlinking and pointing has tended to be implemented as a monolithic system, rather than as a modular part of Internet architecture. Thus, browsers are probably not prepared to handle HTML hyperlinks containing URIs-plus-fragment identifiers to any media types other than HTML itself. In the absence of any HTML specification that describes proper behavior for hyperlinking into XML, browsers that support both HTML and XML could implement some kind of default behavior on encountering an HTML link into an XML document, which has the option of being a bit more sophisticated than just scrolling -- for example, highlighting or typographically bracketing the section.
Note that it is not the FIXptr processor that does the highlighting or scrolling; this would be done by the browser application, which has been directed to the element information item of interest by the FIXptr processor.
XInclude is used to populate an FMML "issue list" document with all the issue elements that appear in the actual specification.
<?xml version="1.0"?>
<issues-list xmlns:xinclude="http://www.w3.org/1999/XML/xinclude">
<title>FMML Issues</title>
<p>The following are open issues. Please be prepared to discuss them at the next FMML meeting. </p>
<xinclude:xinclude href="footspec.xml#scope-update" />
...
</issues-list>
Because issue elements can be counted on to have IDs and because the specification document directly identifies which attributes are of type ID, the corresponding FIXptrs can use them, and the issues list will typically need to be updated only when issues are added or deleted.
Here, it is the XInclude application that provides the transformation capability. The FIXptr processor is handed the infoset for the whole specification document each time an XInclude instruction is encountered, and it merely locates each of the desired elements in turn.
If the document did not directly contain the ATTLIST that identifies the attribute with type ID, but rather had a DTD external subset with the ATTLIST, the person or software component responsible for creating the FIXptr string in each XInclude instruction would have to decide whether it is "safe" to use an ID to point to the desired elements; not all infoset-producing XML parsers read external subsets. If using an ID is deemed unsafe, a child sequence would have to be used instead.
A specialized annotation language is used to record a reviewer's corrections to the specification (using a fictitious Review Tool), so that the editor can use the corresponding fictitious Update Tool application to accept or reject them. The annotations look like this:
<annotations>
<annotation>
<problem-loc>footspec.xml#/1/2/2(9),/1/2/2(20)</problem-loc>
<replace-with>introduction</replace-with>
</annotation>
...
</annotations>
The Update Tool might require that both pointers in the pair specify character information items and not element information items, because reviewers are allowed to suggest only textual changes and not markup changes. However, it might still allow unbalanced character pairs, with an interpretation of ranges that corresponds to all elements "touched," plus all the characters to the "right" of the first FIXptr in the pair and to the "left" of the second. (In practice, a much more precise accounting of these semantics and the Update Tool interface would be needed!)
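To make the scenario concrete, the following non-normative Python sketch shows how a hypothetical Update Tool might apply the annotation above. The function names are invented for illustration, the target document is abbreviated, and the sketch handles only the simplest case: a balanced pair whose two pointers fall in the same element, with no child elements inside it. It takes the fragment identifier only; resolving the URI portion is left out.

```python
import re
import xml.etree.ElementTree as ET

# Abbreviated copy of footspec.xml; the misspelling is the reviewer's target.
FOOTSPEC = """<?xml version="1.0"?>
<spec>
<title>Specification for the Footwear Manufacturers' Markup Language</title>
<div1><title>Introduction</title>
<p>In this introudction, we list the scope of FMML:</p>
</div1>
</spec>"""

# Only pointers of the form /1/...(n) are accepted in this sketch.
_CHAR_PTR = re.compile(r"/1((?:/[1-9][0-9]*)*)\(([1-9][0-9]*)\)")


def resolve_char(root, pointer):
    """Return (element, zero-based character index) for a character pointer."""
    match = _CHAR_PTR.fullmatch(pointer)
    if match is None:
        raise ValueError(f"expected a character pointer, got {pointer!r}")
    node = root
    for step in match.group(1).split("/")[1:]:
        node = list(node)[int(step) - 1]  # IndexError if the child is missing
    return node, int(match.group(2)) - 1


def apply_correction(root, pair, replacement):
    """Replace the characters between the two pointers; return the old text."""
    start_ptr, end_ptr = pair.split(",")
    start_el, start = resolve_char(root, start_ptr)
    end_el, end = resolve_char(root, end_ptr)
    if start_el is not end_el or len(start_el) != 0 or start > end:
        raise ValueError("sketch handles only a balanced pair in text-only content")
    text = start_el.text or ""
    if end >= len(text):
        raise LookupError("character offset is past the end of the content")
    start_el.text = text[:start] + replacement + text[end + 1:]
    return text[start:end + 1]


root = ET.fromstring(FOOTSPEC)
print(apply_correction(root, "/1/2/2(9),/1/2/2(20)", "introduction"))  # introudction
print(root[1][1].text)
```

The range derivation here (all characters from the first located character through the second, inclusive) is an application-level choice made by this sketch, in line with the proposal's position that such semantics belong to the application rather than to FIXptr.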
Here, the Review Tool is a producer of FIXptr strings and the Update Tool is an application that uses a FIXptr processor (which consumes the strings created by the Review Tool). The Review Tool might need to be sensitive to bidirectional data and other character-handling issues, but these concerns are outside the scope of FIXptr itself as long as the Review Tool and the FIXptr processor are both operating on data that has been normalized according to [CharMod].
This proposal was encoded in the XMLspec DTD [XMLspec]. The HTML version was produced with an XSLT stylesheet.