Copyright © 2008 W3C® (MIT, ERCIM, Keio), All Rights Reserved. W3C liability, trademark and document use rules apply.
In general, we can think of content as any artifact that is meant to be rendered within or processed by a software application (like an editor) or runtime environment (like a web browser) so that the artifact is readily comprehensible and usable by human consumers. Content includes text, documents in a variety of formats such as the Open Document Format (ODF) or Portable Document Format (PDF), audio or video clips, and, of course, web content. Web content (e.g. XML, XHTML, and HTML documents) usually has a structure that allows portions of the document to be identified in many ways. When referring to a specific part of a document, it is useful to have a consistent manner by which to refer to a particular segment, to have a variety of ways by which to refer to that same segment, and to make the reference robust in the face of changes to the document. This specification provides a framework for representing pointers - entities that permit identifying a portion or segment of a piece of content - making use of the Resource Description Framework (RDF). It also proposes a number of specific types of pointers that permit portions of a document to be referred to in different ways.
This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at http://www.w3.org/TR/.
[Editor's note: describe intent of this working draft and propose feedback questions]
Please send comments to the mailing list of the ERT WG. The archives for this list are publicly available.
This is an Editor's draft of the Pointers Vocabulary in RDF. Publication as a Working Draft does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.
This document was produced by a group operating under the 5 February 2004 W3C Patent Policy. W3C maintains a public list of any patent disclosures made in connection with the deliverables of the group; that page also includes instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) must disclose the information in accordance with section 6 of the W3C Patent Policy.
This document has been produced by the Evaluation and Repair Tools Working Group as part of the WAI Technical Activity.
This specification introduces an RDF vocabulary to enable portions of or segments within documents (particularly HTML and XML documents) to be identified in an accurate and consistent way. At the same time, the classes and properties that comprise the vocabulary introduced by this specification permit a variety of methods for pointing to or identifying portions of a document (or content in general).
The namespace for Pointer Methods in RDF as specified in this draft is http://www.w3.org/2008/pointers# [Editor's note: final namespace decision pending] and uses the ptr prefix. Other namespaces typically used by Pointer Methods in RDF include the following:
cnt: http://www.w3.org/2008/content# (described in [Content])
rdf: http://www.w3.org/1999/02/22-rdf-syntax-ns# (described in [RDF])
rdfs: http://www.w3.org/2000/01/rdf-schema# (described in [RDFS])
The keywords must, required, recommended, should, may, and optional are used in accordance with [RFC2119].
One motivation for this vocabulary stems from methods for reporting test results such as the Evaluation and Report Language (EARL) [EARL] but this need not be its only application. Other typical applications could include: [Editor's note: need to be completed]
This list is not meant to be exhaustive. This vocabulary is extensible, providing for alternative or enhanced methods for referring to portions of content and for referring to a variety of content types.
Pointer Methods in RDF is defined as an RDF vocabulary. The Resource Description Framework (RDF) is a general-purpose language for describing information in a way that is machine-understandable. The examples will be serialized with the abbreviated RDF/XML notation.
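As a minimal sketch of how the abbreviated fragments used in the examples could be embedded in a complete RDF/XML document, the wrapper below declares the ptr, cnt, rdf and rdfs prefixes with the namespace URIs listed in this section. The ptr namespace is still subject to the pending decision noted above, and the enclosed SinglePointer, its identifier and the example.org document are purely illustrative assumptions.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Namespace URIs as listed in this draft; the ptr namespace is not final. -->
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
         xmlns:cnt="http://www.w3.org/2008/content#"
         xmlns:ptr="http://www.w3.org/2008/pointers#">

  <!-- Illustrative pointer; identifier and referenced document are assumptions. -->
  <ptr:SinglePointer rdf:about="#singlePointer">
    <ptr:reference rdf:resource="http://example.org/doc1.html"/>
  </ptr:SinglePointer>

</rdf:RDF>
```

The fragments in the remaining examples omit this wrapper for brevity and assume that the same prefixes are in scope.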
This document assumes the following background knowledge:
Pointer - a generic class for pointing to parts, segments, or portions of content. It is an abstract class that is generally intended to be subclassed, and every other type of pointer must be a ptr:Pointer subclass. While this generic class can be used directly, one of the more specific refinements contained in this document should be used instead.
Properties defined by this document:
Pointers Group - a generic container for a group of pointers without any specific relationship between them.
Properties defined by this document:
While the generic PointersGroup class can be used directly, one of the following more specific refinements should be used instead to provide more information about the existing relationship between the group members:
ptr:RelatedPointers
ptr:EquivalentPointers
Example 1: A series of pointers grouped using a PointersGroup resource.
<ptr:PointersGroup rdf:about="#group">
  <ptr:pointer rdf:resource="#pointer1"/>
  <ptr:pointer rdf:resource="#pointer2"/>
  ...
</ptr:PointersGroup>
Related Pointers - a group of pointers (presumably pointing to different parts of the document) that are related because they have a meaning as a whole, and that are therefore grouped together. This is a subclass of the PointersGroup class.
Properties defined by this document:
Example 2: A group of related pointers.
<ptr:RelatedPointers rdf:about="#relatedGroup">
  <ptr:pointer rdf:resource="#relatedPointer1"/>
  <ptr:pointer rdf:resource="#relatedPointer2"/>
  ...
</ptr:RelatedPointers>
Equivalent Pointers - a group of pointers that all point to the same part of the document, so that they can be considered equivalent. Put another way, each pointer in a set of pointers that are identified as equivalent must identify or pick out the same portion or segment of a piece of content. This is a subclass of the PointersGroup class.
In order to achieve the maximum level of flexibility and interoperability, it is recommended to provide as many equivalent pointers as possible for any particular case.
Properties defined by this document:
Example 3: A series of equivalent pointers.
<ptr:EquivalentPointers rdf:about="#equivalentGroup">
  <ptr:pointer rdf:resource="#equivalentPointer1"/>
  <ptr:pointer rdf:resource="#equivalentPointer2"/>
  <ptr:pointer rdf:resource="#equivalentPointer3"/>
  ...
</ptr:EquivalentPointers>
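To illustrate what equivalence could look like in practice, the following sketch groups an XPathPointer and a CSSSelectorPointer that are assumed to select the same paragraph of the same document. The identifiers, the document URI and both expressions are hypothetical; the two expressions only coincide if the document contains a p element with the id "important" as a direct child of body.

```xml
<!-- Fragment only: assumes the rdf and ptr prefixes are declared on an enclosing rdf:RDF element. -->
<ptr:EquivalentPointers rdf:about="#importantParagraph">
  <ptr:pointer rdf:resource="#importantByXPath"/>
  <ptr:pointer rdf:resource="#importantByCSS"/>
</ptr:EquivalentPointers>

<!-- Hypothetical XPath expression for the target paragraph. -->
<ptr:XPathPointer rdf:about="#importantByXPath">
  <ptr:version>1.0</ptr:version>
  <ptr:expression>/html/body/p[@id='important']</ptr:expression>
  <ptr:reference rdf:resource="http://example.org/doc1.html"/>
</ptr:XPathPointer>

<!-- Hypothetical CSS selector intended to match the same paragraph. -->
<ptr:CSSSelectorPointer rdf:about="#importantByCSS">
  <ptr:version>2.1</ptr:version>
  <ptr:expression>body > p#important</ptr:expression>
  <ptr:reference rdf:resource="http://example.org/doc1.html"/>
</ptr:CSSSelectorPointer>
```

A consumer that cannot evaluate one of the expression languages could then fall back on the other member of the group.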
Single Pointer - a pointer that is intended to point to a unique part of a document. This is a generic single pointer that provides the necessary framework, but it does not itself define how the target is located, so more specific subclasses must be used.
Properties defined by this document:
This vocabulary already provides several subclasses that refine the SinglePointer class in relation to the way the pointer is defined.
ptr:ExpressionPointer
ptr:OffsetPointer
ptr:LineCharPointer
ptr:HTMLPointer
Example 4: A SinglePointer resource.
<ptr:SinglePointer rdf:about="#singlePointer">
  <ptr:reference rdf:resource="http://example.org/doc1.html"/>
</ptr:SinglePointer>
Expression Pointer - a single pointer that makes use of expression languages to point out parts of a document. This is a generic expression pointer that could be subclassed for extensibility and more specific subclasses should be used where suitable.
Properties defined by this document:
This vocabulary already provides several subclasses of the ExpressionPointer class depending on the language that is used to define the pointer expression.
ptr:XPathPointer
ptr:CSSSelectorPointer
Example 5: An ExpressionPointer resource with version information.
<ptr:ExpressionPointer rdf:about="#expressionPointer">
  <ptr:expression>some expression</ptr:expression>
  <ptr:version>x.y</ptr:version>
  <ptr:reference rdf:resource="http://example.org/doc1.html"/>
</ptr:ExpressionPointer>
XPath Pointer - An expression pointer that makes use of XPath [XPath] expressions to point out parts of a document.
Properties defined by this document:
ptr:XMLNamespace
ptr:XPointerPointer
Example 6: An XPathPointer resource with namespace reference.
<ptr:XPathPointer rdf:about="#xPathPointer">
  <ptr:version>2.0</ptr:version>
  <ptr:expression>/html/body/div[@id='header']/img[1]</ptr:expression>
  <ptr:reference rdf:resource="http://example.org/doc1.html"/>
  <ptr:namespace rdf:resource="#xmlNamespace1"/>
</ptr:XPathPointer>
XML Namespace - a namespace as defined by [Namespaces]. [Editor's note: need to decide if we want to reference namespaces in XML 1.0 or 1.1]
Properties defined by this document:
Example 7: An XMLNamespace resource indicating prefix and name.
<ptr:XMLNamespace rdf:about="#xmlNamespace1">
  <ptr:prefix>eg</ptr:prefix>
  <ptr:namespaceName>http://example.org/ns/</ptr:namespaceName>
</ptr:XMLNamespace>
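The following sketch shows how a declared prefix might be used inside an XPath expression: the eg prefix in the expression is bound, through the ptr:namespace property, to an XMLNamespace resource. The element names (catalog, item), the document URI and the resource identifiers are assumptions made for illustration only.

```xml
<!-- Fragment only: assumes the rdf and ptr prefixes are declared on an enclosing rdf:RDF element. -->
<ptr:XPathPointer rdf:about="#firstItem">
  <ptr:version>1.0</ptr:version>
  <!-- The eg prefix is resolved through the XMLNamespace resource below. -->
  <ptr:expression>/eg:catalog/eg:item[1]</ptr:expression>
  <ptr:reference rdf:resource="http://example.org/catalog.xml"/>
  <ptr:namespace rdf:resource="#egNamespace"/>
</ptr:XPathPointer>

<ptr:XMLNamespace rdf:about="#egNamespace">
  <ptr:prefix>eg</ptr:prefix>
  <ptr:namespaceName>http://example.org/ns/</ptr:namespaceName>
</ptr:XMLNamespace>
```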
XPointer Pointer - an expression pointer that makes use of xpointer() scheme [XPointer-SCH] expressions to point out parts of a document.
Properties defined by this document:
Example 8: An XPointerPointer resource.
<ptr:XPointerPointer rdf:about="#xPointerPointer">
  <ptr:expression>string-range(string-range(//P,"Thomas Pynchon")[3],"P",1,0)</ptr:expression>
  <ptr:reference rdf:resource="http://example.org/doc1.html"/>
</ptr:XPointerPointer>
CSS Selector Pointer - an expression pointer that points out parts of a document by means of a CSS expression.
Properties defined by this document:
Example 9: A CSSSelectorPointer resource with version information.
<ptr:CSSSelectorPointer rdf:about="#cssSelectorPointer">
  <ptr:reference rdf:resource="http://example.org/doc1.html"/>
  <ptr:expression>body > p#important</ptr:expression>
  <ptr:version>2.1</ptr:version>
</ptr:CSSSelectorPointer>
Offset Pointer - a single pointer that points out parts of a document by means of an offset number counting from the start of the reference.
Properties defined by this document:
While the generic OffsetPointer class can be used directly, one of the following more specific refinements should be used instead to provide more information about the type of offset that is being used.
ptr:CharOffsetPointer
ptr:ByteOffsetPointer
Example 10: An OffsetPointer resource.
<ptr:OffsetPointer rdf:about="#offsetPointer">
  <ptr:reference rdf:resource="http://example.org/doc1.html"/>
  <ptr:offset>33</ptr:offset>
</ptr:OffsetPointer>
Char Offset Pointer - a single pointer that points out parts of a document by means of a character offset from the start of the reference.
Properties defined by this document:
Example 11: A CharOffsetPointer resource.
<ptr:CharOffsetPointer rdf:about="#charOffsetPointer">
  <ptr:reference rdf:resource="http://example.org/doc1.html"/>
  <ptr:offset>335</ptr:offset>
</ptr:CharOffsetPointer>
Byte Offset Pointer - a single pointer that points out parts of a document by means of a byte offset from the start of the reference.
Properties defined by this document:
Example 12: A ByteOffsetPointer resource.
<ptr:ByteOffsetPointer rdf:about="#byteOffsetPointer">
  <ptr:reference rdf:resource="http://example.org/doc1.html"/>
  <ptr:offset>52</ptr:offset>
</ptr:ByteOffsetPointer>
Line Char Pointer - a single pointer that points out parts of a document by means of the line number and character position where the target is located. [Editor's note: need to consider making charNumber optional]
Properties defined by this document:
Example 13: A LineCharPointer resource.
<ptr:LineCharPointer rdf:about="#lineCharPointer">
  <ptr:reference rdf:resource="http://example.org/doc1.html"/>
  <ptr:lineNumber>5</ptr:lineNumber>
  <ptr:charNumber>18</ptr:charNumber>
</ptr:LineCharPointer>
HTML Pointer - a single pointer that points out parts of a document by means of a fuzzy pointer specially intended for (X)HTML that is not well-formed.
Properties defined by this document:
[Editor's note: This class is currently just a placeholder for the fuzzy pointer idea. The group has not yet reached a viable solution and is open to any proposal.]
Range Pointer - points to a range that identifies a well defined section within a document delimited by a begin and an end.
This is a generic range pointer that provides the necessary framework, but it does not constitute a complete range pointer, as it only defines the start point of the section. One of the more specific subclasses must be used instead.
[Editor's note: consider renaming RangePointer to CompoundPointer for every range class, and redefining it to focus on the distinction between Single and Compound classes, based on the number of pointers that are used.]
Properties defined by this document:
ptr:StartEndPointer
ptr:CharSnippetRangePointer
ptr:CharOffsetRangePointer
ptr:ByteSnippetRangePointer
ptr:ByteOffsetRangePointer
Start End Pointer - a range pointer pointing out parts of a document by means of a range delimited by a pair of single pointers that define a start point and an end point.
Properties defined by this document:
Example 14: A StartEndPointer resource.
<ptr:StartEndPointer rdf:about="#startEndPointer">
  <ptr:startPointer rdf:resource="#lineCharPointer"/>
  <ptr:endPointer rdf:resource="#charOffsetPointer"/>
</ptr:StartEndPointer>
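As a sketch of a self-contained range, the start and end pointers could also be written inline as nested resources rather than referenced by identifier. The example assumes a range running from line 5, character 18 to line 7, character 1 of a hypothetical document; the identifier, the document URI and all values are illustrative.

```xml
<!-- Fragment only: assumes the rdf and ptr prefixes are declared on an enclosing rdf:RDF element. -->
<ptr:StartEndPointer rdf:about="#inlineRange">
  <!-- Start of the range: line 5, character 18 (both counted from one). -->
  <ptr:startPointer>
    <ptr:LineCharPointer>
      <ptr:reference rdf:resource="http://example.org/doc1.html"/>
      <ptr:lineNumber>5</ptr:lineNumber>
      <ptr:charNumber>18</ptr:charNumber>
    </ptr:LineCharPointer>
  </ptr:startPointer>
  <!-- End of the range: line 7, character 1. -->
  <ptr:endPointer>
    <ptr:LineCharPointer>
      <ptr:reference rdf:resource="http://example.org/doc1.html"/>
      <ptr:lineNumber>7</ptr:lineNumber>
      <ptr:charNumber>1</ptr:charNumber>
    </ptr:LineCharPointer>
  </ptr:endPointer>
</ptr:StartEndPointer>
```

Nesting yields anonymous pointer resources, which may be convenient when the endpoints are not reused elsewhere.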
Char Snippet Range Pointer - a range pointer pointing out parts of a document by means of a range delimited by a single pointer that defines the start point and a character snippet from there.
Properties defined by this document:
Properties not defined by this document:
cnt:chars
Example 15: A CharSnippetRangePointer resource.
<ptr:CharSnippetRangePointer rdf:about="#charSnippetRangePointer">
  <ptr:startPointer rdf:resource="#charOffsetPointer"/>
  <cnt:chars>&lt;p&gt;Some text.&lt;/p&gt;</cnt:chars>
</ptr:CharSnippetRangePointer>
Char Offset Range Pointer - a range pointer pointing out parts of a document by means of a range delimited by a single pointer that defines the start point and a character offset from there.
Properties defined by this document:
Example 16: A CharOffsetRangePointer resource.
<ptr:CharOffsetRangePointer rdf:about="#charOffsetRangePointer">
  <ptr:startPointer rdf:resource="#xPathPointer"/>
  <ptr:charOffset>55</ptr:charOffset>
</ptr:CharOffsetRangePointer>
Byte Snippet Range Pointer - a range pointer pointing out parts of a document by means of a range delimited by a single pointer that defines the start point and a byte snippet from there.
Properties defined by this document:
Properties not defined by this document:
cnt:bytes
Example 17: A ByteSnippetRangePointer resource.
<ptr:ByteSnippetRangePointer rdf:about="#byteSnippetRangePointer">
  <ptr:startPointer rdf:resource="#byteOffsetPointer"/>
  <cnt:bytes>R0lGODlhtQAxAOYAAKynpv3t4v3j1P/59ZuXlveYZ/vDovvMsfBbGWRiYf7+{...}</cnt:bytes>
</ptr:ByteSnippetRangePointer>
Byte Offset Range Pointer - a range pointer pointing out parts of a document by means of a range delimited by a single pointer that defines the start point and a byte offset from there.
Properties defined by this document:
Example 18: A ByteOffsetRangePointer resource.
<ptr:ByteOffsetRangePointer rdf:about="#byteOffsetRangePointer">
  <ptr:startPointer rdf:resource="#byteOffsetPointer"/>
  <ptr:byteOffset>255</ptr:byteOffset>
</ptr:ByteOffsetRangePointer>
A reference to a specific pointer.
A PointersGroup will have one pointer property for each of the pointers it contains. As any group of pointers must have one or more pointers, instances of the PointersGroup class must have at least one instance of the pointer property.
Domain: ptr:PointersGroup
Range: ptr:Pointer
The document within which the pointer is applicable or meaningful.
A SinglePointer will have exactly one reference.
Domain: ptr:SinglePointer
The language expression used as pointer.
An ExpressionPointer will have exactly one expression.
Domain: ptr:ExpressionPointer
The expression language version used in the pointer expression.
An ExpressionPointer will have at most one version.
Domain: ptr:ExpressionPointer
The namespace within which an XPath expression operates.
Domain: ptr:XPathPointer
Associates element and attribute names with a namespace URI.
An XMLNamespace will have exactly one prefix.
Domain: ptr:XMLNamespace
Identifies the namespace.
An XMLNamespace will have exactly one namespaceName.
Domain: ptr:XMLNamespace
The target position counting from the start of the referenced document. The count will start at one in each document.
An OffsetPointer will have exactly one offset.
Domain: ptr:OffsetPointer
The line number where the target is located. The line count will start at one in each document.
A LineCharPointer will have exactly one lineNumber.
Domain: ptr:LineCharPointer
The character number where the target is located within a line. The character count will start at one in each line.
A LineCharPointer will have exactly one charNumber.
Domain: ptr:LineCharPointer
The expression that is used to point to the target.
An HTMLPointer will have exactly one htmlPointer.
Domain: ptr:HTMLPointer
Reference to the pointer that defines the beginning point for a range.
A RangePointer will have exactly one startPointer.
Domain: ptr:RangePointer
Range: ptr:SinglePointer
Reference to the pointer that defines the end point for a range.
A StartEndPointer will have exactly one endPointer.
Domain: ptr:StartEndPointer
Range: ptr:SinglePointer
The position of the end of a range relative to a startPointer, expressed as the number of characters that make up the range; the first character of the range is the one indicated by the startPointer.
A CharOffsetRangePointer will have exactly one charOffset.
Domain: ptr:CharOffsetRangePointer
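As a worked illustration of this counting rule, under the definitions given here: with a start pointer at character offset 10 and a charOffset of 5, the range would comprise five characters, namely characters 10 through 14 of the referenced document (counting from one). The identifiers and the document URI below are hypothetical.

```xml
<!-- Fragment only: assumes the rdf and ptr prefixes are declared on an enclosing rdf:RDF element. -->
<ptr:CharOffsetRangePointer rdf:about="#tenToFourteen">
  <ptr:startPointer rdf:resource="#charTen"/>
  <!-- Length of the range in characters, including the start character. -->
  <ptr:charOffset>5</ptr:charOffset>
</ptr:CharOffsetRangePointer>

<!-- Start of the range: the tenth character of the document. -->
<ptr:CharOffsetPointer rdf:about="#charTen">
  <ptr:reference rdf:resource="http://example.org/doc1.html"/>
  <ptr:offset>10</ptr:offset>
</ptr:CharOffsetPointer>
```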
The position of the end of a range relative to a startPointer, expressed as the number of bytes that make up the range; the first byte of the range is the one indicated by the startPointer.
A ByteOffsetRangePointer will have exactly one byteOffset.
Domain: ptr:ByteOffsetRangePointer
The following terms are defined by this specification:
Class name | Label | Suggested types | Required properties | Optional properties |
---|---|---|---|---|
ptr:ByteOffsetPointer | Byte Offset Pointer | | ptr:offset, ptr:reference | |
ptr:ByteOffsetRangePointer | Byte Offset Range Pointer | | ptr:byteOffset, ptr:startPointer | |
ptr:ByteSnippetRangePointer | Byte Snippet Range Pointer | | ptr:byteSnippet, ptr:startPointer | |
ptr:CharOffsetPointer | Char Offset Pointer | | ptr:offset, ptr:reference | |
ptr:CharOffsetRangePointer | Char Offset Range Pointer | | ptr:charOffset, ptr:startPointer | |
ptr:CharSnippetRangePointer | Char Snippet Range Pointer | | ptr:charSnippet, ptr:startPointer | |
ptr:CSSSelectorPointer | CSS Selector Pointer | | ptr:expression, ptr:reference | ptr:version |
ptr:EquivalentPointers | Equivalent Pointers | | ptr:pointer | |
ptr:ExpressionPointer | Expression Pointer | ptr:CSSSelectorPointer, ptr:XPathPointer, ptr:XPointerPointer | ptr:expression | ptr:version |
ptr:HTMLPointer | HTML Pointer | | ptr:htmlPointer, ptr:reference | |
ptr:LineCharPointer | Line-Char Pointer | | ptr:charNumber, ptr:lineNumber, ptr:reference | |
ptr:Namespace | Namespace | | ptr:namespaceURI, ptr:prefix | |
ptr:OffsetPointer | Offset Pointer | ptr:ByteOffsetPointer, ptr:CharOffsetPointer | ptr:offset | |
ptr:Pointer | Pointer | ptr:PointersGroup, ptr:RangePointer, ptr:SinglePointer | | |
ptr:PointersGroup | Pointers Group | ptr:EquivalentPointers, ptr:RelatedPointers | ptr:pointer | |
ptr:RangePointer | Range Pointer | | ptr:startPointer | |
ptr:RelatedPointers | Related Pointers | | ptr:pointer | |
ptr:SinglePointer | Single Pointer | ptr:ByteOffsetPointer, ptr:CharOffsetPointer, ptr:ExpressionPointer, ptr:HTMLPointer, ptr:LineCharPointer | ptr:reference | |
ptr:StartEndPointer | Start-End Pointer | | ptr:endPointer, ptr:startPointer | |
ptr:XPathPointer | XPath Pointer | | ptr:expression, ptr:reference | ptr:namespace, ptr:version |
ptr:XPointerPointer | XPointer Pointer | | ptr:expression, ptr:reference | ptr:namespace, ptr:version |
Property name | Label | Domain | Range | Restriction |
---|---|---|---|---|
ptr:byteOffset | byte offset | ptr:ByteOffsetRangePointer | Positive Integer | Exactly one per ptr:ByteOffsetRangePointer |
ptr:charNumber | char number | ptr:LineCharPointer | Positive Integer | Exactly one per ptr:LineCharPointer |
ptr:charOffset | char offset | ptr:CharOffsetRangePointer | Positive Integer | Exactly one per ptr:CharOffsetRangePointer |
ptr:endPointer | end pointer | ptr:StartEndPointer | ptr:SinglePointer | Exactly one per ptr:StartEndPointer |
ptr:expression | expression | ptr:ExpressionPointer | Literal | Exactly one per ptr:ExpressionPointer |
ptr:htmlPointer | html pointer | ptr:HTMLPointer | | Exactly one per ptr:HTMLPointer |
ptr:lineNumber | line number | ptr:LineCharPointer | Positive Integer | Exactly one per ptr:LineCharPointer |
ptr:namespace | namespace | ptr:XPathPointer | | |
ptr:namespaceURI | namespace URI | ptr:Namespace | | Exactly one per ptr:Namespace |
ptr:offset | offset | ptr:OffsetPointer | Positive Integer | Exactly one per ptr:OffsetPointer and ptr:StartOffsetPointer |
ptr:pointer | pointer | ptr:PointersGroup | ptr:Pointer | At least one per ptr:PointersGroup |
ptr:prefix | prefix | ptr:Namespace | Literal | Exactly one per ptr:Namespace |
ptr:reference | reference | ptr:SinglePointer | | Exactly one per ptr:SinglePointer |
ptr:startPointer | start pointer | ptr:RangePointer | ptr:SinglePointer | Exactly one per ptr:RangePointer |
ptr:version | version | ptr:ExpressionPointer | Literal | At most one per ptr:ExpressionPointer |
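The rows of the property table could be expressed in RDF Schema along the following lines. This is only a sketch covering one class and two properties, not the normative schema of this vocabulary; it assumes the draft namespace given earlier, and it does not capture the cardinality restrictions, since RDFS has no construct for them.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: xml:base is set so that rdf:ID values resolve into the draft ptr namespace. -->
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
         xml:base="http://www.w3.org/2008/pointers">

  <!-- PointersGroup is a kind of Pointer. -->
  <rdfs:Class rdf:ID="PointersGroup">
    <rdfs:label>Pointers Group</rdfs:label>
    <rdfs:subClassOf rdf:resource="#Pointer"/>
  </rdfs:Class>

  <!-- ptr:pointer links a group to each of its member pointers. -->
  <rdf:Property rdf:ID="pointer">
    <rdfs:label>pointer</rdfs:label>
    <rdfs:domain rdf:resource="#PointersGroup"/>
    <rdfs:range rdf:resource="#Pointer"/>
  </rdf:Property>

  <!-- ptr:startPointer links a range to the single pointer where it begins. -->
  <rdf:Property rdf:ID="startPointer">
    <rdfs:label>start pointer</rdfs:label>
    <rdfs:domain rdf:resource="#RangePointer"/>
    <rdfs:range rdf:resource="#SinglePointer"/>
  </rdf:Property>

</rdf:RDF>
```

Expressing restrictions such as "exactly one" or "at least one" would require an additional vocabulary such as OWL.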
http://www.w3.org/WAI/ER/Content/WD-Content-in-RDF-20080327
http://www.w3.org/TR/2007/WD-EARL10-Schema-20070323
http://www.w3.org/TR/2006/REC-xml-names11-20060816
http://www.w3.org/TR/owl-features/
http://www.w3.org/TR/2004/REC-rdf-syntax-grammar-20040210/
http://www.w3.org/TR/rdf-primer/
http://www.w3.org/TR/rdf-schema/
http://www.w3.org/DesignIssues/RDF-XML
http://www.ietf.org/rfc/rfc2119.txt
http://www.w3.org/TR/xml/
http://www.w3.org/TR/1999/REC-xpath-19991116
http://www.w3.org/TR/2002/WD-xptr-xpointer-20021219/
EARL is the result of the work of many people; the editor apologises for any names left out of this list and will endeavour to rectify any errors noted in comments.
Shadi Abou-Zahra, Chrisoula Alexandraki, Sandor Herramhof, Carlos Iglesias, Nick Kew, Johannes Koch, Jim Ley, Charles McCathieNevile, Yehya Mohamed, Chris Ridpath, Christophe Strobbe, Michael Squillace and Carlos Velasco.