by Garret Wilson, GlobalMentor, Inc.
DRAFT 20010810
This draft was created for use with the Open eBook Publication Structure [OEBPS] 2.0 draft and other in-progress specifications.
The World Wide Web Consortium's [XML] specification has seen wide acceptance as a method of encoding textual content for display, for maintaining metadata, and for storing other forms of data. As independent XML documents are increasingly used in situations that require them to interact in some fashion, there is a growing need to maintain information in a standard form at a level above independent documents. Such information is necessary not only for illustrating to bundling applications the existence of relationships but also for defining the semantics of those relationships. This high-level group-related information includes:
This specification describes how a group of related files can be described as a single bundle or package, using XML as a storage format for package-related information. This specification does not dictate a particular physical bundling mechanism of the files identified in the package document, although several alternatives are recommended.
This packaging specification uses the [XLink] specification as a linking model for identifying resources and describing relationships among those resources. The data model borrows heavily from the model currently in use by the Open eBook Publication Structure version 1.0.1 [OEBPS], as well as some concepts in the IMS Content Packaging Specification [IMS] version 1.1.1.
The packaging syntax presented here in some instances mandates certain semantics, but in most cases allows expression of custom semantics through the extension of the general packaging syntax. The OEB Publication Structure 2.0 specification, for example, may attach certain semantics and behaviors a package processor must perform on relationships among XML documents and various XML namespaces, based upon the particular values expressed for those relationships.
The manifest defines all the resources contained in the package, as well as the relationships between various resources in the package. All manifest definitions appear in the <manifest>
element, which is illustrated in the following non-normative DTD fragment:
<!ELEMENT manifest (item+, reference*, namespace*, relationship*)>
<!ATTLIST manifest
%CommonAttributes;
xmlns:xlink CDATA #FIXED "http://www.w3.org/1999/xlink"
xlink:type (extended) #FIXED "extended"
xlink:role CDATA #IMPLIED
xlink:title CDATA #IMPLIED
>
Manifest elements implicitly declare themselves to be extended XLink elements as specified by [XLink] by the use of several XLink attributes. [TBD transfer to OEB-specific section: These attributes are defined in the package DTD as taking on fixed or implied values, so an OEBPS 1.x package need not modify its <manifest>
element to become an OEBPS 2.0 <manifest>
element.] [TBD: The one exception is that each <manifest>
element must include the definition of the XLink namespace, to compensate for non-validating reading systems. The examples in this specification do not include that declaration out of a need for brevity; this does not imply that the declaration is optional.]
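For reference, a manifest that writes out the XLink namespace declaration might look like the following non-normative sketch. The namespace URI is the one fixed in the DTD fragment above; the single item is included only for illustration.

```xml
<!--non-normative sketch: the XLink namespace declaration written out explicitly-->
<manifest xmlns:xlink="http://www.w3.org/1999/xlink">
  <item id="package" media-type="text/xml" xlink:href="package.xml" xlink:label="packageLabel"/>
</manifest>
```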
One purpose of the manifest is to provide a definitive list of all resources used by the package, such as documents and images. These resources may be located locally or remotely, and these locations may be represented by relative or absolute links. Although this packaging specification mandates no package archive format for the resources listed in the manifest, or even whether such bundling would ever occur, the presence of a resource in the manifest in an <item>
element implies that it should be included in any package archive bundling if it were to take place. Manifest items are therefore included by value in the package.
Each resource is located using an <item>
element that specifies the resource's location. The <item>
element must specify the media type of the resource as defined by [TBD: RFC XXXX] through the use of the media-type
attribute. The <item>
element may also specify a unique ID for the resource by using the id
attribute so that other sections of the package may refer to the resource without ambiguities.
Each manifest item must indicate a named separable resource such as a file. A manifest item may not reference a segment of a resource, such as an XML fragment, that cannot be independently included in the package archive should any physical bundling occur.
The <item>
element is illustrated by the following non-normative DTD fragment:
<!ELEMENT item EMPTY>
<!ATTLIST item
id ID #REQUIRED
xlink:type (locator) #FIXED "locator"
xlink:href CDATA #REQUIRED
xlink:role CDATA #IMPLIED
xlink:title CDATA #IMPLIED
xlink:label NMTOKEN #IMPLIED
media-type CDATA #REQUIRED
>
As with the <manifest>
element, several XLink-related attributes are defined as implied in the package DTD. Specifically, the <item>
element is defined to be an XLink locator that can optionally accept a title and a label. A non-normative example of identifying several resources by using <item>
elements is shown below:
<manifest>
<!--items-->
<item id="package" media-type="text/xml" xlink:href="package.xml" xlink:label="packageLabel"/>
<item id="nav-md" media-type="text/xml" xlink:href="navMain.ncf"/>
<item id="bookrightsID" media-type="text/xml" xlink:href="bookrights.xml" xlink:label="bookrightsLabel"/>
<item id="prefaceID" media-type="text/html" xlink:href="preface.html" xlink:label="document"/>
<item id="chapter1ID" media-type="text/html" xlink:href="chapter1.html" xlink:label="document"/>
<item id="image1gifID" media-type="image/gif" xlink:href="image1.gif" xlink:label="image1gifLabel"/>
<item id="image1pngID" media-type="image/png" xlink:href="image1.png" xlink:label="image1pngLabel"/>
<item id="stylesheet1ID" media-type="text/css" xlink:href="stylesheet1.css" xlink:label="stylesheet"/>
<item id="stylesheet2ID" media-type="text/css" xlink:href="stylesheet2.css" xlink:label="stylesheet"/>
</manifest>
This example includes files of several types: document files of type text/html, image files of both type image/gif and image/png, stylesheets of type text/css, as well as other files of type text/xml, the purposes of which are not immediately evident. That is, while "navMain.ncf" and "bookrights.xml" are both specified to have the same media type of "text/xml", these resources have different uses in the package: the former provides navigation while the latter specifies certain metadata. Such an assignment of purpose, however, is a level of semantics not specified in the <item> element. The <item> element only declares that a particular resource is included in the package.
Several elements declare an optional label as defined by [XLink]. Such labels may be used for linking when specifying relationships. The use of xlink:label
will be explained in the [TBD: link:] relationship section.
Note that this manifest identifies the package document file itself as being included in the package. As the package document not only defines the package but would also be required for any package processor to understand a package archive, the package document is required to be referenced from an <item>
element in the manifest. The presence of the package document as an item in the manifest also allows resources such as metadata to be associated with the package as a whole.
[TBD move to OEB-specific section: There are two main differences between the OEBPS 1.x <item>
element and the XLink-based OEBPS 2.0 <item>
element. The first is that the reference to each item has changed from the href
attribute to the xlink:href
attribute in the XLink namespace. Secondly, the fallback
attribute is no longer a part of the <item>
element; fallbacks are now declared in a consistent manner with other relations and are explained in the [TBD: link:] relationship section.]
[TBD: This section is new and has not been reviewed.]
The manifest may also declare resources referenced by the package but that do not necessarily need to be included in the package archive, if any. Such a reference therefore constitutes an inclusion by reference of a particular resource. A package may, for instance, indicate several relationships to the web location http://www.w3.org
without implying that the World Wide Web Consortium's web site should be bundled with any package archive.
The difference between manifest items and manifest references is similar to the difference between the concepts of composite aggregation and shared aggregation, respectively, in the [UML]. All manifest items must be included in a package and the package has sole responsibility for the disposition of the manifest items — the strong form of aggregation. A package only maintains weak ownership over manifest references, and the targets of references live independently of the package.
References are defined by the <reference>
element, as illustrated by the following non-normative DTD fragment:
<!ELEMENT reference EMPTY>
<!ATTLIST reference
id ID #REQUIRED
xlink:type (locator) #FIXED "locator"
xlink:href CDATA #REQUIRED
xlink:role CDATA #IMPLIED
xlink:title CDATA #IMPLIED
xlink:label NMTOKEN #IMPLIED
media-type CDATA #REQUIRED
>
Attributes for the <reference>
element take on the same semantics as for the <item>
element. As references do not imply any inclusion into any physical representation of the package, the xlink:href
attribute may refer not only to whole file-based resources but also to fragments of entities using fragment identifiers and XPointer.
A package may contain <reference> elements that identify fragments for use in relationships ([link] see below) while also including, in an <item> element, the composite resource from which the fragments are drawn, although this is not required. Whether the non-fragment resource is included as an <item> element is determined by the composite or shared aggregation needs of that resource, not by the presence of fragment <reference> elements.
A non-normative example of identifying several weakly-owned resources by using <reference>
elements is shown below:
<manifest>
<!--items-->
<item id="bookrightsID" media-type="text/xml" xlink:href="bookrights.xml" xlink:label="bookrightsLabel"/>
<!--references-->
<reference id="w3c" media-type="text/html" xlink:href="http://www.w3.org/index.html"/>
<reference id="readRights" media-type="text/xml" xlink:href="bookrights.xml#read" xlink:label="readRightsLabel"/>
</manifest>
In this example, the web page http://www.w3.org/index.html
is referenced using weak ownership. A section of the bookrights.xml
document is identified for later use. The entire bookrights.xml
document is also included as an <item>
element implying strong ownership, although the bookrights.xml#read
fragment could have been referenced with or without such an <item>
element. In this configuration, the bookrights.xml
document must be included with the package archive, if any.
Besides declaring items to be components of a package, a manifest also declares relationships between those items. Such relationships might include a resource acting as a stylesheet to another item, a resource acting as metadata for another item, or a resource acting as an alternative to another item so that it can be used as a fallback. Relationships can be many-to-many, allowing one or more items to be related to one or more other items.
A relationship is declared using a <relationship>
element. Each <relationship>
element is an [XLink] arc, using the xlink:from
and xlink:to
attributes to specify an item and the item(s) related to the first item, respectively. An XLink xlink:arcrole
must be specified to indicate the semantic purpose of the relationship. The syntax of the <relationship>
element is described below using a non-normative DTD fragment:
<!ELEMENT relationship EMPTY>
<!ATTLIST relationship
xlink:type (arc) #FIXED "arc"
xlink:arcrole CDATA #REQUIRED
xlink:title CDATA #IMPLIED
xlink:show (other) #IMPLIED
xlink:actuate (other) #IMPLIED
xlink:from NMTOKEN #IMPLIED
xlink:to NMTOKEN #REQUIRED
scheme NMTOKEN #IMPLIED
support (default|optional|required) #IMPLIED
>
The <relationship>
element uses the xlink:from
and xlink:to
attributes to specify one or more items in the manifest, and one or more items that should be related to the first item(s), respectively. Each of these attributes must contain a value that matches the xlink:label attribute of one or more <item> elements. The xlink:label
attribute does not have to be unique within the manifest, allowing relationships to be one-to-one, one-to-many, many-to-one, or many-to-many.
Several XLink-related attributes are specified as "fixed" or "implied" in the DTD and therefore need not be included in the <relationship>
element. [TBD: Although XLink allows an arc to have one of several values for the xlink:show
and xlink:actuate
attributes, the manifest <relationship>
element restricts those allowed values.] The optional XLink xlink:title
attribute may be included to describe the meaning of the relationship in human-readable form.
Some types of relationships may use the scheme
attribute to give more information about how the information in the related resource is to be interpreted by the relation. Version 2.0 of the OEBPS specification, for example, may identify that metadata related to a particular XML document should be interpreted as ONIX metadata through the use of the "onix" scheme. The values of the scheme
attribute are not restricted by this specification.
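The following non-normative sketch shows how the "onix" scheme mentioned above might appear on a metadata relationship. The metadata file name, its ID, and its label are hypothetical and introduced only for illustration; the arcrole value is the provisional one used in the examples of this specification.

```xml
<manifest xmlns:xlink="http://www.w3.org/1999/xlink">
  <item id="chapter1ID" media-type="text/html" xlink:href="chapter1.html" xlink:label="chapter1Label"/>
  <!--hypothetical ONIX metadata file; the name is an assumption for illustration-->
  <item id="chapter1OnixID" media-type="text/xml" xlink:href="chapter1onix.xml" xlink:label="chapter1OnixLabel"/>
  <!--the scheme attribute indicates that the related metadata should be interpreted as ONIX-->
  <relationship xlink:from="chapter1Label" xlink:to="chapter1OnixLabel" xlink:arcrole="http://openebook.org/linkprops/metadata" scheme="onix"/>
</manifest>
```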
The level of support needed by a relationship may be specified as a hint for the benefit of an application by use of the support
attribute. The accepted levels of support are indicated by the values "default", "optional", and "required". The exact semantics of these values are governed by the particular role specified in the xlink:arcrole
attribute. The use of the support
attribute is discussed further in [TBD: link:] the namespace section.
Building on the sample manifest presented above, the following non-normative example declares several relationships among items:
<manifest>
<!--items-->
<item id="package" media-type="text/xml" xlink:href="package.xml" xlink:label="packageLabel"/>
<item id="nav-md" media-type="text/xml" xlink:href="navMain.ncf"/>
<item id="bookrightsID" media-type="text/xml" xlink:href="bookrights.xml" xlink:label="bookrightsLabel"/>
<item id="prefaceID" media-type="text/html" xlink:href="preface.html" xlink:label="document"/>
<item id="chapter1ID" media-type="text/html" xlink:href="chapter1.html" xlink:label="document"/>
<item id="image1gifID" media-type="image/gif" xlink:href="image1.gif" xlink:label="image1gifLabel"/>
<item id="image1pngID" media-type="image/png" xlink:href="image1.png" xlink:label="image1pngLabel"/>
<item id="stylesheet1ID" media-type="text/css" xlink:href="stylesheet1.css" xlink:label="stylesheet"/>
<item id="stylesheet2ID" media-type="text/css" xlink:href="stylesheet2.css" xlink:label="stylesheet"/>
<!--relationships-->
<relationship xlink:from="document" xlink:to="stylesheet" xlink:arcrole="http://openebook.org/linkprops/css" />
<relationship xlink:from="image1gifLabel" xlink:to="image1pngLabel" xlink:arcrole="http://openebook.org/linkprops/fallback" />
<relationship xlink:from="packageLabel" xlink:to="bookrightsLabel" xlink:arcrole="http://openebook.org/linkprops/metadata" scheme="xrml"/>
</manifest>
In this example, all items labeled as a "document" are shown to be related to all items labeled as a "stylesheet", and this relationship is specified as having an arcrole
of "http://openebook.org/linkprops/css". As all documents have been labeled as "document" and all stylesheets have been labeled as "stylesheet", this relationship declares that all stylesheets should be interpreted as stylesheets in relation to all documents, thus constituting a many-to-many relationship. If it were necessary that different stylesheets be associated with different documents on an individual basis, each document and each stylesheet would have been given unique labels in a manner similar to the use of the id
attribute, with multiple corresponding <relationship>
elements.
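As a non-normative sketch of such individual association, each document and stylesheet below carries its own label, and each pairing is declared by its own <relationship> element. The label names are illustrative, and the arcrole value is the provisional one used in the example above.

```xml
<manifest xmlns:xlink="http://www.w3.org/1999/xlink">
  <!--items, each with a unique label-->
  <item id="prefaceID" media-type="text/html" xlink:href="preface.html" xlink:label="prefaceLabel"/>
  <item id="chapter1ID" media-type="text/html" xlink:href="chapter1.html" xlink:label="chapter1Label"/>
  <item id="stylesheet1ID" media-type="text/css" xlink:href="stylesheet1.css" xlink:label="stylesheet1Label"/>
  <item id="stylesheet2ID" media-type="text/css" xlink:href="stylesheet2.css" xlink:label="stylesheet2Label"/>
  <!--one relationship per document and stylesheet pairing-->
  <relationship xlink:from="prefaceLabel" xlink:to="stylesheet1Label" xlink:arcrole="http://openebook.org/linkprops/css"/>
  <relationship xlink:from="chapter1Label" xlink:to="stylesheet2Label" xlink:arcrole="http://openebook.org/linkprops/css"/>
</manifest>
```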
[TBD: What about stylesheet titles? Should we break out the stylesheets into different relationships and add an xlink:title to each? Probably.]
The "image1.gif" item declares a resource of type "image/gif". This example provides an alternate image of a different media type ("image/png") so that an application could use the latter as an alternative for the former if the application did not understand the "image/gif" media type. This relationship therefore constitutes a one-to-one relationship.
The relationship to the "bookrights.xml" item is declared using an xlink:from
attribute that references the package document itself. Such a declaration indicates that the related resource (in this case, metadata) relates to the entire package as a whole. A relationship to the package must reference the package document using the xlink:from
attribute.
A relationship without an xlink:from
attribute does not indicate a relationship with the package as a separate entity, but rather indicates a separate relationship with every item in the manifest.
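A non-normative sketch of this case follows, assuming the provisional metadata arcrole and the "xrml" scheme used in the example above. Because the xlink:from attribute is omitted, the relationship is read as applying separately to every item in the manifest rather than to the package as a whole.

```xml
<manifest xmlns:xlink="http://www.w3.org/1999/xlink">
  <item id="bookrightsID" media-type="text/xml" xlink:href="bookrights.xml" xlink:label="bookrightsLabel"/>
  <item id="prefaceID" media-type="text/html" xlink:href="preface.html" xlink:label="document"/>
  <item id="chapter1ID" media-type="text/html" xlink:href="chapter1.html" xlink:label="document"/>
  <!--no xlink:from attribute: every item in the manifest is separately related to the rights metadata-->
  <relationship xlink:to="bookrightsLabel" xlink:arcrole="http://openebook.org/linkprops/metadata" scheme="xrml"/>
</manifest>
```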
[TBD: What should the stylesheet arcrole value be? Something standard?]
[TBD: Define each standard OEB arcrole.]
[TBD: Should each fallback be labeled a "fallback" or more generically, an "alternate"?]
Relationships may also be formed between items and particular namespaces. Namespace relationships only have meaning in the context of XML documents — items that have "text/xml" or "text/html" as a media type or have a media type ending with "+xml". The required support of a particular namespace may be declared for the benefit of a particular application.
Namespaces, like file-based resources, must first be declared before participating in any relationships. Namespaces take on the XLink resource type as defined by [XLink], and are represented through the use of a <namespace>
element as illustrated below in the non-normative DTD fragment:
<!ELEMENT namespace ANY>
<!ATTLIST namespace
xlink:type (resource) #FIXED "resource"
xlink:role CDATA #FIXED "http://openebook.org/linkprops/namespace"
xlink:title CDATA #IMPLIED
xlink:label NMTOKEN #IMPLIED
>
[TBD: Should we change the content model from ANY to something else for namespace URIs? Actually, the XML Schema specification should fix this.]
[TBD: Can we leave out xlink:title?]
[TBD: What value should the xlink:role attribute take on?]
A <namespace>
element for defining a namespace resource is therefore analogous to an <item>
element for defining a file-based resource. Several XLink attributes are fixed or implied in the DTD and may be omitted in the namespace declaration in the manifest. A label is given to a namespace declaration so that it may be included in a relationship, and the namespace URI must be included as text content of the <namespace>
element. This is illustrated in the non-normative example below:
<manifest>
<!--namespaces-->
<namespace xlink:label="xhtml">http://www.w3.org/1999/xhtml</namespace>
<namespace xlink:label="mathml2">http://www.w3.org/1998/Math/MathML</namespace>
<namespace xlink:label="qti">http://www.imsproject.org/xsd/ims_qti_rootv1p1</namespace>
</manifest>
Relating one or more namespaces to one or more items uses the same relationship syntax described earlier for use among other manifest items. A manifest can thereby express that a particular item uses one or more namespaces, as well as provide an indication of the level of support an application should have for the listed namespaces. In the context of namespaces, an application should interpret the values of the relationship support
attribute as follows:
A non-normative example of declaring relationships between items and namespaces is presented below:
<manifest>
<!--items-->
<item id="package" media-type="text/xml" xlink:href="package.xml" xlink:label="packageLabel"/>
<item id="nav-md" media-type="text/xml" xlink:href="navMain.ncf"/>
<item id="bookrightsID" media-type="text/xml" xlink:href="bookrights.xml" xlink:label="bookrightsLabel"/>
<item id="prefaceID" media-type="text/html" xlink:href="preface.html" xlink:label="document"/>
<item id="chapter1ID" media-type="text/html" xlink:href="chapter1.html" xlink:label="document"/>
<item id="image1gifID" media-type="image/gif" xlink:href="image1.gif" xlink:label="image1gifLabel"/>
<item id="image1pngID" media-type="image/png" xlink:href="image1.png" xlink:label="image1pngLabel"/>
<item id="stylesheet1ID" media-type="text/css" xlink:href="stylesheet1.css" xlink:label="stylesheet"/>
<item id="stylesheet2ID" media-type="text/css" xlink:href="stylesheet2.css" xlink:label="stylesheet"/>
<!--namespaces-->
<namespace xlink:label="xhtml">http://www.w3.org/1999/xhtml</namespace>
<namespace xlink:label="mathml2">http://www.w3.org/1998/Math/MathML</namespace>
<namespace xlink:label="qti">http://www.imsproject.org/xsd/ims_qti_rootv1p1</namespace>
<!--namespace relationships-->
<relationship xlink:from="document" xlink:to="xhtml" xlink:arcrole="http://openebook.org/linkprops/namespace" support="required"/>
<relationship xlink:from="document" xlink:to="qti" xlink:arcrole="http://openebook.org/linkprops/namespace" support="optional"/>
</manifest>
This example declares that all items labeled "document" (the two non-metadata XML documents in this case) should be associated with two namespaces: the namespace for XHTML, and the namespace for [TBD link] Question and Test Interoperability (QTI). In the former case, it is indicated by the "required" value that the application must understand the special semantics inherent in the XHTML vocabulary (such as those represented by the XHTML <img>
element) and have the capabilities to display the appropriate presentation styling. The QTI namespace is marked as "optional", meaning that while special understanding of the QTI vocabulary may be necessary for full rendering of QTI elements, sufficient styling may be displayed through attached stylesheets even by applications that do not recognize the special semantics of the vocabulary indicated by the QTI namespace.
Packages may physically be stored in a variety of configurations, including:
If one of the above configurations is used as a package archive, the package document file or files must be located in the root of the directory hierarchy.
More than one package may be stored in a package archive. In that case, each package must be represented by a unique package document file.
The following is a non-normative example of how the XML package might be used to define an Open eBook Publication Structure [OEBPS] version 2.0 publication. This example contains XML documents with various namespaces, images, stylesheets, metadata associated with XML documents, metadata associated with the publication, as well as guides that identify particular portions of the publication as having particular semantics to an OEBPS reading system.
Because the relationships in this example are moderately complex, it is not possible to use labels to identify groups of resources for participating in relationships. Rather, each xlink:label
attribute takes a unique value and becomes in essence a redundant id
attribute, except that the label retains its XLink and package semantics for use in relationships.
<manifest>
<!--items-->
<item id="package" media-type="text/xml" xlink:href="package.xml" xlink:label="packageLabel"/>
<item id="nav-md" media-type="text/xml" xlink:href="navMain.ncf"/>
<item id="bookrightsID" media-type="text/xml" xlink:href="bookrights.xml" xlink:label="bookrightsLabel"/>
<item id="prefaceID" media-type="text/html" xlink:href="preface.html" xlink:label="prefaceLabel"/>
<item id="chapter1ID" media-type="text/html" xlink:href="chapter1.html" xlink:label="chapter1Label"/>
<item id="image1gifID" media-type="image/gif" xlink:href="image1.gif" xlink:label="image1gifLabel"/>
<item id="image1pngID" media-type="image/png" xlink:href="image1.png" xlink:label="image1pngLabel"/>
<item id="generalStylesheetID" media-type="text/css" xlink:href="generalStylesheet.css" xlink:label="generalStylesheetLabel"/>
<item id="prefaceStylesheetID" media-type="text/css" xlink:href="prefaceStylesheet.css" xlink:label="prefaceStylesheetLabel"/>
<!--references-->
<reference id="w3c" media-type="text/html" xlink:href="http://www.w3.org/index.html"/>
<reference id="readRights" media-type="text/xml" xlink:href="bookrights.xml#read" xlink:label="readRightsLabel"/>
<!--namespaces-->
<namespace xlink:label="xhtml">http://www.w3.org/1999/xhtml</namespace>
<namespace xlink:label="mathml2">http://www.w3.org/1998/Math/MathML</namespace>
<namespace xlink:label="qti">http://www.imsproject.org/xsd/ims_qti_rootv1p1</namespace>
<!--package relationships-->
<relationship xlink:from="packageLabel" xlink:to="bookrightsLabel" xlink:arcrole="http://openebook.org/linkprops/metadata" scheme="xrml"/>
<!--item relationships-->
<relationship xlink:from="prefaceLabel" xlink:to="generalStylesheetLabel" xlink:arcrole="http://openebook.org/linkprops/css" />
<relationship xlink:from="prefaceLabel" xlink:to="prefaceStylesheetLabel" xlink:arcrole="http://openebook.org/linkprops/css" />
<relationship xlink:from="chapter1Label" xlink:to="generalStylesheetLabel" xlink:arcrole="http://openebook.org/linkprops/css" />
<relationship xlink:from="chapter1Label" xlink:to="readRightsLabel" xlink:arcrole="http://openebook.org/linkprops/metadata" scheme="xrml"/>
<relationship xlink:from="image1gifLabel" xlink:to="image1pngLabel" xlink:arcrole="http://openebook.org/linkprops/fallback" />
<!--OEB guide relationships-->
<relationship xlink:from="packageLabel" xlink:to="prefaceLabel" xlink:arcrole="http://openebook.org/linkprops/guide" scheme="preface"/>
<!--namespace relationships-->
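<relationship xlink:from="prefaceLabel" xlink:to="xhtml" xlink:arcrole="http://openebook.org/linkprops/namespace" support="required"/>
<relationship xlink:from="chapter1Label" xlink:to="xhtml" xlink:arcrole="http://openebook.org/linkprops/namespace" support="required"/>
<relationship xlink:from="prefaceLabel" xlink:to="qti" xlink:arcrole="http://openebook.org/linkprops/namespace" support="optional"/>
<relationship xlink:from="chapter1Label" xlink:to="qti" xlink:arcrole="http://openebook.org/linkprops/namespace" support="optional"/>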
</manifest>
This OEB publication lists the following files. The functions of the files shown in parentheses are not expressed in the manifest, and are included here for clarification:
package.xml (package)
navMain.ncf (navigation)
bookrights.xml (rights metadata)
preface.html, chapter1.html (documents)
image1.gif, image1.png (images)
generalStylesheet.css, prefaceStylesheet.css (stylesheets)
The entire package's rights metadata is specified by relating the package to the bookrights.xml document. A fragment of the bookrights.xml file (bookrights.xml#read) is specified as the rights used for chapter1.html.
Both preface.html and chapter1.html are associated with the stylesheet generalStylesheet.css, while preface.html is additionally associated with the stylesheet prefaceStylesheet.css.
[TBD: Define orders of priority in, for example, stylesheet association and fallbacks.]
[TBD: A package as a shared database of elements, used by other shared databases?]