What can be our uniform way to access sub-content of a package via a URI?

How do we access sub-contents of a package file (e.g. ZIP) in a 
uniform way via a URI?

Four distinct user scenarios:

1)
Pointing from a user agent to a sub-content of a package, with the aim 
of loading only that sub-content to the client (minimizing traffic)
(e.g. to view only an image bundled within the package)

2)
Pointing from a user agent to a sub-content of a package, with the aim 
of loading the complete package but viewing the referenced sub-content
(e.g. to view a certain paragraph within the content XML file of an 
OpenDocument package)

3)
Having a relative reference within the sub-content of a package, which 
must be resolved using a base URI
(e.g. an RDF/XML file within an OpenDocument package uses relative URLs 
to identify the sub-contents of the package that are described via RDF. 
These have to be resolved when building an RDF graph)

4)
A relative reference within the sub-content of a package should 
reference a file outside the package
(e.g. the content.xml of an OpenDocument package references an image 
located next to the ODF document)



Three distinct solutions:

a) Using a package scheme

No standardized scheme is available, but proprietary package schemes exist:
[1] jar: scheme - Firefox understands JAR URLs and used to use them to 
support the concept of signed JavaScript.
[2,3] pack: scheme - provisional IANA scheme used for OOXML

Examples:
pack://http:,,www.foo.com,dir,my.package/a/b/graphic.png
jar:http://www.foo.com/dir/my.package!/a/b/graphic.png

Characteristics of a package URI:
- Contains an arbitrary URI that references the package
- Usable as a valid base URI to resolve relative references within a 
package (SCENARIO 3 - works, SCENARIO 4 - might work, has to be specified)
- Implementation constraint: As the package URI contains an encoded URI, 
that URI has to be resolved first by the client to get the package from 
the server via the encoded protocol.
The client has to resolve the package URL, receives the complete package 
and returns the sub-content (SCENARIO 1 - fails / SCENARIO 2 - works)
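
The implementation constraint above can be sketched in Python. This is 
only a minimal illustration, not a conforming parser: the function name 
split_package_uri is hypothetical, and for pack: it merely reverses the 
comma substitution visible in the example above, ignoring any 
percent-encoding rules the draft [3] defines.

```python
from typing import Tuple


def split_package_uri(uri: str) -> Tuple[str, str]:
    """Split a package URI into (package URL, path of the sub-content).

    Hypothetical helper, handling only the two example forms shown above.
    """
    if uri.startswith("jar:"):
        # jar:<package URL>!/<path inside the package>
        package, sep, part = uri[len("jar:"):].partition("!/")
        if not sep:
            raise ValueError("jar: URI without '!/' separator")
        return package, part
    if uri.startswith("pack://"):
        # pack://<package URL with '/' written as ','>/<path inside>
        authority, _, part = uri[len("pack://"):].partition("/")
        return authority.replace(",", "/"), part
    raise ValueError("not a known package scheme")


jar_example = "jar:http://www.foo.com/dir/my.package!/a/b/graphic.png"
pack_example = "pack://http:,,www.foo.com,dir,my.package/a/b/graphic.png"
print(split_package_uri(jar_example))
print(split_package_uri(pack_example))
```

In both cases the client first has to fetch the package URL and only 
then can it extract the sub-content locally, which is why SCENARIO 1 
fails.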


b) Using packages transparently in a URI:

A package is used like a directory in the URI path.
Packages and directories are equivalent containers.

Example:
http://www.foo.com/dir/my.package/a/b/graphic.png
file:///dir/my.package/a/b/graphic.png

Characteristics of transparent package usage in URIs:
- Would be resolved on the server
- Not commonly implemented on servers
- Can only return the sub-content (SCENARIO 1 - works / SCENARIO 2 - 
fails as no caching on client side possible)
- Usable as a valid base URI to resolve relative references within a 
package (SCENARIO 3 - works)
- The package file name is equivalent to a directory name (SCENARIO 4 - 
works)
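
As a rough illustration of SCENARIO 3 and 4 with a transparent package 
URI, standard relative reference resolution is sufficient. The sketch 
below uses Python's urljoin; the location of content.xml at the package 
root and the file name image.png are assumptions for the example only.

```python
from urllib.parse import urljoin

# Base URI of a sub-content inside the package (assumed location).
BASE = "http://www.foo.com/dir/my.package/content.xml"

# SCENARIO 3: relative reference to another sub-content of the package.
inside = urljoin(BASE, "a/b/graphic.png")

# SCENARIO 4: relative reference leaving the package, which behaves
# like leaving a directory.
outside = urljoin(BASE, "../image.png")

print(inside)
print(outside)
```

The first result stays below my.package/, the second one ends up next 
to the package in dir/.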


c) Share fragment identifier syntax among all package formats

Characteristics of fragment identifier usage for packages:

- Basic set of syntax to be extended by package media types (has to be 
adopted by them first)
- Media types might extend the set to create their own abbreviations
(e.g. in ODF documents most xml:ids are in content.xml, therefore the 
following two URLs are meant to be equivalent:
    http://server/path/file.odt#/content.xml?id
    http://server/path/file.odt#id
)
- The fragment identifier is separated from the rest of the URI prior to 
a dereference [4] (SCENARIO 1 - fails / SCENARIO 2 - works)
- NOT usable as a valid base URI to resolve relative references within a 
package (SCENARIO 3 - fails, SCENARIO 4 - fails)

Example:
http://www.foo.com/dir/my.odt#/a/b/test.html?id
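
A small Python sketch of the fragment approach, under stated 
assumptions: the function name split_package_fragment is hypothetical, 
and the fallback to content.xml only mirrors the ODF abbreviation 
described above, it is not taken from a specification.

```python
from urllib.parse import urldefrag, urljoin


def split_package_fragment(uri, default_part="content.xml"):
    """Return (package URL, sub-content path, id) for a fragment URI."""
    # The fragment is separated before the package is dereferenced [4].
    url, fragment = urldefrag(uri)
    if fragment.startswith("/"):
        # Long form: #/<path inside the package>?<id>
        part, _, element_id = fragment[1:].partition("?")
    else:
        # Abbreviated form: #<id>, assumed to point into default_part
        part, element_id = default_part, fragment
    return url, part, element_id


print(split_package_fragment("http://www.foo.com/dir/my.odt#/a/b/test.html?id"))

# Both ODF spellings lead to the same sub-content and id.
long_form = split_package_fragment("http://server/path/file.odt#/content.xml?id")
short_form = split_package_fragment("http://server/path/file.odt#id")
print(long_form == short_form)

# Used as a base URI, the fragment is dropped during resolution, so the
# reference is resolved next to the package instead of inside it.
resolved = urljoin("http://www.foo.com/dir/my.odt#/a/b/test.html?id", "graphic.png")
print(resolved)
```

The last call shows why SCENARIO 3 fails: the resolved reference no 
longer carries any information about the package.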

For the OpenDocument package format we now use solution b), with a 
new proposal to use fragment identifiers, c) [5].
The constraint is that for the resolution of relative URLs in RDF only 
solution b) should be used, to avoid the creation of URI aliases in RDF 
graphs.

Regards,
Svante

[1] http://java.sun.com/javase/6/docs/api/java/net/JarURLConnection.html
[2] http://www.IANA.org/assignments/uri-schemes/prov/pack
[3] http://tools.ietf.org/id/draft-shur-pack-uri-scheme-04.txt
[4] http://tools.ietf.org/html/rfc3986#section-3.5
[5] http://wiki.oasis-open.org/office/Change_Proposal_for_ODF_1.2_using_URL_fragment_identifiers_for_ODF_media_types

Received on Thursday, 4 December 2008 18:37:19 UTC