Re: ZIP-based packages and URI references into them ODF proposal

From: Marcos Caceres <marcosscaceres@gmail.com>
Date: Wed, 26 Nov 2008 17:37:59 +0000
Message-ID: <b21a10670811260937n3b31803fy5f16173337eece0@mail.gmail.com>
To: "Larry Masinter" <masinter@adobe.com>
Cc: "Arthur Barstow" <art.barstow@nokia.com>, "Jon Ferraiolo" <jferrai@us.ibm.com>, "Richard Cohn" <rcohn@adobe.com>, "Bill McCoy" <bmccoy@adobe.com>, "Henry.Story@Sun.COM" <Henry.Story@sun.com>, "Michael Stahl" <Michael.Stahl@sun.com>, "www-archive@w3.org" <www-archive@w3.org>, "Svante Schubert" <Svante.Schubert@sun.com>, "eduardo.gutentag@oasis-open.org" <eduardo.gutentag@oasis-open.org>, "Philippe Le Hegaret" <plh@w3.org>, "Carl Cargill" <cargill@adobe.com>, "Stephen Zilles" <szilles@adobe.com>, "www-tag@w3.org" <www-tag@w3.org>

The proposal below seems like something that could be achieved by HTTP
without requiring any new URI scheme. However, you would need some
kind of Apache or IIS module to correctly interpret ODF files.
Consider that you have:

 http://www.oasis-open.org/committees/download.php/19275/OpenDocument.odt/someFile.xml

The server would:
1. Dynamically generate OpenDocument.odt based on whatever parameters
were sent to download.php.
2. Interpret OpenDocument.odt as a resource with an OpenDocument MIME
type (for .odt, application/vnd.oasis.opendocument.text), and hence
process it with the appropriate server module or software driver.
3. Read the META-INF/manifest.xml file inside the .odt package to derive
the MIME type of the resource being served, which for "someFile.xml"
might be application/xml.
4. Return someFile.xml with the application/xml MIME type.

Hence, it should not be necessary to download the whole
OpenDocument.odt resource just to get at someFile.xml.
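
As a rough sketch of steps 3 and 4 (not an actual Apache or IIS
module), the lookup could go something like this in Python. The
function names serve_entry and build_demo_package are made up for
illustration, and the demo package is a tiny stand-in rather than a
real ODF document:

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Namespace used by ODF package manifests.
MANIFEST_NS = "urn:oasis:names:tc:opendocument:xmlns:manifest:1.0"


def serve_entry(package_bytes, entry_path):
    """Return (media_type, body) for one entry inside an ODF package."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as pkg:
        # Step 3: look the entry up in META-INF/manifest.xml.
        manifest = ET.fromstring(pkg.read("META-INF/manifest.xml"))
        media_type = None
        for entry in manifest.iter("{%s}file-entry" % MANIFEST_NS):
            if entry.get("{%s}full-path" % MANIFEST_NS) == entry_path:
                media_type = entry.get("{%s}media-type" % MANIFEST_NS)
                break
        if media_type is None:
            raise KeyError(entry_path)
        # Step 4: only this one entry is decompressed and returned.
        return media_type, pkg.read(entry_path)


def build_demo_package():
    """Build a tiny in-memory stand-in for OpenDocument.odt (demo only)."""
    manifest = (
        '<manifest:manifest xmlns:manifest="%s">'
        '<manifest:file-entry manifest:full-path="someFile.xml" '
        'manifest:media-type="application/xml"/>'
        "</manifest:manifest>" % MANIFEST_NS
    )
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as pkg:
        pkg.writestr("META-INF/manifest.xml", manifest)
        pkg.writestr("someFile.xml", "<doc/>")
    return buf.getvalue()
```

A real server would map the trailing path segment of the request URL
to entry_path and send media_type as the Content-Type header.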

Where you do need a URI scheme is when OpenDocument.odt becomes
disassociated from the HTTP server and someFile.xml makes a relative
reference to another resource inside the package. For example,
suppose someFile.xml contains:

<!doctype html>
<html>
<img src="images/dog.gif" />
</html>

In that case, images/dog.gif would need to be resolved against some
base URI for the package. The URI scheme that would provide that base
is the missing bit here, and it has not yet been standardized.
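
To illustrate the difference, here is a small sketch using Python's
standard urljoin. The base URL is the example one from earlier in this
message; the empty base stands in (as an assumption) for a package
that has been detached from its server:

```python
from urllib.parse import urljoin

# While the package is still served over HTTP, the URL of someFile.xml
# acts as the base URI for the relative reference.
base = ("http://www.oasis-open.org/committees/download.php/19275/"
        "OpenDocument.odt/someFile.xml")
resolved = urljoin(base, "images/dog.gif")

# Once the package is detached from the server there is no base URI,
# so the reference stays relative and cannot be dereferenced.
unresolved = urljoin("", "images/dog.gif")

print(resolved)
print(unresolved)
```

In the first case the reference resolves to a path under
OpenDocument.odt/ on the server; in the second there is nothing to
resolve it against.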

Kind regards,
Marcos


On Fri, Oct 31, 2008 at 3:08 PM, Larry Masinter <masinter@adobe.com> wrote:
> Excerpted requirements from independent inquiry on same topic:
>
> =========================
>
> The problem is all about identifying resources within a package from
> outside a package by a URL.
>
> In the end we would like to use two independent solutions: package
> scheme and fragment identifier.
>
> 1)  Package scheme solution:
> Specify a package scheme in the future by a standards body (perhaps
> extend JAR URL) to share among zipped document types, e.g. ODF, Widgets
> (W3C Widget group), OOXML, ZIP, etc.
> This might take some time to communicate and register at IANA.
> Advantage: Even a URL to a file within the package can be used as
> BaseURI (has to be absolute, no fragment identifier allowed).
>
> 2)  Fragment identifier solution:
> Specify for our ODF MIME types a fragment identifier similar to the one
> used by HTML, just extended for a package format. The problem: as we
> might have multiple XML files within the package with the same xml:id,
> we need in addition the path to make the ID unique.
> Advantage: Web users know this technique basically from HTML.
>
> For time reasons I would like to stick for now with the second solution:
>
> According to the URI RFC (http://www.rfc-editor.org/rfc/rfc3986.txt)
>
> URI         = scheme ":" hier-part [ "?" query ] [ "#" fragment ]
> fragment    = *( pchar / "/" / "?" )
>
> I would like to propose for ODF 1.2 the following:
>
> A fragment identifier for an OpenDocument format media type
> (odf-fragment) refers to a resource within the OpenDocument package.
>
> An odf-fragment consists of either a stream-path, a stream-path and a
> fragment identifier of a stream, or, as an abbreviation, an xml:id.
> The xml:id shall always belong to the root "/content.xml" stream.
> The fragment within the ODF fragment has to be resolved according to
> the MIME type of the stream.
>
> odf-fragment = [ "#"  stream-path  | ( stream-path "?" fragment ) | xml-id ]
> stream-path  = "/" pchar *( pchar  |  "/" )
> fragment     = *( pchar  | "/" | "?" )
> xml-id       = see W3C xml:id spec
>
> The following two URLs are considered equal:
>
> http://www.oasis-open.org/committees/download.php/19275/OpenDocument-v1.0ed2-cs1.odt#/content.xml?id1
> http://www.oasis-open.org/committees/download.php/19275/OpenDocument-v1.0ed2-cs1.odt#id1
>
> Examples:
>
> http://www.oasis-open.org/committees/download.php/19275/OpenDocument-v1.0ed2-cs1.odt#/Configurations2/accelerator/current.xml
> http://www.oasis-open.org/committees/download.php/19275/OpenDocument-v1.0ed2-cs1.odt#/styles.xml?id1
> http://www.oasis-open.org/committees/download.php/19275/OpenDocument-v1.0ed2-cs1.odt#id1
>
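
For what it's worth, the odf-fragment syntax quoted above could be
split by a consumer roughly as follows. This is only a sketch of one
reading of the proposal: the function name split_odf_fragment is
hypothetical, and it assumes that a fragment starting with "/" is a
stream-path and that anything else is a bare xml:id in /content.xml.

```python
from urllib.parse import urldefrag


def split_odf_fragment(url):
    """Return (package_url, stream_path, inner_fragment) for an ODF URL."""
    package_url, fragment = urldefrag(url)
    if not fragment:
        # No fragment at all: the URL names the package itself.
        return package_url, None, None
    if fragment.startswith("/"):
        # stream-path, optionally followed by "?" and a fragment
        # that is resolved inside that stream.
        stream_path, _, inner = fragment.partition("?")
        return package_url, stream_path, inner or None
    # A bare xml:id abbreviates "/content.xml?<id>".
    return package_url, "/content.xml", fragment
```

Under that reading, the two URLs described as equal in the quoted
proposal split to the same (package, stream, id) triple.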



-- 
Marcos Caceres
http://datadriven.com.au
Received on Wednesday, 26 November 2008 17:38:48 GMT
