Re: Comments on XBC Use Cases

Norman Walsh wrote:

> I don't know why the document states that an index requires a binary
> format. I'm pretty sure that the amount of cleverness necessary to put
> an index at the beginning of a text document is manageable. I'm also
> not convinced that "update" requires a binary format, though I concede
> that the challenges are significant.

When the OpenOffice.org team was designing their file format, they considered
putting indexes at the beginning but found it infeasible because it is non-XML.
My interpretation is that it would become a new format, backwards-compatible for
reads but not for writes, because the indexes wouldn't be updated by unaware
processors. The index cannot be trusted.
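
To make the stale-index problem concrete, here is a rough Python sketch. It is
only an illustration under my own assumptions: the byte-offset index layout and
the helper names build_index and lookup are hypothetical, not anything taken
from the OpenOffice.org format or the XBC documents. It builds an index of
start-tag offsets, lets an index-unaware edit change the text, and then follows
the old index into the edited document.

```python
import re

def build_index(xml_text):
    """Map each id attribute to the byte offset of its element's start tag.

    Hypothetical index layout, chosen only for this illustration.
    """
    data = xml_text.encode("utf-8")
    return {m.group(1).decode("utf-8"): m.start()
            for m in re.finditer(rb'<\w+ id="([^"]+)"', data)}

def lookup(data, index, key):
    """Return the bytes from the indexed offset up to the next '>'."""
    start = index[key]
    end = data.index(b">", start) + 1
    return data[start:end]

original = '<doc><p id="a">one</p><p id="b">two</p></doc>'
index = build_index(original)

# An index-unaware tool edits the text; the stored index is not updated.
edited = original.replace("one", "one hundred")

fresh = lookup(original.encode("utf-8"), index, "b")
stale = lookup(edited.encode("utf-8"), index, "b")

print(fresh)  # the start tag of the element with id "b"
print(stale)  # bytes from the middle of the edited text, not a start tag
```

Both documents are still well-formed XML, so nothing tells the reader that the
index no longer matches the content.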

The document is http://xml.openoffice.org/package.html and the explanation is:

Note: In the discussion lists, several remedies to allow on-demand reading were
suggested. Either the inclusion of indices in the XML file, or placing the
binary data at the end of the file. The former was considered to be non-XML (as
it relies on the physical layout of the XML data) and useless, since any
standard XML tool would not update the index and thus OpenOffice couldn't rely
on it. The latter can not be done with standard (e.g. SAX-based) parsers, as
they are used in the current implementation. The SAX API is necessary for the
filter pipelining mentioned in the requirements section, so forgoing SAX may
have significant drawbacks.

-- 
Devin Bayer

Received on Wednesday, 13 April 2005 00:54:43 UTC