- From: <noah_mendelsohn@us.ibm.com>
- Date: Mon, 27 Jun 2005 14:35:28 -0400
- To: www-tag@w3.org
- Cc: "Paul Cotton" <pcotton@microsoft.com>
Though I normally leave it to other corporate members to speak for themselves, we no longer have a Microsoft employee on the TAG, and I thought the recent announcement regarding "Office" file formats would be of interest. As I understand it, the next major revision of Microsoft Office will default to saving files in a new XML-based format that uses a zip file container for both structuring and compression.

Traditional Office files were based on a container format known informally as Docfile. Within a single Docfile were multiple logical streams; typically one such stream contained the "main" part of the file, while another might have summary properties (author, etc.), and yet others were used for OLE embeddings. So, if you have a spreadsheet inside a word processor document, that's another stream.

In the new formats, the role of the Docfile is taken over by a Zip-format container. The "main" stream at each level is an XML document, and the zip encoding provides compression. Other streams are used for images (e.g. jpegs), etc. Embeddings are apparently handled by nesting Zip files within the substreams of the outer zip. The files will apparently have default extensions of .docx, .pptx, etc. Partly because the original binary docfiles were actually quite verbose, the new formats are claimed to be much smaller in many cases.

Anyway, I mention all this because it seems to bear at least indirectly on the Binary XML discussion. At least one major vendor will be using zip'd XML to achieve compression.

FYI, an interesting video interview with one of the Office designers is available in the blog entry at [1]. Also, though I am not particularly expert in OpenOffice, my impression is that there are significant similarities between the Microsoft approach and that deployed in OpenOffice [2,3,4], which also uses XML and Zip.
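
To make the container idea concrete, here is a minimal Python sketch of the general technique as I understand it: a zip archive whose "main" stream is an XML document, with further streams for properties and images, and an embedding handled as a zip nested inside the outer zip. The part names (main.xml, props.xml, media/image1.jpg, embeddings/sheet1.zip) and the helper functions are purely hypothetical, chosen for illustration; they are not the names or layout used by the actual Office or OpenOffice formats.

```python
import io
import zipfile
import xml.etree.ElementTree as ET


def build_container(main_xml: bytes, parts: dict[str, bytes]) -> bytes:
    """Pack a main XML stream plus other named streams into one zip archive.

    The "main.xml" part name is an assumption made for this sketch.
    """
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("main.xml", main_xml)
        for name, data in parts.items():
            zf.writestr(name, data)
    return buf.getvalue()


def read_main(container: bytes) -> ET.Element:
    """Open a container and parse its main XML stream."""
    with zipfile.ZipFile(io.BytesIO(container)) as zf:
        return ET.fromstring(zf.read("main.xml"))


# An embedded "spreadsheet": itself a zip container with its own main XML.
sheet_xml = b"<sheet>" + b"<cell>42</cell>" * 200 + b"</sheet>"
embedded = build_container(sheet_xml, {})

# The outer "word processor" document, deliberately verbose XML.
doc_xml = b"<doc>" + b"<para>Hello, world.</para>" * 200 + b"</doc>"
outer = build_container(doc_xml, {
    "props.xml": b"<props><author>Example Author</author></props>",
    # Placeholder bytes standing in for an image stream, not a real JPEG.
    "media/image1.jpg": b"\xff\xd8placeholder-not-a-real-jpeg\xff\xd9",
    # The embedding is a zip nested inside a stream of the outer zip.
    "embeddings/sheet1.zip": embedded,
})

# Reading the nested embedding back out of the outer container.
with zipfile.ZipFile(io.BytesIO(outer)) as zf:
    inner_root = read_main(zf.read("embeddings/sheet1.zip"))

print(len(doc_xml), "bytes of XML ->", len(outer), "bytes for the whole container")
```

Because the XML here is highly repetitive, the whole container (all four streams) comes out smaller than the uncompressed main document alone, which is the effect that bears on the Binary XML discussion.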
Noah

[1] http://channel9.msdn.com/ShowPost.aspx?PostID=73329
[2] http://www-106.ibm.com/developerworks/xml/library/x-think15/
[3] http://www.oasis-open.org/committees/tc_home.php?wg_abbrev=office
[4] http://www.oasis-open.org/committees/download.php/12572/OpenDocument-v1.0-os.pdf

--------------------------------------
Noah Mendelsohn
IBM Corporation
One Rogers Street
Cambridge, MA 02142
1-617-693-4036
--------------------------------------
Received on Monday, 27 June 2005 18:35:39 UTC