- From: Vogelheim, Daniel <daniel.vogelheim@siemens.com>
- Date: Mon, 27 Aug 2007 16:20:34 +0200
- To: "Stanley A. Klein" <sklein@cpcug.org>
- Cc: <public-exi@w3.org>
Hello Stan,

Thanks for the reply. Here are some more comments from the WG discussion of your case that I hope will be helpful.

You wrote:

> [...] No on-line negotiation would be involved, just a knowledge that the
> particular implementation is intended for that particular use case. [...]

OK. If you are willing to restrict compatibility to a particular user community, that opens some additional options:

> You already have something called a pluggable codec, that I don't
> understand.

Pluggable codecs are an extension mechanism that allows user communities with particular requirements to use EXI as a building block. The spec contains 1) a mechanism to uniquely identify such pluggable codecs, and 2) a MAY conformance statement which essentially warns users that by using their own pluggable codecs they are confined to implementations that support them.

You could absolutely implement such a custom, pre-populated string table as a custom codec. Given that much of the logic would be in any conforming implementation anyway, this should be relatively easy to do. But, as said, this would limit compatibility.

> Regarding the option of using enumerations in the schema,
> this would be difficult for my use case. The messages generally
> consist of object names and object values. The object names
> are constructed of standardized sub-strings concatenated in a
> manner similar to file naming. The actual construction of the
> object names is difficult to describe in a schema and
> leads to numerous complications, so it is best to just define
> the names as strings. The schema can be of some help regarding
> strings in element and attribute names, but not the object names.

I'm not sure I fully understand. I had assumed that you'd be able to assemble some potentially large but finite set of strings that you know are likely to occur, to pre-populate the string table. If so, it should be possible to use that same string set to specify an enumeration.
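
To make the enumeration idea concrete, here is a rough sketch (my own illustration, not something taken from the spec) of how the same string set that would pre-populate a string table could be turned into an XML Schema enumeration. The function name `enumeration_type`, the type name `ObjectName`, and the sample object names are all hypothetical; only the XML Schema vocabulary (`xs:simpleType`, `xs:restriction`, `xs:enumeration`) is standard.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"


def enumeration_type(type_name, likely_values):
    """Build a schema containing one string type that enumerates likely_values.

    Hypothetical helper: the list is meant as a hint about likely content
    for the encoder, not as an exhaustive description of valid content.
    """
    ET.register_namespace("xs", XS)
    schema = ET.Element(f"{{{XS}}}schema")
    simple = ET.SubElement(schema, f"{{{XS}}}simpleType", name=type_name)
    restriction = ET.SubElement(simple, f"{{{XS}}}restriction", base="xs:string")
    # De-duplicate and sort so the generated schema is stable across runs.
    for value in sorted(set(likely_values)):
        ET.SubElement(restriction, f"{{{XS}}}enumeration", value=value)
    return ET.tostring(schema, encoding="unicode")


if __name__ == "__main__":
    # Made-up object names, built from sub-strings in a file-naming style.
    likely = ["plant/unit1/pump/speed", "plant/unit1/pump/state"]
    print(enumeration_type("ObjectName", likely))
```

The resulting type would then be referenced from whichever element or attribute carries the object names in the schema used for encoding.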

Please note that there is no need to exhaustively describe all possible strings: EXI uses a schema as an indicator of what is likely to occur; it does not limit what can be encoded. Also note that for some use cases it may be useful to have separate schemas for encoding (describing the likely content) and validation (describing all possible content). What I'm suggesting here is to use an enumeration to inform the encoder about the likely content, without imposing the need to describe all possible (but unlikely) variants.

If the issue is an unpredictable set of usually unique strings with a high proportion of repeated sub-strings (such as the directory part of a filename) then I'm afraid neither a pre-defined string table nor an enumeration will be of much help. In that case, the reordering-plus-compression would likely do rather well. Also a change in the format where e.g. each path element receives its own XML element would likely work very well with the standard algorithm, in cases where a format change is an option.

> Regarding the option of using fragments. This would be
> useful, as long as
> the EXI document/fragment itself is a fictional construct that never
> actually exists and whose history except for string tables
> and some other
> state information can be forgotten after each message is
> processed.

Yes, that is exactly the idea. Of course, it would require some care in the implementation to enable this.

Sincerely,
Daniel

Received on Monday, 27 August 2007 14:20:57 UTC