- From: Takuki Kamiya <tkamiya@us.fujitsu.com>
- Date: Tue, 13 Oct 2015 17:40:47 -0700
- To: "public-exi@w3.org" <public-exi@w3.org>
Hi,

A picture depicting the whitespace preservation rules that TTFMS currently implements when comparing the original document with the EXI-encoded document can be seen at [1].

First of all, xml:space="preserve" is respected whenever it is in effect in the document, whether the encoding is schema-informed or schema-less. This means that all whitespace is preserved.

When the current xml:space value is *not* "preserve", the following rules apply.

If it is schema-informed:
- For simple data (data between s+e, i.e. a start-tag followed by an end-tag), apply the lexical rule. We should use the whiteSpace facet for this purpose.
- For complex data (data between s+s, e+s, e+e), whitespace nodes (i.e. strings that consist solely of whitespace) are removed.

If it is schema-less:
- Simple data (data between s+e) is all preserved.
- For complex data, the rule is the same as in the schema-informed case.

We could use similar rules to define how whitespace in the input infoset is treated.

There is an issue when the encoder uses a schema-informed strict grammar and xml:space is "preserve". For example, " 123 " typed as xsd:int cannot preserve the leading and trailing whitespace when the typed datatype representation is used.

[1] https://www.w3.org/XML/EXI/wiki/File:WhiteSpace_handling_in_TTFMS.jpeg

Takuki Kamiya
Fujitsu Laboratories of America
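The comparison rules described in this message could be sketched roughly as follows. This is only an illustrative sketch, not the TTFMS implementation: the function names, the "simple"/"complex" context labels, and the default facet value are all hypothetical, and it assumes that "apply the lexical rule" means applying the XML Schema whiteSpace facet (preserve, replace, or collapse) of the element's type.

```python
XML_WS = " \t\n\r"  # the four XML whitespace characters


def is_whitespace_only(text):
    """True for a non-empty string made up solely of XML whitespace."""
    return text != "" and text.strip(XML_WS) == ""


def apply_whitespace_facet(text, facet):
    """Normalize text according to an XML Schema whiteSpace facet value."""
    if facet == "preserve":
        return text
    # "replace": every tab, line feed and carriage return becomes a space.
    replaced = "".join(" " if ch in XML_WS else ch for ch in text)
    if facet == "replace":
        return replaced
    if facet == "collapse":
        # "collapse": after replace, squeeze runs of spaces and trim both ends.
        return " ".join(tok for tok in replaced.split(" ") if tok)
    raise ValueError("unknown whiteSpace facet: " + facet)


def expected_text(text, context, xml_space_preserve, schema_informed,
                  facet="collapse"):
    """Return the text expected after the round trip, or None if the node
    is removed.

    context is "simple" (data between s+e) or "complex" (s+s, e+s, e+e).
    The default facet is an assumption made for this sketch only.
    """
    if xml_space_preserve:
        return text  # xml:space="preserve" wins in every mode
    if context == "complex":
        # Same rule for schema-informed and schema-less.
        return None if is_whitespace_only(text) else text
    if schema_informed:
        return apply_whitespace_facet(text, facet)
    return text  # schema-less simple data is preserved as is


if __name__ == "__main__":
    print(repr(expected_text(" 123 ", "simple", False, True)))
    print(repr(expected_text(" 123 ", "simple", False, False)))
    print(repr(expected_text("\n  ", "complex", False, True)))
```

Note that the sketch only models what the comparison expects. It does not model the issue raised above, where a typed representation such as xsd:int cannot carry the surrounding whitespace of " 123 " even though xml:space="preserve" is in effect.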
Received on Wednesday, 14 October 2015 00:41:28 UTC