[EXI] : XML tag handling in encoding and decoding

Hi,

The current EXI standard does not say how to handle the XML tag (the XML
declaration). When an XML file with an XML tag is encoded, different
implementations handle it differently.

Just take the following XML document:

a.xml:

<?xml version="1.0" standalone="yes" ?>

<!DOCTYPE animal SYSTEM "a.dtd" [

   <!ELEMENT animal EMPTY>

]>

<!-- This is against VC: Standalone Document Declaration in P32

 The standalone document declaration has the value "yes", there is an
 external markup declaration of attributes with default values, and the
 associated element appears in the document without specified values for
 those attributes.

-->

<animal/>

a.dtd:

<!ATTLIST animal color CDATA #FIXED "yellow">

 

This is an invalid XML document: it declares standalone="yes", yet the
external DTD supplies an attribute default that applies to an element in the
document. Note that this is an XML VALIDITY error, not a WELL-FORMEDNESS
error.

Now if an EXI user encodes this document and then decodes it, the resulting
XML document may look something like this (comment removed):

<?xml version="1.0" encoding="UTF-8"?>  --- The decoder has no way to know
the standalone status, so it may drop that attribute (the reader then assumes
the default value, "no").

<!DOCTYPE animal SYSTEM "a.dtd" [

   <!ELEMENT animal EMPTY>

]>

<animal color="yellow"/>

Now this document is perfectly valid and well formed, because standalone
defaults to "no"; hence the invalid document has become valid.
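
To make the loss concrete, here is a minimal Python sketch (not part of any
EXI implementation) that reads the XML declaration of both documents with the
standard-library expat bindings. The helper name read_xml_declaration and the
two byte-string constants are made up for illustration; the documents are the
ones shown above, with the comment and blank lines removed.

```python
import xml.parsers.expat


def read_xml_declaration(data: bytes) -> dict:
    """Return version, encoding and standalone as written in the XML declaration.

    Hypothetical helper for illustration only. expat is non-validating and
    does not fetch the external DTD, so a.dtd does not need to exist.
    """
    found = {"version": None, "encoding": None, "standalone": None}

    def on_xml_decl(version, encoding, standalone):
        found["version"] = version
        found["encoding"] = encoding
        # expat passes 1 for "yes", 0 for "no" and -1 when the clause is omitted
        found["standalone"] = {1: "yes", 0: "no"}.get(standalone)

    parser = xml.parsers.expat.ParserCreate()
    parser.XmlDeclHandler = on_xml_decl
    parser.Parse(data, True)
    return found


# The document before EXI encoding.
ORIGINAL = (
    b'<?xml version="1.0" standalone="yes" ?>\n'
    b'<!DOCTYPE animal SYSTEM "a.dtd" [\n'
    b'   <!ELEMENT animal EMPTY>\n'
    b']>\n'
    b'<animal/>'
)

# What a decoder might emit if the declaration is not carried in the stream.
DECODED = (
    b'<?xml version="1.0" encoding="UTF-8"?>\n'
    b'<!DOCTYPE animal SYSTEM "a.dtd" [\n'
    b'   <!ELEMENT animal EMPTY>\n'
    b']>\n'
    b'<animal color="yellow"/>'
)

print(read_xml_declaration(ORIGINAL))
print(read_xml_declaration(DECODED))
```

For the original document the helper reports standalone "yes"; for the
decoded one it reports None (clause omitted), which an XML processor treats
as "no".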

This defeats a BASIC requirement of EXI: as an encoder and decoder, it
should not alter the meaning of the XML content in any way.

Imagine the same scenario with validation at the network level: users, and
service providers especially, will be in a lot of trouble.

So I feel the XML tag must be handled by EXI in some way, so that the XML
content is not changed.

I have listed some other scenarios that have the same kind of effect:

1. I have a GB2312 document and I want to send it over the network. What
happens? (Some implementations may produce an EXI stream in GB2312, and some
in UTF-8 after conversion using iconv.) This again becomes an EXI
implementation issue, so we will definitely get interoperability issues.

       -- Even if the documents are equivalent in every implementation, this
still has an effect on an EXI user who simply acts as a DATABASE for storing
XML files.

       He might be using EXI to reduce storage space, but he will end up
distorting the basic meaning of a DATABASE, i.e. to store data as it is,
without altering or losing the data given for storing.

2. Consider another scenario: a new version of XML is introduced (say 2.0;
this might be possible). The validation procedures for XML 1.0 and 2.0 will
most probably not be the same. A lot of questions will arise here...
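
For scenario 1, a small Python sketch of why "equivalent" is not the same as
"unchanged": the same characters encoded as GB2312 and as UTF-8 decode to the
same text but are different byte sequences, so a store that transcodes no
longer returns the bytes it was given. The sample text is arbitrary and chosen
only for illustration.

```python
# Two Chinese characters, chosen only for illustration.
text = "\u4e2d\u6587"

# The same characters in the two encodings mentioned in scenario 1.
gb_bytes = text.encode("gb2312")
utf8_bytes = text.encode("utf-8")

# Character-level equivalence holds...
same_characters = gb_bytes.decode("gb2312") == utf8_bytes.decode("utf-8")
# ...but the stored byte sequences differ.
same_bytes = gb_bytes == utf8_bytes

print(same_characters, same_bytes, len(gb_bytes), len(utf8_bytes))
```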

So, to handle all these issues, I feel EXI needs to handle the XML tag. Is
there any reason for not handling it?

Please clarify.

 

Thanks,

murali

 

Received on Friday, 25 July 2008 06:07:24 UTC