RE: [LC-2185] RE: Question about EXI Draft - XML declaration

Tamiya-san,

First, please accept my apology. Long ago, we answered the first part of
your question about EXI's treatment of the XML "version" pseudo-attribute
and promised a second response to answer your question about the
"standalone" pseudo-attribute, but never produced this second response. This
was my fault and I apologize for dropping the ball. I hope this late
response is still helpful.

The standalone document declaration has been the subject of some
long-standing discussions in the XML community. There was a particularly
good discussion by some of XML's founding fathers on XML-DEV several years
ago [1]. For background, I've included some relevant highlights from this
discussion following this message [see below]; however, the thread ends with
the conclusion "... the problem the SDD [standalone document declaration]
exists to solve will essentially never arise in real operational scenarios
anyhow." Today's widely available, commonly used XML parsers have no
difficulty processing external declarations, and it is safer to process them
than to trust the standalone claim made by the document creator. In
addition, the processing overhead required to set standalone to "yes" can
be significant (see [2]).

As such, EXI takes the simpler, safer and more efficient approach of
essentially assuming all documents have the default standalone value of
"no". If document creators don't want recipients to have to process external
declarations, they can simply avoid using them. In addition to simplifying
EXI, this avoids the compactness and processing efficiency penalties
associated with using standalone="yes".
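
To make the effect concrete, here is a minimal sketch in Python (it is not
part of the EXI specification). It uses the standard library's expat binding
to read the standalone declaration, then models a recipient that never sees
the declaration and therefore assumes "no". The function names
declared_standalone and effective_standalone are hypothetical, chosen only
for this illustration.

```python
import xml.parsers.expat


def declared_standalone(xml_text: str) -> str:
    """Return 'yes', 'no', or 'absent' for the standalone declaration."""
    seen = {"value": "absent"}

    def on_xml_decl(version, encoding, standalone):
        # expat passes 1 for "yes", 0 for "no", and -1 when the
        # standalone pseudo-attribute is omitted.
        if standalone == 1:
            seen["value"] = "yes"
        elif standalone == 0:
            seen["value"] = "no"

    parser = xml.parsers.expat.ParserCreate()
    parser.XmlDeclHandler = on_xml_decl
    parser.Parse(xml_text, True)
    return seen["value"]


def effective_standalone(declared: str) -> str:
    """Model a recipient that is never told the declared value.

    Assumption for this sketch: the recipient always falls back to the
    default value "no", whatever the original document declared.
    """
    return "no"


if __name__ == "__main__":
    samples = [
        '<?xml version="1.0" standalone="yes"?><doc/>',
        '<?xml version="1.0" standalone="no"?><doc/>',
        '<?xml version="1.0"?><doc/>',
        "<doc/>",
    ]
    for text in samples:
        declared = declared_standalone(text)
        print(declared, "->", effective_standalone(declared))
```

In this sketch every input ends up with the same effective value, so a
recipient should be prepared to process any external declarations the
document references.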

I hope this helps explain our rationale for omitting the standalone
pseudo-attribute. Please accept my apologies again for the late reply.

	Best wishes,

	John

[1] http://xml.coverpages.org/standalone19980508.html
[2] http://www.w3.org/TR/xml/#vc-check-rmd

John Schneider
AgileDelta, Inc.
http://www.agiledelta.com
 
 
-------- Relevant highlights from [1] ------------

The thread at [1] starts out with Paul Prescod asking:
 
"Is the standalone document declaration bogus and perhaps dangerous?"
 
Following some discussion, David Megginson's reply is:
 
"Yes and yes.  

The problem, I think, came from the mistaken idea that people (i.e.
desperate Perl hackers) would write custom parsers for each XML application
(like RDF), and that these people would not want to deal with seemingly
difficult problems like external entity resolution.  

In the end, as one might have predicted, there is an impressive range of
free XML processors available in several different programming
languages: someone writing an RDF tool does not need to worry about the
character and entity level of XML at all, and can work with XML easily
through a more abstract interface such as the DOM or SAX.

So, we should let the authors decide -- if an author creates a document
referencing external entities (including an external DTD subset), then the
XML parser should handle them; if the author does not want to use external
entities, then she can simply avoid referencing any."
 
After correcting a minor problem with one of Paul's examples, he continues:
 
"That said, I still agree that the standalone declaration is wrong.
Perhaps some day, if there's an XML 1.1, we can think about fixing it."

Later in the thread, Tim Bray says:
 
"Having said that, Paul did raise a valid concern about the SDD (too bad
this issue wasn't pointed out before the spec was frozen).  Having said
*that*, I think, for reasons that are on the record in the same place, that
the problem the SDD exists to solve will essentially never arise in real
operational scenarios anyhow."
 
> -----Original Message-----
> From: public-exi-comments-request@w3.org 
> [mailto:public-exi-comments-request@w3.org] On Behalf Of Taki Kamiya
> Sent: Wednesday, January 07, 2009 12:12 PM
> To: 'TAMIYA Keisuke'; public-exi-comments@w3.org
> Cc: youenn.fablet@crf.canon.fr; fujisawa.jun@canon.co.jp
> Subject: [LC-2185] RE: Question about EXI Draft - XML declaration
> 
> 
> Hi Tamiya-san,
> 
> This response attempts to address one of the two issues you 
> brought up in your comment with regard to the XML declaration.  
> Below we provide the rationale on which we based our decision 
> not to provide direct support for the XML "version" 
> pseudo-attribute in the EXI format. Another response is planned 
> to address "standalone", which we consider related to, yet 
> independent of, the XML "version" discussion described here.
> 
> The version of XML that occurs in the XML declaration is for 
> indicating the slightly different syntax rules implied by 
> each XML version (i.e. XML 1.0 vs XML 1.1 as of this writing).
> 
> EXI format is a representation of XML Information Set [1]. We 
> are aware that the Document Information Item [2] in Infoset 
> provides a "version" property that corresponds to the XML 
> version. However, the value of that property does not imply 
> different semantics that need to be captured at the Infoset level.
> These are the reasons why EXI is, and should be, agnostic 
> about the version of XML.
> 
> In some anticipated scenarios of EXI use, application 
> programs are concerned only with the infoset, with no 
> involvement of serialization in XML at any point of the 
> processing and communication chains. In such applications, the "version"
> property of the Document Information Item would not provide any benefit.
> 
> Also, in applications where serialization of infoset in XML 
> is involved in conjunction with EXI along the way of 
> computing chains, the preservation of the original XML 
> version is rarely a concern. This is because the programs 
> that consume the data are, again, more often than not 
> concerned only with the infoset, not with the subtle 
> discrepancies of the XML 1.x syntax.
> The recent publication of XML 1.0 5th edition [3] in a sense 
> has made this argument more indisputable, given that the one 
> single most outstanding discrepancy that was present between 
> the XML 1.0 and 1.1, the repertoire of characters, is now 
> essentially dissolved.
> 
> Yet, we understand that there are use cases where the use of 
> a particular version of XML is required when serializing 
> infoset into XML. On such occasions, it is the program that 
> subsequently consumes the serialized XML that calls for a 
> particular XML version. We consider XML version as the 
> artifact of XML serialization, and therefore is the function 
> of XML serializer implementations, instead of being something 
> that has to be inherited from the source XML if any that was 
> fed into the computing chain as an input.
> 
> As described above, we do not foresee critical issues to be 
> caused by not providing the placeholder field in EXI format 
> for carrying text XML version numbers. On the other hand, 
> there could be substantive cost if EXI supports XML version 
> numbers in the grammar system, because doing so would cause 
> every instance of EXI streams to grow slightly in size even 
> when the XML version value is absent. One of the major uses 
> of EXI, that is, frequent exchange of tiny documents could 
> suffer from this, because it is typical that such tiny 
> documents are designed very carefully to pinch on bits to 
> maximize efficiency. Considering those balances, we decided 
> to forgo the "version" property of the Document Information 
> Item of Infoset.
> 
> [1] http://www.w3.org/TR/xml-infoset/
> [2] http://www.w3.org/TR/xml-infoset/#infoitem.document
> [3] http://www.w3.org/TR/2008/REC-xml-20081126/
> 
> -taki
> 
> 
> -----Original Message-----
> From: public-exi-comments-request@w3.org 
> [mailto:public-exi-comments-request@w3.org] On Behalf Of 
> TAMIYA Keisuke
> Sent: Thursday, November 06, 2008 10:22 PM
> To: public-exi-comments@w3.org
> Cc: youenn.fablet@crf.canon.fr; fujisawa.jun@canon.co.jp
> Subject: Question about EXI Draft - XML declaration
> 
> 
> Dear W3C EXI WG members,
> 
> I have a question about this draft specification.
> 
> EXI does not support the XML declaration - character encoding 
> scheme, standalone, version. (ref. B.1).
> But why does not it support the XML declaration?
> I think "character encoding scheme" is not necessary, but I 
> cannot understand why the "standalone", "version" is not supported.
> 
> Regards,
> Keisuke Tamiya (tamiya.keisuke@canon.co.jp)
> 
> 
> 
> 

Received on Thursday, 22 October 2009 16:08:15 UTC