Re: [Serial] I18N WG last call comments

Hello, François.

François Yergeau wrote on 2004-06-14 09:28:11 PM:
>Henry Zongaro a écrit :
>>      In [1], Martin Duerst submitted the following comment on the Last 
>> Call Working Draft of XSLT 2.0 and XQuery 1.0 Serialization on behalf 
>> of the I18N Working Group.
>>>[4] This only defines serialization into bytes. In some contexts
>>>   (e.g. Databases, in-program,...), serialization into a stream
>>>   of characters is also important. The spec should specify how
>>>   this is done.
>>      The XSL and XQuery Working Groups discussed the comment.  The 
>> groups noted that there is an analogy in parsing XML documents.  XML 1.0 
>> and XML 1.1 parsed entities are defined as sequences of character code 
>> points, each in some encoding.  Though it is common practice to parse 
>> documents that have already been decoded into a sequence of characters, 
>> the XML 1.0 and XML 1.1 Recommendations do not describe the actions of 
>> an XML processor in those terms.
>>      Based on this analogy, the working groups decided that it was not 
>> appropriate for Serialization to specify normatively how to serialize 
>> a stream of characters.  The working groups did decide to add a note to 
>> Section 3 of Serialization indicating that a processor could provide an 
>> option that would permit the fourth phase of serialization (Encoding) 
>> to be skipped.
>We are not really satisfied with this resolution and would like to 
>request further clarification.  In particular, conformance when one is 
>actually serializing to characters instead of bytes is not clear at all 
>to us.  Allowing this but not normatively is very strange, one is left 
>to wonder what would be the conformance status of an implementation that 
>*only* serializes to characters (because that's all that is required in 
>a given context).
>> [1] 

     Thank you for your response.  The intent of the note was to indicate 
that an implementer might supply such a feature as an extension, because 
it is often required, but that such a feature is explicitly beyond the 
scope of the specification.  An implementer might supply anything as an 
extension, and doesn't require permission to do so - we would just like to 
mention this one as a useful extension.

     Here is the text of the note that I'm proposing:

Note: Serialization is only defined in terms of encoding the result as a 
stream of bytes. However, a processor may provide an option that allows 
the encoding phase to be skipped, so that the result of serialization is a 
stream of Unicode characters. The effect of any such option is 
implementation-defined, and a processor is not required to support such an 
option.
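
As a rough illustration of the distinction the note draws, the following Python sketch separates a character-producing step from a final encoding step. The function names (serialize_characters, encode_phase) and the sample input are hypothetical and are not taken from the Serialization draft; the sketch only shows what "skipping the encoding phase" could look like in one implementation.

```python
from xml.sax.saxutils import escape


def serialize_characters(text):
    """Hypothetical stand-in for the phases before Encoding.

    Returns a stream of Unicode characters (a Python str), not bytes.
    """
    return "<doc>" + escape(text) + "</doc>"


def encode_phase(chars, encoding="UTF-8"):
    """Hypothetical Encoding phase: map the characters to a stream of bytes."""
    return chars.encode(encoding)


# An option that skips the encoding phase would hand back `chars` as-is.
chars = serialize_characters("café & thé")

# The behaviour the specification actually defines ends with bytes.
data = encode_phase(chars)
```

Here `chars` is what a caller would receive if the encoding phase were skipped (for example, to store the result in a character column of a database), and `data` is the byte stream that serialization is defined to produce.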

     I don't believe there is a question of conformance here. 
Serialization to characters is explicitly a usage that is beyond the 
specification, and the behaviour of a processor that supplies such a 
feature is unspecified.  Similarly, many XML parsers are able to parse 
characters in addition to parsing encoded characters, but the conformance 
of such parsers is not in question in spite of the fact that this feature 
is an extension that is not described by the XML 1.0 or 1.1 
Recommendations.

     Does the I18N Working Group feel it would be better not to include 
such a note at all?


Henry Zongaro      Xalan development
IBM SWS Toronto Lab   T/L 969-6044;  Phone +1 905 413-6044

Received on Tuesday, 15 June 2004 06:11:09 UTC