Re: [Serial] I18N WG last call comments

Hello Henry,

[I have copied the I18N IG as well as a list on your side to reduce
the chance that this gets lost. I suggest doing that with all messages
related to last-call discussions.]

At 16:44 04/08/30 -0400, Henry Zongaro wrote:
>Hello, Martin.
>      In [1], you submitted the following comment on the Last Call Working
>Draft of Serialization on behalf of the I18N Working Group:
>[20] 6.4 HTML Output Method: Writing Character Data: "Certain characters,
>    specifically the control characters #x7F-#x9F, are legal in XML but
>    not in HTML. ... The processor may signal the error, but is not
>    required to do so.": Please change this to require the processor
>    to produce an error.
>
>      The XSL and XML Query Working Groups made the decision recorded at
>[2], but the I18N Working Group raised an objection [3] to that decision.
>      A subsequent e-mail exchange [4-9] ensued between yourself, Michael
>Kay and Andrea Vine.  The final message in the thread came from Michael
>      As the XSL and XQuery Working Groups have not heard whether the
>additional discussion satisfactorily clarified the issue for the I18N
>Working Group, we will assume that the issue has been resolved to their
>satisfaction.  If that is not the case, please advise us of any additional
>points requiring clarification.

The I18N WG (Core TF) has had a look at this issue (again!).
We have decided that we need to object to your current resolution.

In particular, we want to point out that while Michael Kay has listed
other cases where it is possible to create non-valid HTML with the
HTML serialization method (see,
all these issues are higher-level than the issue at hand. They are
all related to document structure and/or are not used by default.

Trying to address issues related to document structure would mean
that serialization would have to deal with HTML versioning info
and configuration options for such versioning, which would
considerably complicate the specification. This is not at all
the case for disallowing code points in the C1 range, which is
independent of HTML versioning and at a much more basic level.
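
To illustrate how basic such a check is, here is a minimal sketch in
Python. It is not taken from the Serialization draft; the names
SerializationError and check_html_text are hypothetical, and the range
tested is the #x7F-#x9F range quoted above. It needs no knowledge of
HTML versions or document structure.

```python
class SerializationError(ValueError):
    """Hypothetical error type used only in this sketch."""


def check_html_text(text: str) -> str:
    """Return text unchanged, or raise if it contains a code point
    in the range U+007F to U+009F (legal in XML, not in HTML)."""
    for offset, ch in enumerate(text):
        code_point = ord(ch)
        if 0x7F <= code_point <= 0x9F:
            raise SerializationError(
                f"U+{code_point:04X} at offset {offset} "
                "is not allowed in HTML output"
            )
    return text
```

The whole test is a single range comparison per character, which is
the sense in which the issue is "at a much more basic level".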

Also, as Michael mentioned, character maps can be used to circumvent
any kind of output restriction. It is much better to make the
production of clean, correct output the default (in particular
when this can be easily achieved), and have some mechanism for
circumvention, than to tolerate crappy output from the start.
The misuse of code points in the C1 range has been a long-standing
problem, and we greatly hope that XQuery and XSLT can help to
solve it rather than contribute to production of more garbage.
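
The "clean by default, explicit circumvention" arrangement can be
sketched as follows. This is only an illustration in Python under
stated assumptions: serialize_text and its character_map parameter are
hypothetical names, and the mapping is a simplified stand-in for the
character map mechanism, not its actual definition in the draft.

```python
def serialize_text(text, character_map=None):
    """Serialize text for HTML output.

    Characters listed in character_map are replaced by their mapped
    string and emitted as-is (the explicit opt-in). Any other code
    point in U+007F to U+009F raises an error (the default).
    """
    character_map = character_map or {}
    pieces = []
    for ch in text:
        if ch in character_map:
            # Author explicitly asked for this replacement: no check.
            pieces.append(character_map[ch])
        elif 0x7F <= ord(ch) <= 0x9F:
            raise ValueError(
                f"U+{ord(ch):04X} is not allowed in HTML output"
            )
        else:
            pieces.append(ch)
    return "".join(pieces)
```

Someone who really needs to pass such a character through has to say
so, for example with serialize_text(s, {"\x85": "\x85"}); everyone
else gets an error instead of silently broken output.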

Regards,     Martin.

P.S.: For the record, I would also like to point out that the I18N WG
has officially disagreed on this issue at
The following discussion has brought up some more details, but there
is no indication that the I18N WG would have changed its opinion.
I think that it is clearly inappropriate in such cases to say, as
you do above, "we will assume that the issue has been resolved to their
satisfaction". [I think doing such a thing is appropriate when you have
fully (or maybe partially) addressed our comment.]

>Henry Zongaro      Xalan development
>IBM SWS Toronto Lab   T/L 969-6044;  Phone +1 905 413-6044

Received on Tuesday, 14 September 2004 06:40:30 UTC