RE: More comments on XML 1.0 from Francois Yergeau on 2002-02-17 (xml-editor@w3.org from January to March 2002)

From: Francois Yergeau <FYergeau@alis.com>
Date: Sun, 17 Feb 2002 17:43:33 -0500
To: MURATA Makoto <mmurata@trl.ibm.co.jp>, xml-editor@w3.org
Message-ID: <F7D4BDA0E5A1D14B99D32C022AEB73660EB2D1@alis-2k.alis.domain>

MURATA Makoto wrote:
> - E26 in the list of errata for the second edition 
>   overshadows E3 by mistake.

Good catch, I've put it in the potential errata list, to be reviewed by the
WG.

> - E19 overshadows E7 by mistake.

It doesn't.  E19 is wrong in that the wording says "first paragraph" when
the impacted sentence is actually in the *second* paragraph.  E7 impacts the
first paragraph, therefore no collision.  I think fixing E19 will be enough,
but the WG will decide.

> - In the revision to 4.3.3 by E27, we have "any irregular code unit
>   sequences".  However, if we have code unit sequences rather than
>   byte sequences, we have already successfully interpreted the parsed
>   entity as UTF-8.  I think that it should be replaced with "any 
>   irregular byte sequences".

You have been misled by Unicode's confusing terminology.  "Code unit"
means the units used to encode "code points" (character numbers).  In the
case of UTF-8, the code unit is a byte.  Although misleading, it's good to
have "code unit" here because it exactly matches what Unicode says.
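
To make the distinction concrete, here is a minimal Python sketch that counts
code units for one code point under three encoding forms.  The helper name
code_unit_count and the choice of U+1D11E as the sample character are
illustrative assumptions, not anything taken from the spec or the errata.

```python
def code_unit_count(text: str, encoding: str, unit_bytes: int) -> int:
    """Count the code units needed to encode `text`.

    `unit_bytes` is the width of one code unit in bytes. Use BOM-free
    codec names (e.g. "utf-16-le") so no byte order mark is counted.
    """
    return len(text.encode(encoding)) // unit_bytes


clef = "\U0001D11E"  # a single code point: MUSICAL SYMBOL G CLEF

utf8_units = code_unit_count(clef, "utf-8", 1)       # code unit = 1 byte
utf16_units = code_unit_count(clef, "utf-16-le", 2)  # code unit = 2 bytes
utf32_units = code_unit_count(clef, "utf-32-le", 4)  # code unit = 4 bytes

print(len(clef), utf8_units, utf16_units, utf32_units)  # prints: 1 4 2 1
```

For UTF-8 the count of code units and the count of bytes are the same number,
which is why "code unit sequence" and "byte sequence" coincide in that case.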

> - Section 6.  "whose canonical (UCS-4) code value" is misleading.
>   One could argue that the UCS-4 code value after character
>   normalization is mentioned.

Agreed.  Now in the potential errata list.

> - Subsection 2.2.  When is "the time this document was prepared"?
>   Publication of the first or second edition, or publication of the
>   last erratum?

It means exactly what it says.  For the 2nd edition, it means when that
edition was prepared.

> - The first para of 3.2.2.  "should" is vague; "it is to behave
>   as though...." in the second para is also vague.  Are XML
>   processors allowed to ignore the default?  In particular, are
>   non-validating processors allowed to ignore default values
>   declared in internal subsets?  My understanding is as follows:
> 
>   (1)  validating processors MUST use all default values;
>   (2)  non-validating processors MUST use all default values
>        in internal DTD subsets (except for parameter entities); and
>   (3)  non-validating processors SHOULD use other default values.

Errr, yes it's a bit vague.  But what about 

"When an XML processor encounters an element without a specification for an
attribute for which it has read a default value declaration, it is to behave
as though the attribute were present with the declared default value."

?  Now in the potential errata list.
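
As a rough illustration of what "behave as though the attribute were present"
amounts to, here is a small Python sketch.  It is not a parser: the function
name effective_attributes and the two dictionaries are hypothetical stand-ins
for the attributes specified on an element and the default value declarations
the processor has actually read.

```python
def effective_attributes(specified: dict, declared_defaults: dict) -> dict:
    """Return the attributes an application should see for one element.

    `specified` holds attributes written on the element itself.
    `declared_defaults` holds only the defaults the processor has read;
    declarations it never read simply do not appear here.
    """
    result = dict(declared_defaults)  # start from the declared defaults
    result.update(specified)          # explicit specifications take precedence
    return result


# Hypothetical declarations read from a DTD subset.
defaults_read = {"align": "left", "border": "0"}

# The element specifies `id` and overrides `border`, but omits `align`.
attrs = effective_attributes({"id": "t1", "border": "1"}, defaults_read)
print(attrs)
```

The sketch deliberately models "has read": a processor that skipped a
declaration would pass a smaller declared_defaults and the result would
differ accordingly.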

> - What does "match" in the last para of 4.2.2 mean?

From 1.2: Two strings or names being compared must be identical.  4.2.2
tells us to remove extraneous spaces before checking this.
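
In code, the comparison could look something like the following Python
sketch.  The function names are made up for illustration, and I am assuming
the white space to collapse is the XML set (space, tab, CR, LF); the
normalization itself (runs of white space become a single #x20, leading and
trailing white space is dropped) is what 4.2.2 describes.

```python
import re

# Runs of XML white space characters: space, tab, carriage return, line feed.
_XML_WS_RUN = re.compile(r"[ \t\r\n]+")


def normalize_pubid(pubid: str) -> str:
    """Collapse white space runs to one space and trim both ends."""
    return _XML_WS_RUN.sub(" ", pubid).strip(" ")


def pubids_match(a: str, b: str) -> bool:
    """Two identifiers match when their normalized forms are identical."""
    return normalize_pubid(a) == normalize_pubid(b)


print(pubids_match("  -//W3C//DTD  XHTML 1.0\n Strict//EN ",
                   "-//W3C//DTD XHTML 1.0 Strict//EN"))  # prints: True
print(pubids_match("-//W3C//DTD XHTML 1.0 Strict//EN",
                   "-//w3c//dtd xhtml 1.0 strict//en"))  # prints: False
```

Note that the comparison after normalization is exact, so a difference in
case is still a mismatch.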

> - The last para of 4.3.3 still contains "octet" rather than "byte".

Arghhh!  Now in the potential errata list.

Thanks a lot for your careful reading.  I'll go over your other mails later,
I'm out of time now.

Regards,

-- 
François

Received on Sunday, 17 February 2002 17:44:58 UTC