
Issue: inconsistency of S production and treatment of line endings.

From: Amelia A Lewis <alewis@tibco.com>
Date: 16 Oct 2002 10:05:48 -0400
To: www-xml-blueberry-comments@w3.org
Message-Id: <1034777148.12728.8.camel@xerom>

In XML 1.0, the S production includes:

S := #x9 | #xA | #xD | #x20

In the discussion of handling of line endings, it is stated that #xD #xA
is normalized to #xA, as is #xD alone.

The specification does not appear to state the order of processing.
Because #xD is included in the S production, a processor might tokenize
before normalizing line endings.
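
To illustrate the point about ordering in XML 1.0, here is a minimal
Python sketch. The function names and the whitespace-splitting
"tokenizer" are hypothetical stand-ins, not anything defined by the
specification. It suggests that, because #xD is in S, splitting on S
gives the same tokens whether or not line endings were normalized first.

```python
import re

# S as quoted above: #x9 | #xA | #xD | #x20
S_XML10 = re.compile(r"[\t\n\r ]+")

def normalize_xml10(text):
    """Map #xD #xA, and any remaining lone #xD, to #xA."""
    return text.replace("\r\n", "\n").replace("\r", "\n")

def split_on_s(text):
    """Hypothetical tokenizer: split on runs of S characters."""
    return [tok for tok in S_XML10.split(text) if tok]

raw = "<a\r\nb='1'\rc='2'>"

# Tokenize first (no normalization) versus normalize first.
tokens_before = split_on_s(raw)
tokens_after = split_on_s(normalize_xml10(raw))
```

Under these assumptions both orders yield the same token list, which is
why the missing ordering requirement is harmless for whitespace between
tokens in 1.0.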

In the 1.1 draft, the S production is unchanged.  However, the handling
of line endings now includes normalization of #xD #x85, #x85, and
#x2028.

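This creates an inconsistency with XML 1.0 that needs to be addressed:
#x85 and #x2028 are not in S, so a processor that tokenizes before
normalizing line endings would not treat them as white space.
I can see three possible resolutions: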

1) Add language requiring line ending normalization before tokenization
(impose processing order requirements).  For consistency, redefine S to
remove #xD, which cannot appear after line ending normalization.

2) Change the S production to include all line ending characters before
normalization:

S := #x9 | #xA | #xD | #x20 | #x85 | #x2028

3) Do not change line endings in 1.1.
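
Resolutions 1 and 2 can be compared with a rough Python sketch. All
names here are hypothetical, and the splitting "tokenizer" is only a
stand-in for a real parser; the normalization follows the list of
sequences given above for the 1.1 draft.

```python
import re

def normalize_xml11_draft(text):
    """Map the line endings listed for the 1.1 draft to #xA."""
    # Two-character sequences first, so they collapse to a single #xA.
    for seq in ("\r\n", "\r\u0085"):
        text = text.replace(seq, "\n")
    for ch in ("\r", "\u0085", "\u2028"):
        text = text.replace(ch, "\n")
    return text

S_UNCHANGED = re.compile("[\t\n\r ]+")              # S as in the draft
S_WITHOUT_CR = re.compile("[\t\n ]+")               # resolution 1
S_EXTENDED = re.compile("[\t\n\r \u0085\u2028]+")   # resolution 2

raw = "<a\u0085b='1'\u2028c='2'>"

# Draft as written, tokenizing before normalization: #x85 and #x2028
# are not in S, so nothing is split.
draft_tokens = S_UNCHANGED.split(raw)

# Resolution 1: normalize first, then tokenize with S minus #xD.
option1_tokens = S_WITHOUT_CR.split(normalize_xml11_draft(raw))

# Resolution 2: tokenize the raw text with the extended S.
option2_tokens = S_EXTENDED.split(raw)
```

In this sketch the two resolutions agree with each other, while the
draft as written leaves the input as a single unsplit token when
tokenization happens first.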

Amy!
-- 
Amelia A. Lewis
Architect, TIBCO/Extensibility, Inc.
alewis@tibco.com
Received on Wednesday, 16 October 2002 10:06:02 GMT
