- From: <bugzilla@wiggum.w3.org>
- Date: Tue, 17 May 2005 19:53:10 +0000
- To: public-qt-comments@w3.org
- Cc:
http://www.w3.org/Bugs/Public/show_bug.cgi?id=1307

scott_boag@us.ibm.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         AssignedTo|chamberl@almaden.ibm.com    |scott_boag@us.ibm.com
             Status|ASSIGNED                    |NEW

------- Additional Comments From scott_boag@us.ibm.com  2005-05-17 19:53 -------

I can't see any technical issue for the grammar in doing uniform line ending normalization, except the need for pre-processing of the VersionDecl in the case of XQuery, which has to be done in any event. One has to assume the encoding is known for XPath, so that isn't an issue. Since the normalization occurs essentially out-of-band to the syntax parsing process, I don't think there has to be any effect on the rest of the document. Of course, a real-world parser would not do two passes... it's just cleaner to specify it this way.

I suggest a new section immediately above the section on whitespace, where we pretty much do the same as the XML specifications. I'm not very happy with how the XML 1.0 vs. XML 1.1 wording is done in the first paragraph... this would be easier if we had a proper XML 1.1 named feature, or the like. Any suggestions on ways to better handle this would be much appreciated.

=========

A.2.2 End-of-Line Handling

The [XPath/XQuery] processor MUST behave as if it normalized all line breaks on input, before parsing. The normalization should be done according to the choice to support [XML 1.0] or [XML 1.1] lexical processing.

A.2.2.1 XML 1.0 End-of-Line Handling

For [XML 1.0] processing, all of the following MUST be translated to a single #xA character:

1. the two-character sequence #xD #xA
2. any #xD character that is not immediately followed by #xA.

A.2.2.2 XML 1.1 End-of-Line Handling

For [XML 1.1] processing, all of the following MUST be translated to a single #xA character:

1. the two-character sequence #xD #xA
2. the two-character sequence #xD #x85
3. the single character #x85
4. the single character #x2028
5. any #xD character that is not immediately followed by #xA or #x85.

(XQuery-only) The characters #x85 and #x2028 cannot be reliably recognized and translated until the VersionDecl declaration (if present) has been read.

===========
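
For illustration only, the two rule sets proposed above could be sketched in Python roughly as follows. The function name `normalize_eol` and its `xml_version` parameter are hypothetical and not part of the proposal, and the sketch assumes the input is already decoded and that the XML 1.0 vs. XML 1.1 choice is already known (for XQuery, that any VersionDecl has already been read). It does the translation in one regular-expression pass, which is an implementation choice; the proposal only requires that the processor behave as if the input had been normalized.

```python
import re

# XML 1.0 rules: the pair #xD #xA, or a lone #xD, becomes #xA.
_XML10_EOL = re.compile("\r\n?")

# XML 1.1 rules: #xD #xA, #xD #x85, a lone #xD, #x85, or #x2028 becomes #xA.
# Alternation is tried left to right, so a #xD consumes a following
# #xA or #x85 before the single-character alternatives are considered.
_XML11_EOL = re.compile("\r[\n\x85]?|[\x85\u2028]")


def normalize_eol(text: str, xml_version: str = "1.0") -> str:
    """Return text with line breaks normalized to #xA.

    Hypothetical helper: xml_version selects which rule set applies
    and must be "1.0" or "1.1".
    """
    if xml_version == "1.0":
        return _XML10_EOL.sub("\n", text)
    if xml_version == "1.1":
        return _XML11_EOL.sub("\n", text)
    raise ValueError(f"unsupported XML version: {xml_version!r}")
```

Under the XML 1.0 rules #x85 and #x2028 pass through unchanged, which is why the version has to be settled before those two characters can be translated.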
Received on Tuesday, 17 May 2005 19:57:31 UTC