- From: Anli Shundi <ashundi@tibco.com>
- Date: Fri, 05 Sep 2003 17:21:06 -0400
- To: winkowski@mitre.org, xmlschema-dev@w3.org
The parser will convert line endings to #xA, but you can still specify the #xD character through a character reference such as &#xD;.
So \n would match all 'normal line endings' as normalized by the parser. If the data has an #xD, presumably that's because the author didn't want it to be normalized or considered a line ending?

Anli Shundi
TIBCO Software Inc.
www.tibco.com

> -----Original Message-----
> From: xmlschema-dev-request@w3.org
> [mailto:xmlschema-dev-request@w3.org]On Behalf Of winkowski@mitre.org
> Sent: Friday, September 05, 2003 4:28 PM
> To: xmlschema-dev@w3.org
> Subject: Re: Platform-indendent way of specifying line separator
>
> If end-of-line sequences are normalized to a single newline character
> (#xA), then I am confused by XML Schema regular expressions
> http://www.w3.org/TR/xmlschema-2/#regexs. [Section F.1.1, Character Class
> Escapes, includes regular expressions for \n, the newline character
> line-feed (#xA), and \r, the return character (#xD), as well as the
> Unicode separator Category Escapes (Z, Zs, Zl, Zp). However, why even have
> these if the end-of-line handling in XML 1.0 or 1.1 normalizes these to
> line-feed (#xA)?]
>
> Also, in http://www.unicode.org/reports/tr13/tr13-9.html, section 4
> (Recommendations), it states that "the Unicode Standard defines two
> unambiguous separator characters, Paragraph Separator (PS = 2029₁₆) and
> Line Separator (LS = 2028₁₆). In Unicode text, the PS and LS characters
> should be used wherever the desired function is unambiguous. Otherwise,
> the following specifies how to cope with an NLF [new line function] when
> converting from other character sets to Unicode, when interpreting
> characters in text, and when converting from Unicode to other character
> sets.... If you do know the exact usage of any NLF, then convert it to LS
> or PS. "
>
> So the Unicode Newline Guidelines recommend using the line separator
> (LS, #x2028), but as we have seen, XML 1.0 uses the line-feed (#xA).
>
> In reviewing XML 1.0, XML Schema, and the Unicode Newline Guidelines
> together, there seems to be a mismatch. Can someone rationalize these
> discrepancies? Does the schema regex \n indeed match an end-of-line
> sequence on all platforms?
>
> - Dan Winkowski
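
The normalization behaviour described in the reply can be checked with a small sketch. It assumes Python's standard xml.etree.ElementTree (expat-based) as the parser and Python's re module as a stand-in for the regex engine; Python's regex dialect is not the XML Schema one, so this only illustrates what the \n and \r escapes see after parsing. The element name v and the helper text_of are made up for the illustration.

```python
import re
import xml.etree.ElementTree as ET


def text_of(xml_source: str) -> str:
    """Parse a one-element document and return its character data."""
    return ET.fromstring(xml_source).text


# Literal line endings in the document are normalized by the parser.
literal_crlf = text_of("<v>one\r\ntwo</v>")  # CR LF typed directly
literal_cr = text_of("<v>one\rtwo</v>")      # bare CR typed directly

# A character reference is resolved after normalization, so #xD survives.
escaped_cr = text_of("<v>one&#xD;two</v>")

# \n matches the normalized endings; \r only matches the escaped one.
newline_pattern = re.compile(r"one\ntwo")
return_pattern = re.compile(r"one\rtwo")

print(repr(literal_crlf), bool(newline_pattern.fullmatch(literal_crlf)))
print(repr(literal_cr), bool(newline_pattern.fullmatch(literal_cr)))
print(repr(escaped_cr), bool(return_pattern.fullmatch(escaped_cr)))
```

Under these assumptions, a pattern containing \n covers line endings written on any platform, and \r is only needed for data in which the author deliberately escaped an #xD.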
Received on Friday, 5 September 2003 17:58:35 UTC