RE: Restrict <CRLF> to the value \r\n ... the instance is <CRLF>\r\n</CRLF> ... error on validation -- why?

From: Peter Geraghty <Peter.Geraghty@tracegroup.com>
Date: Wed, 24 Oct 2012 15:03:35 +0100
Message-ID: <F650D4E37270CF489B55D9417F768126019E8236@PLC-EXCH-SRV.tracegroup.com>
To: <xmlschema-dev@w3.org>
With this example you also have to consider the XML processor's handling
of end-of-line characters in the instance document.

 

http://www.w3.org/TR/REC-xml/#sec-line-ends 
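
A minimal Python sketch of the line-end point, assuming the standard-library xml.etree.ElementTree parser (built on expat) as a stand-in for whatever parser feeds the validator. The helper name crlf_text is made up for illustration. A literal CR LF pair typed into the instance reaches the application as a single LF, whereas the character references &#13;&#10; survive as CR LF.

```python
import xml.etree.ElementTree as ET


def crlf_text(xml_bytes):
    """Return the text content of the first <CRLF> child of the root."""
    return ET.fromstring(xml_bytes).find("CRLF").text


# Literal carriage return + line feed typed directly into the document.
literal = b"<Test><CRLF>\r\n</CRLF></Test>"

# The same two characters written as numeric character references.
referenced = b"<Test><CRLF>&#13;&#10;</CRLF></Test>"

literal_text = crlf_text(literal)        # line-end handling applies here
referenced_text = crlf_text(referenced)  # character references are kept as written

print(repr(literal_text))
print(repr(referenced_text))
```

So an instance that relies on literal line breaks could never match a pattern requiring CR followed by LF; only the character-reference form can.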

 

Pete Geraghty

 

From: Sandy Gao [mailto:sandygao@ca.ibm.com] 
Sent: 24 October 2012 14:53
To: xmlschema-dev@w3.org
Subject: Re: Restrict <CRLF> to the value \r\n ... the instance is
<CRLF>\r\n</CRLF> ... error on validation -- why?

 

> So it's XML attribute value normalization that's to blame.

Yes. See section 3.3.3 in the XML spec, especially the example at the
end of that section.

http://www.w3.org/TR/REC-xml/#AVNormalize

Thanks,
Sandy Gao
IBM Canada
(1-905) 413-3255 T/L 313-3255
sandygao@ca.ibm.com

From: Michael Kay <mike@saxonica.com>
To: xmlschema-dev@w3.org, 
Date: 10/24/2012 09:36 AM
Subject: Re: Restrict <CRLF> to the value \r\n ... the instance is
<CRLF>\r\n</CRLF> ... error on validation -- why?

________________________________




The Xerces parser is reporting the value of the "value" attribute to
Saxon as two spaces. (The debugger also shows a private field indicating
that the unnormalized value of the attribute is "&CR;&LF;".)

So it's XML attribute value normalization that's to blame.

If you wrote value="&#13;&#10;" then the value would not be
normalized; I'm not sure why that isn't true if you use named entity
references, but I'm sure someone has studied the small print.
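
The difference can be reproduced with a short Python sketch, assuming the standard-library xml.etree.ElementTree parser (expat) behaves like Xerces here. The element name pattern and the attribute names named and numeric are hypothetical stand-ins for the schema's xs:pattern element. Character references are expanded when the entity is declared, so the entity's replacement text holds a literal CR or LF, and attribute value normalization turns that into a space when the entity is referenced; a character reference written directly in the attribute is appended as-is.

```python
import xml.etree.ElementTree as ET

# Same entity declarations as in CRLF.xsd, attached to a one-element
# document so that both spellings can be compared side by side.
DOC = """<!DOCTYPE pattern [
<!ENTITY CR "&#13;">
<!ENTITY LF "&#10;">
]>
<pattern named="&CR;&LF;" numeric="&#13;&#10;"/>"""


def attribute_values(xml_text):
    """Return the (named, numeric) attribute values as the parser reports them."""
    root = ET.fromstring(xml_text)
    return root.get("named"), root.get("numeric")


named, numeric = attribute_values(DOC)

print(repr(named))    # value built from the named entity references
print(repr(numeric))  # value built from the direct character references
```

With this parser the named-entity form comes back as two spaces, matching what Xerces reports to Saxon, and the direct character references come back as CR LF.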

Michael Kay
Saxonica


On 24/10/2012 13:55, Costello, Roger L. wrote:
> Hello Folks,
>
> The following schema declares two ENTITIES, one for carriage return
> and one for line feed. And then it restricts the value of the <CRLF>
> element by referencing those ENTITIES.
>
> -------------------------------------------------
> CRLF.xsd
> -------------------------------------------------
> <?xml version="1.0" encoding="UTF-8"?>
> <!DOCTYPE xs:schema [
> <!ENTITY CR "&#13;">
> <!ENTITY LF "&#10;">
> ]>
> <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
>
> <xs:element name="Test">
> <xs:complexType>
> <xs:sequence>
> <xs:element name="CRLF" maxOccurs="unbounded">
> <xs:simpleType>
> <xs:restriction base="xs:string">
> <xs:pattern value="&CR;&LF;"/>
> </xs:restriction>
> </xs:simpleType>
> </xs:element>
> </xs:sequence>
> </xs:complexType>
> </xs:element>
>
> </xs:schema>
>
> In an instance document the value of <CRLF> should be a carriage
> return followed by a line feed, right?
>
> -------------------------------------------------
> CRLF.xml
> -------------------------------------------------
> <?xml version="1.0" encoding="UTF-8"?>
> <Test>
> <CRLF>&#13;&#10;</CRLF>
> </Test>
>
> However, when I validate that instance document I get this error
> message:
>
> The content "\r\n" of element <CRLF> does not match
> the required simple type. Value "\r\n" contravenes the
> pattern facet " " of the type of element CRLF.
>
> That makes no sense. The pattern facet clearly specifies "\r\n"
>
> What am I doing wrong?
>
> /Roger
>
>





Received on Wednesday, 24 October 2012 14:04:12 UTC