RE: Restrict <CRLF> to the value \r\n ... the instance is <CRLF>\r\n</CRLF> ... error on validation -- why?

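With this example you also have to consider the XML processor's handling
of end-of-line characters in the instance document: a literal CR LF pair
typed into the instance is normalized to a single LF before the validator
sees it, whereas character references such as &#13; are not affected.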

 

http://www.w3.org/TR/REC-xml/#sec-line-ends 
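
A minimal sketch of that line-end handling, assuming Python's standard
xml.etree.ElementTree (which sits on expat) as a stand-in for whatever
parser feeds your validator. The helper name crlf_text is made up for
this illustration; the element names come from Roger's CRLF.xml.

```python
import xml.etree.ElementTree as ET


def crlf_text(instance: bytes) -> str:
    """Return the text of the first <CRLF> child, as the parser reports it."""
    root = ET.fromstring(instance)
    return root.find("CRLF").text


# Literal CR LF bytes typed directly into the instance document.
literal = b"<Test><CRLF>\r\n</CRLF></Test>"

# The same two characters written as numeric character references.
referenced = b"<Test><CRLF>&#13;&#10;</CRLF></Test>"

# The literal pair is subject to end-of-line normalization;
# the character references are not.
print(repr(crlf_text(literal)))
print(repr(crlf_text(referenced)))
```

So an instance that contains a raw CR LF can never match a pattern that
requires both characters; only the character-reference form delivers CR
followed by LF to the validator.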

 

Pete Geraghty

 

From: Sandy Gao [mailto:sandygao@ca.ibm.com] 
Sent: 24 October 2012 14:53
To: xmlschema-dev@w3.org
Subject: Re: Restrict <CRLF> to the value \r\n ... the instance is
<CRLF>\r\n</CRLF> ... error on validation -- why?

 

> So it's XML attribute value normalization that's to blame.

Yes. See section 3.3.3 in the XML spec, especially the example at the
end of that section.

http://www.w3.org/TR/REC-xml/#AVNormalize

Thanks,
Sandy Gao
IBM Canada
(1-905) 413-3255 T/L 313-3255
sandygao@ca.ibm.com <mailto:sandygao@ca.ibm.com> 

From: Michael Kay <mike@saxonica.com>
To: xmlschema-dev@w3.org
Date: 10/24/2012 09:36 AM
Subject: Re: Restrict <CRLF> to the value \r\n ... the instance is
<CRLF>\r\n</CRLF> ... error on validation -- why?

________________________________




The Xerces parser is reporting the value of the "value" attribute to
Saxon as two spaces. (The debugger also shows a private field indicating
that the unnormalized value of the attribute is "&CR;&LF;".)

So it's XML attribute value normalization that's to blame.

If you wrote value="&#13;&#10;" then the value would not be
normalized; I'm not sure why that isn't true if you use named entity
references, but I'm sure someone has studied the small print.
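
The difference can be reproduced outside Xerces. The sketch below assumes
Python's standard xml.etree.ElementTree (backed by expat) and a
stripped-down, namespace-free stand-in for CRLF.xsd; the names DOC and
pattern_value are invented for the illustration. As I read section 3.3.3
of the XML spec, the character reference inside the entity declaration is
expanded when the declaration is read, so the entity's replacement text
holds a literal CR (or LF), and a literal white space character reached
through an entity reference in an attribute value is replaced by a space.
A character reference written directly in the attribute value is appended
as the referenced character.

```python
import xml.etree.ElementTree as ET

# Simplified stand-in for CRLF.xsd: same entity declarations, one attribute.
DOC = """<!DOCTYPE schema [
<!ENTITY CR "&#13;">
<!ENTITY LF "&#10;">
]>
<schema>
  <pattern value="@VALUE@"/>
</schema>"""


def pattern_value(attr_text: str) -> str:
    """Parse DOC with attr_text as the attribute value; return what the parser reports."""
    root = ET.fromstring(DOC.replace("@VALUE@", attr_text))
    return root.find("pattern").get("value")


# Named entity references: the replacement text is a literal CR / LF,
# and attribute value normalization turns each one into a space.
via_entities = pattern_value("&CR;&LF;")

# Character references written directly in the attribute value survive as-is.
via_char_refs = pattern_value("&#13;&#10;")

print(repr(via_entities))
print(repr(via_char_refs))
```

With that in mind, writing the facet as value="&#13;&#10;", or using the
schema regex escapes value="\r\n", should sidestep the normalization.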

Michael Kay
Saxonica


On 24/10/2012 13:55, Costello, Roger L. wrote:
> Hello Folks,
>
> The following schema declares two ENTITIES, one for carriage return
> and one for line feed. And then it restricts the value of the <CRLF>
> element by referencing those ENTITIES.
>
> -------------------------------------------------
> CRLF.xsd
> -------------------------------------------------
> <?xml version="1.0" encoding="UTF-8"?>
> <!DOCTYPE xs:schema [
> <!ENTITY CR "&#13;">
> <!ENTITY LF "&#10;">
> ]>
> <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
>
> <xs:element name="Test">
> <xs:complexType>
> <xs:sequence>
> <xs:element name="CRLF" maxOccurs="unbounded">
> <xs:simpleType>
> <xs:restriction base="xs:string">
> <xs:pattern value="&CR;&LF;"/>
> </xs:restriction>
> </xs:simpleType>
> </xs:element>
> </xs:sequence>
> </xs:complexType>
> </xs:element>
>
> </xs:schema>
>
> In an instance document the value of <CRLF> should be a carriage
> return followed by a line feed, right?
>
> -------------------------------------------------
> CRLF.xml
> -------------------------------------------------
> <?xml version="1.0" encoding="UTF-8"?>
> <Test>
> <CRLF>&#13;&#10;</CRLF>
> </Test>
>
> However, when I validate that instance document I get this error
> message:
>
> The content "\r\n" of element <CRLF> does not match
> the required simple type. Value "\r\n" contravenes the
> pattern facet " " of the type of element CRLF.
>
> That makes no sense. The pattern facet clearly specifies "\r\n"
>
> What am I doing wrong?
>
> /Roger
>
>

Received on Wednesday, 24 October 2012 14:04:12 UTC