Re: Restrict <CRLF> to the value \r\n ... the instance is <CRLF>\r\n</CRLF> ... error on validation -- why?

The Xerces parser is reporting the value of the "value" attribute to 
Saxon as two spaces. (The debugger also shows a private field indicating 
that the unnormalized value of the attribute is "&CR;&LF;".)

So it's XML attribute value normalization that's to blame.

If you wrote value="&#13;&#10;" then the value would not be 
normalized; I'm not sure why that isn't true if you use named entity 
references, but I'm sure someone has studied the small print.
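
For what it's worth, a likely reading of that small print (XML 1.0, 
section 3.3.3, attribute-value normalization): the character references 
inside the ENTITY declarations are expanded when the declarations are 
read, so the entities stand for literal CR and LF characters, and a 
literal white space character reached through an entity reference in an 
attribute value is replaced by a space. A character reference written 
directly in the attribute is appended unchanged. As a sketch, not tested 
against this exact schema, either of the following pattern facets should 
keep the CR LF pair intact; the second relies on the \r and \n escapes 
of the XSD regular expression syntax rather than on the parser:

```xml
<!-- Option 1: character references written directly in the attribute
     are appended as-is, not turned into spaces by normalization -->
<xs:pattern value="&#13;&#10;"/>

<!-- Option 2: XSD regex escapes; the attribute holds only the four
     characters \ r \ n, so there is no white space to normalize -->
<xs:pattern value="\r\n"/>
```

With either form the DOCTYPE declaring the CR and LF entities is no 
longer needed for the pattern.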

Michael Kay
Saxonica


On 24/10/2012 13:55, Costello, Roger L. wrote:
> Hello Folks,
>
> The following schema declares two ENTITIES, one for carriage return and one for line feed. And then it restricts the value of the <CRLF> element by referencing those ENTITIES.
>
> -------------------------------------------------
>                  CRLF.xsd
> -------------------------------------------------
> <?xml version="1.0" encoding="UTF-8"?>
> <!DOCTYPE xs:schema [
> <!ENTITY CR "&#13;">
> <!ENTITY LF "&#10;">
> ]>
> <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
>
>      <xs:element name="Test">
>          <xs:complexType>
>              <xs:sequence>
>                  <xs:element name="CRLF" maxOccurs="unbounded">
>                      <xs:simpleType>
>                          <xs:restriction base="xs:string">
>                              <xs:pattern value="&CR;&LF;"/>
>                          </xs:restriction>
>                      </xs:simpleType>
>                  </xs:element>
>              </xs:sequence>
>          </xs:complexType>
>      </xs:element>
>
> </xs:schema>
>
> In an instance document the value of <CRLF> should be a carriage return followed by a line feed, right?
>
> -------------------------------------------------
>                  CRLF.xml
> -------------------------------------------------
> <?xml version="1.0" encoding="UTF-8"?>
> <Test>
>          <CRLF>&#13;&#10;</CRLF>
> </Test>
>
> However, when I validate that instance document I get this error message:
>
>      The content "\r\n" of element <CRLF> does not match
>      the required simple type. Value "\r\n" contravenes the
>      pattern facet "  " of the type of element CRLF.
>
> That makes no sense. The pattern facet clearly specifies "\r\n"
>
> What am I doing wrong?
>
> /Roger
>
>

Received on Wednesday, 24 October 2012 13:32:56 UTC