
RE: different whitespace-collapse behaviour of parsers

From: Michael Kay <mike@saxonica.com>
Date: Fri, 13 Mar 2009 08:29:16 -0000
To: "'Dieter Guthmann'" <mailing-deg@bup-nbg.de>, <xmlschema-dev@w3.org>
Message-ID: <F412FB52AF68433398AA35492DCC99D2@Sealion>
The specification is a little bit less formal than one might like:

replace
    All occurrences of #x9 (tab), #xA (line feed) and #xD (carriage return)
are replaced with #x20 (space) 
collapse
    After the processing implied by replace, contiguous sequences of #x20's
are collapsed to a single #x20, and leading and trailing #x20's are removed.


and I guess one could argue for an interpretation that says a character
can't be a "leading #x20" unless it is followed by something - but it seems
a bit far-fetched to me. I think Liquid XML Studio is out on a limb here.
But I've raised bug 6695 to propose a clarification.
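
As a minimal sketch of the two rules quoted above (Python is used here only for illustration; the function names are made up, and the second function is a guess at the alternative reading, not a description of any product's actual code):

```python
import re


def ws_replace(value: str) -> str:
    """whiteSpace=replace: turn each tab, line feed and carriage return into a space."""
    return re.sub(r"[\t\n\r]", " ", value)


def ws_collapse(value: str) -> str:
    """whiteSpace=collapse, usual reading: replace, squeeze runs of spaces, trim both ends."""
    squeezed = re.sub(r" +", " ", ws_replace(value))
    return squeezed.strip(" ")


def ws_collapse_alternative(value: str) -> str:
    """Hypothetical alternative reading: a space only counts as leading or
    trailing when some non-space character exists, so a whitespace-only
    value is left as a single space."""
    squeezed = re.sub(r" +", " ", ws_replace(value))
    return squeezed if squeezed == " " else squeezed.strip(" ")


print(repr(ws_collapse("    \n   ")))              # usual reading
print(repr(ws_collapse_alternative("    \n   ")))  # alternative reading
```

On the input from the original question, "    \n   ", the first reading gives the empty string and the second gives a single space.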

Michael Kay
http://www.saxonica.com/ 

> -----Original Message-----
> From: xmlschema-dev-request@w3.org 
> [mailto:xmlschema-dev-request@w3.org] On Behalf Of Dieter Guthmann
> Sent: 12 March 2009 16:35
> To: xmlschema-dev@w3.org
> Subject: different whitespace-collapse behaviour of parsers
> 
> Hi,
> 
> I've tested a few XML-Parsers/Validators and discovered that 
> the restriction "whiteSpace=collapse" [1] is interpreted in 
> two different ways.
> 
> "Liquid XML Studio 2009" treats a whitespace-only string 
> within a tag not as trailing/leading whitespace [1]:
> "<tag>    \n   </tag>" will be transformed into "<tag> </tag>" before
> validation (or so it seems),
> whereas the other parsers I've tested (Altova XML Spy, 
> XMLmind XML Editor, W3C XSV) will transform
> "<tag>    \n   </tag>" to "<tag></tag>"
> 
> Which behaviour is the correct one? See below for a schema example.
> 
> Best Regards,
> Dieter Guthmann
> 
> [1] 
> <http://www.w3.org/TR/2001/REC-xmlschema-2-20010502/#rf-whiteSpace>
> 
> An example follows:
> ------- begin myschema.xsd -------
> <?xml version="1.0" encoding="utf-8" ?>
> <xs:schema elementFormDefault="qualified"
> xmlns:xs="http://www.w3.org/2001/XMLSchema">
>   <xs:simpleType name="nocontent">
>     <xs:restriction base="xs:string">
>       <xs:whiteSpace value="collapse" />
>       <xs:pattern value="" />
>       <!-- The line above will not work in Liquid XML;
>            replace the value "" with "[ ]" and it will work -->
>     </xs:restriction>
>   </xs:simpleType>
>   <xs:element name="customerdatabase">
>     <xs:complexType>
>       <xs:sequence>
>         <xs:element ref="customer" />
>       </xs:sequence>
>     </xs:complexType>
>   </xs:element>
>   <xs:element name="customer">
>     <xs:complexType>
>       <xs:simpleContent>
>         <xs:extension base="nocontent">
>           <xs:attribute name="name" type="xs:string" />
>         </xs:extension>
>       </xs:simpleContent>
>     </xs:complexType>
>   </xs:element>
> </xs:schema>
> ------- end myschema.xsd -------
> 
> ------- begin myxmlfile.xml -------
> <?xml version="1.0" encoding="UTF-8"?>
> <customerdatabase 
> xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
> xsi:noNamespaceSchemaLocation="myschema.xsd">
>   <customer name="test">
>   </customer>
> </customerdatabase>
> ------- end myxmlfile.xml -------
> 
> 
> 
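
A rough way to see why the pattern facet in the quoted schema separates the two behaviours. This is only a sketch: Python's re.fullmatch stands in for XSD's implicitly anchored patterns (XSD regex syntax differs from Python's in general), the helper names are hypothetical, and the instance is assumed to use LF line endings.

```python
import re


def collapse(value: str) -> str:
    """whiteSpace=collapse as most of the tested parsers apply it."""
    return re.sub(r" +", " ", re.sub(r"[\t\n\r]", " ", value)).strip(" ")


def satisfies_pattern(pattern: str, value: str) -> bool:
    """XSD patterns must match the whole value, hence fullmatch."""
    return re.fullmatch(pattern, value) is not None


content = "\n  "  # text between <customer name="test"> and </customer>

print(satisfies_pattern("", collapse(content)))   # value collapsed to ""
print(satisfies_pattern("", " "))                 # value left as one space
print(satisfies_pattern("[ ]", " "))              # the workaround pattern
```

This prints True, False, True: the empty pattern accepts only the fully collapsed value, while a value left as a single space needs the "[ ]" pattern mentioned in the schema comment.
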
Received on Friday, 13 March 2009 08:30:11 GMT
