
RE: spaces in value fields

From: Biron,Paul V <Paul.V.Biron@kp.org>
Date: Thu, 20 Sep 2001 15:59:20 -0700
Message-Id: <8904C60CACA7D51191BC00805FEAAF43BDF3@crdc-exch-7.crdc.kp.org>
To: www-xml-schema-comments@w3.org
Cc: "Miller, Scott" <smiller@vignette.com>, "'Ashok Malhotra'" <ashokma@microsoft.com>, "Penick, Thomas" <tpenick@vignette.com>
Actually, I read the spec as saying that leading/trailing whitespace is
allowed in element/attribute values declared to be of a simple type, date in
particular (of course, leading/trailing whitespace is allowed in string as
well, but there it is not discarded prior to validation).  To see this, you
have to read both the Datatypes and Structures specs.

Datatypes Section 4.3.6 says [1]:
	whiteSpace is applicable to all atomic and list datatypes.
	For all atomic datatypes other than string (and types derived
	by restriction from it) the value of whiteSpace is collapse and
	cannot be changed by a schema author; for string the value of
	whiteSpace is preserve; for any type derived by restriction
	from string the value of whiteSpace can be any of the three legal
	values. 
By this, date (double, etc.) has a whiteSpace value of collapse.
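
As a rough illustration of the rule quoted above (my own sketch, not anything
from the spec), the legal whiteSpace values for an atomic type could be
modelled like this.  The function name and its parameters are hypothetical,
and list types are left out:

```python
def allowed_white_space(type_name: str, derived_from_string: bool = False) -> set:
    """Return the whiteSpace values permitted for an atomic datatype.

    Hypothetical helper: it only encodes the three cases in the quoted
    paragraph of Datatypes 4.3.6 and ignores list types.
    """
    if type_name == "string":
        # string itself is fixed at preserve
        return {"preserve"}
    if derived_from_string:
        # restrictions of string may use any of the three legal values
        return {"preserve", "replace", "collapse"}
    # every other atomic type (date, double, decimal, ...) is fixed at collapse
    return {"collapse"}


print(allowed_white_space("date"))
print(allowed_white_space("string"))
```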

Structures Section 3.1.4 says [2]:

	[Definition:] The normalized value of an element or attribute
	information item is an ·initial value· whose white space, if any,
	has been normalized according to the value of the whiteSpace
	facet of the simple type definition used in its ·validation·:
	preserve
		No normalization is done, the value is the normalized value
	replace
		All occurrences of #x9 (tab), #xA (line feed) and #xD
		(carriage return) are replaced with #x20 (space).
	collapse
		Subsequent to the replacements specified above under
		replace, contiguous sequences of #x20s are collapsed to a
		single #x20, and initial and/or final #x20s are deleted.
It is the normalized value (not the initial value) that is validated.

Therefore, date (double, etc.) values that have leading/trailing whitespace
are perfectly valid.
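
To make the normalization step concrete, here is a minimal Python sketch of
the three whiteSpace modes as I read them in Structures 3.1.4.  The function
name `normalize` is my own, and this is an illustration of the rule, not code
taken from Xerces or any other validator:

```python
import re


def normalize(initial_value: str, white_space: str) -> str:
    """Apply whiteSpace normalization to an initial value (illustrative only)."""
    if white_space == "preserve":
        # No normalization is done
        return initial_value
    if white_space not in ("replace", "collapse"):
        raise ValueError("unknown whiteSpace value: " + white_space)
    # replace: tab, line feed and carriage return each become a space
    replaced = re.sub(r"[\t\n\r]", " ", initial_value)
    if white_space == "replace":
        return replaced
    # collapse: squeeze runs of spaces, then drop leading/trailing spaces
    return re.sub(r" +", " ", replaced).strip(" ")


# The values from the original question, validated as collapse types
print(repr(normalize("   1.0   ", "collapse")))                   # '1.0'
print(repr(normalize(" 2000-03-31T13:20:00.000Z ", "collapse")))  # '2000-03-31T13:20:00.000Z'
```

Under this reading, the padded double and date values from the original
question normalize to exactly the unpadded forms before validation.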

pvb

References
[1] http://www.w3.org/TR/xmlschema-2/#rf-whiteSpace
[2] http://www.w3.org/TR/xmlschema-1/#section-White-Space-Normalization-during-Validation
> -----Original Message-----
> From:	Ashok Malhotra [SMTP:ashokma@microsoft.com]
> Sent:	Thursday, September 20, 2001 12:39 PM
> To:	Penick, Thomas; www-xml-schema-comments@w3.org
> Cc:	Miller, Scott
> Subject:	RE: spaces in value fields
> 
> You originally asked about decimal and date values.  If you eliminated
> the spaces in your example, these would be legal.  The general case
> needs a great deal more explanation.  Please see:
> http://www.w3.org/TR/xmlschema-1/ section 3.1.4.
> Ashok
> 
> 	-----Original Message----- 
> 	From: Penick, Thomas 
> 	Sent: Thu 9/20/2001 11:41 AM 
> 	To: Ashok Malhotra; www-xml-schema-comments@w3.org 
> 	Cc: Miller, Scott 
> 	Subject: RE: spaces in value fields
> 	
> 	
> 
> 	A blanket statement of "spaces are not allowed in simple values" would
> 	eliminate possibilities like:
> 	
> 	<stringValue>String Value 1</stringValue>
> 	
> 	
> 	Is this invalid also?
> 	
> 	
> 	Thanks,
> 	Tom
> 	
> 	
> 	
> 	
> 	-----Original Message-----
> 	From: Ashok Malhotra [mailto:ashokma@microsoft.com]
> 	Sent: Thursday, September 20, 2001 1:37 PM
> 	To: Penick, Thomas; www-xml-schema-comments@w3.org
> 	Cc: Miller, Scott
> 	Subject: RE: spaces in value fields
> 	
> 	
> 	You asked:
> 	"Are spaces valid in value fields?"
> 	I do not believe spaces are allowed in simple values.
> 	Ashok
> 	
> 	        -----Original Message-----
> 	        From: Penick, Thomas
> 	        Sent: Thu 9/20/2001 10:40 AM
> 	        To: 'www-xml-schema-comments@w3.org'
> 	        Cc: Miller, Scott
> 	        Subject: FW: spaces in value fields
> 	       
> 	       
> 	        We've encountered a problem that we believe is due to our XML
> 	        files having spaces in the value fields.  Example:
> 	        <doubleValue>1.0</doubleValue>
> 	        as opposed to:
> 	        <doubleValue>   1.0   </doubleValue>
> 	        
> 	        and
> 	        
> 	        <dateValue>2000-03-31T13:20:00.000Z</dateValue>
> 	        vs.
> 	        <dateValue> 2000-03-31T13:20:00.000Z </dateValue>
> 	        
> 	        The parser throws an exception because it thinks the data is
> 	        invalid.  For the above example it would think the data was
> 	        actually:
> 	        
> 	        .0.0
> 	        
> 	         for the doubleValue
> 	        
> 	        and
> 	        
> 	        000-03-31T13:20:00.000ZZ
> 	        
> 	        for the date value
> 	         
> 	        
> 	        Note that it dropped the first character and added one to the
> 	        end.  This is consistent.
> 	        
> 	        We have verified that we can parse data that previously caused
> 	        an exception by restarting the parse.  It just seems that after
> 	        time the spaces eventually cause an exception.
> 	        
> 	        We are using Apache Xerces 1.4.3 on Win2k.
> 	        
> 	        Are spaces valid in value fields?
> 	        
> 	        
> 	        
> 	
> 	        
> 	
> 	
Received on Thursday, 20 September 2001 20:49:39 GMT
