Clarification re: token data type and token values

From: G. Ken Holman <gkholman@CraneSoftwrights.com>
Date: Sun, 14 Jun 2009 14:37:11 -0700
Message-Id: <7.0.1.0.2.20090614141921.02654cd0@CraneSoftwrights.com>
To: <xmlschema-dev@w3.org>

Hi folks,

I'm trying to better understand:

   http://www.w3.org/TR/2004/REC-xmlschema-2-20041028/#token

It seems to me that the use of the singular "token" is misleading and 
contradictory in the first sentence "token represents tokenized strings".

I think of a tokenized string as a set of tokens (plural) separated 
by singleton spaces, with each sequence of non-space characters being 
a token.  Certainly this would be in line with NMTOKENS being a list of 
space-separated NMTOKEN values that don't have spaces.

In XML 1.0 attribute normalization there are only singleton spaces, 
but where spaces are present they are separating members of a 
list.  XSLT offers string normalization to create a string that is of 
type token and happens to satisfy normalizedString ... though the 
function does not produce any normalizedString value that contains 
multiple contiguous spaces.

If "token" were an adjective, then "token string" would be "a string 
of tokens", so perhaps it is meant to be used here as an adjective 
(yet the data type "normalizedString" doesn't work well as an 
adjective in "a normalized string string").

The schema specification clearly describes the value as an atomic 
string that may contain singleton spaces.  So an application acting on 
"token" would act on the entire string value, in which it would find 
sequences of non-space characters separated by single spaces.  The 
specification clearly does not state that the token value is a list of 
tokens.  So the atomic value passed to the application may have spaces in it.
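
To make the distinction concrete, here is a minimal Python sketch of 
how I read the whiteSpace facet: "replace" for normalizedString and 
"collapse" for token.  The helper names ws_replace and ws_collapse are 
hypothetical, not taken from any specification or library, and the 
final split only illustrates what a list type such as NMTOKENS would 
expose; it is not something the token type itself does.

```python
import re

# Hypothetical helpers mimicking the whiteSpace facet values as I
# understand them from XML Schema Part 2; not an official implementation.

def ws_replace(s: str) -> str:
    """whiteSpace="replace": each tab, line feed and carriage return
    becomes one space; runs of spaces are left alone."""
    return re.sub(r"[\t\n\r]", " ", s)

def ws_collapse(s: str) -> str:
    """whiteSpace="collapse": apply replace, then reduce runs of spaces
    to a single space and trim leading and trailing spaces."""
    return re.sub(r" +", " ", ws_replace(s)).strip(" ")

raw = "  red\t\tgreen \n blue  "

normalized = ws_replace(raw)   # normalizedString-style value
token = ws_collapse(raw)       # token-style value: one atomic string
as_list = token.split(" ")     # what a list type would hand over instead

print(repr(normalized))
print(repr(token))
print(as_list)
```

The point of the sketch is that the token-style value is still a 
single string with spaces in it; only the explicit split turns it into 
separate members.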

So, what would use cases be in guiding users to use "token" in 
preference to "normalizedString"?  When are singleton spaces in a 
value passed to an application of more importance than arbitrary 
sequences of contiguous spaces in a value, when that value is an 
atomic value not a list of non-space sequences?

Is there anywhere in the W3C specifications where the definition of 
the word "token" in the token data type is distinguished from the 
definition of the word "token" in the name token and name tokens data 
types (NMTOKEN/NMTOKENS)?

Thanks for any guidance you may have on interpreting this.

. . . . . . . . . . . . . Ken

--
XQuery/XSLT/XSL-FO hands-on training - Los Angeles, USA 2009-06-08
Crane Softwrights Ltd.          http://www.CraneSoftwrights.com/x/
Training tools: Comprehensive interactive XSLT/XPath 1.0/2.0 video
Video lesson:    http://www.youtube.com/watch?v=PrNjJCh7Ppg&fmt=18
Video overview:  http://www.youtube.com/watch?v=VTiodiij6gE&fmt=18
G. Ken Holman                 mailto:gkholman@CraneSoftwrights.com
Male Cancer Awareness Nov'07  http://www.CraneSoftwrights.com/x/bc
Legal business disclaimers:  http://www.CraneSoftwrights.com/legal
Received on Sunday, 14 June 2009 21:45:25 GMT
