white space in xsd:hexBinary

Dear XML Schema Working Group,

From reading the latest XML Schema spec (which is a big improvement over the previous one!), it seems that white space is not allowed inside an xsd:hexBinary value. I am going by the text here:

  http://www.w3.org/TR/xmlschema11-2/#hexBinary

"[the lexical space of] hexBinary is the same as that recognized by the regular 
  expression '([0-9a-fA-F]{2})*'."

First of all, I would like confirmation that this is the correct reading. hexBinary does have a whiteSpace facet set to collapse, but as I understand it that only strips leading and trailing white space (and reduces interior runs to a single space); it does not remove white space from inside the value.
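
To make that reading concrete, here is a minimal Python sketch of how I understand validation to work: apply whiteSpace collapse, then match the result against the regular expression quoted above. The function names are my own and purely illustrative; they are not taken from the spec or from any library.

```python
import re

# Lexical space of hexBinary as quoted from the spec.
HEX_BINARY = re.compile(r"([0-9a-fA-F]{2})*")

def collapse(value: str) -> str:
    """Illustrative whiteSpace=collapse: tab, LF and CR become spaces,
    runs of spaces shrink to one, leading/trailing spaces are removed."""
    replaced = re.sub(r"[\t\n\r]", " ", value)
    return re.sub(r" +", " ", replaced).strip(" ")

def is_valid_hex_binary(literal: str) -> bool:
    """True if the collapsed literal lies in the hexBinary lexical space."""
    return HEX_BINARY.fullmatch(collapse(literal)) is not None

print(is_valid_hex_binary("  cb24ed85\n"))  # surrounding white space only
print(is_valid_hex_binary("cb24 ed85"))     # white space inside the value
```

Under this reading the first literal is accepted and the second is rejected, because the single space that survives collapsing does not match the regular expression.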

Secondly, I would like to know the reasons for this design. A hexBinary value can be, and usually is, a very long string, so it is hard to read if it cannot be broken up a little. It is also very likely that white space will creep into such a long value by mistake as people copy and paste it from one system to another in the course of ordinary manual work.

I imagine this rule would make sense if some XML formats allowed an xsd:hexBinary value to be followed by further hexBinary values, each separated by a space.

But in RDF-driven formats that use this datatype, such as RDF/XML, Turtle, RDFa and so on, this is not the case. A literal in those formats holds exactly one hexBinary value, so there is nothing for the spaces to separate.

To give some context: the WebID Incubator Group requires users who need a global login to publish their public key in their WebID Profile (and certificates have traditionally presented keys in hex-encoded form). The profile is described here

 http://webid.info/spec
or the latest editor's draft
 https://dvcs.w3.org/hg/WebID/raw-file/tip/spec/index-respec.html#turtle

The Turtle example in the editor's draft is the following:

----------8<----------------------------------------------
@prefix cert: <http://www.w3.org/ns/auth/cert#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

<#me> a foaf:Person;
 foaf:name "Bob";
 foaf:knows <https://example.edu/p/Alois#MSc>;
 foaf:weblog <http://bob.example/blog>;
 cert:key [ a cert:RSAPublicKey;
   rdfs:label "made on 23 November 2011 on my laptop";
   cert:modulus "cb24ed85d64d794b69c701c186acc059501e856000f661c93204d8380e07191c5c8b368d2ac32a428acb970398664368dc2a867320220f755e99ca2eecdae62e8d15fb58e1b76ae59cb7ace8838394d59e7250b449176e51a494951a1c366c6217d8768d682dde78dd4d55e613f8839cf275d4c8403743e7862601f3c49a6366e12bb8f498262c3c77de19bce40b32f89ae62c3780f5b6275be337e2b3153ae2ba72a9975ae71ab724649497066b660fcf774b7543d980952d2e8586200eda4158b014e75465d91ecf93efc7ac170c11fc7246fc6ded79c37780000ac4e079f671fd4f207ad770809e0e2d7b0ef5493befe73544d8e1be3dddb52455c61391a1"^^xsd:hexBinary;
   cert:exponent 65537 ;
  ] .
----------8<----------------------------------------------

It seems quite likely, though, that people will end up putting white space in there somewhere. Should parsers reject such values immediately? And if so, why?
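
To illustrate the two behaviours I am asking about, here is a rough Python sketch of a strict decoder, which follows my reading of the spec, next to a lenient one that discards interior white space before decoding. Both function names are hypothetical, and the lenient variant is only an assumption about what a forgiving parser might do, not something the spec describes.

```python
import re

# Lexical space of hexBinary as quoted from the spec.
_HEX_BINARY = re.compile(r"([0-9a-fA-F]{2})*")
_XML_WS = " \t\n\r"

def decode_strict(literal: str) -> bytes:
    """Accept only surrounding white space; reject any inside the value."""
    trimmed = literal.strip(_XML_WS)
    if _HEX_BINARY.fullmatch(trimmed) is None:
        raise ValueError("not in the hexBinary lexical space")
    return bytes.fromhex(trimmed)

def decode_lenient(literal: str) -> bytes:
    """Hypothetical forgiving variant: drop all white space, then decode."""
    squeezed = re.sub(r"[ \t\n\r]+", "", literal)
    if _HEX_BINARY.fullmatch(squeezed) is None:
        raise ValueError("not hexadecimal even after removing white space")
    return bytes.fromhex(squeezed)

print(decode_lenient("cb24 ed85\n d64d 794b").hex())
```

With the strict variant, a modulus pasted with a line break in the middle is rejected outright; with the lenient one it decodes to the same bytes as the unbroken string.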

Social Web Architect
http://bblfish.net/

Received on Monday, 16 January 2012 17:16:06 UTC