
white space in xsd:hexBinary

From: Henry Story <henry.story@bblfish.net>
Date: Mon, 16 Jan 2012 18:15:26 +0100
Message-Id: <CEF0CFA5-7013-4BAE-88BF-03B76ED1FE68@bblfish.net>
To: www-xml-schema-comments@w3.org
Dear XML Schema Working Group,

From reading the latest XML Schema spec (which is a big improvement over the previous one!), it seems that it is not possible to put white space inside an xsd:hexBinary. The text I read says:


"[the lexical space of] hexBinary is the same as that recognized by the regular 
  expression '([0-9a-fA-F]{2})*'."

First of all, I am looking for confirmation that this is the correct reading. There is a white space collapse facet, which I suppose is meant to remove leading and trailing spaces, but not spaces inside the number.
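
To make the reading above concrete, here is a minimal Python sketch of how I understand it. The helper names (collapse_ws, is_valid_hex_binary) are my own and not from the spec, and collapse_ws only approximates the collapse rule: tabs and line breaks become spaces, runs of spaces become a single space, and leading and trailing spaces are dropped. A space inside the value survives as a single space, which the regular expression then rejects.

```python
import re

# Lexical space of hexBinary, as quoted from the spec above.
HEX_BINARY = re.compile(r"([0-9a-fA-F]{2})*")

# The XML Schema white space characters: tab, line feed, carriage return, space.
XSD_WHITESPACE_RUN = re.compile(r"[\t\n\r ]+")


def collapse_ws(value: str) -> str:
    """Approximate whiteSpace=collapse: squeeze runs to one space, then trim."""
    return XSD_WHITESPACE_RUN.sub(" ", value).strip(" ")


def is_valid_hex_binary(value: str) -> bool:
    """Check the collapsed value against the hexBinary lexical pattern."""
    return HEX_BINARY.fullmatch(collapse_ws(value)) is not None


if __name__ == "__main__":
    print(is_valid_hex_binary("  cb24ed85\n"))  # only leading/trailing white space
    print(is_valid_hex_binary("cb24 ed85"))     # white space inside the value
```

Under this reading the first call succeeds and the second fails, since the inner space is still present after collapsing.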

Secondly, I would like to know whether there are reasons for this design. After all, a hexBinary value can be, and usually is, a very long string, so it is likely to be difficult to read if it cannot be broken up a little. It is also very likely that white space will enter such a long number by mistake as people copy and paste information from one system to another in the course of normal human processing tasks.

I imagine this rule would make sense if it were possible in some XML formats to use the xsd:hexBinary datatype for a sequence of hexBinary values, each separated by a space.

But in formats that use this datatype and are RDF driven, such as RDF/XML, Turtle, RDFa and so on, this is not the case. Those formats require there to be exactly one binary value, so there is really nothing that the spaces can separate.

To put some context on this: the WebID Incubator Group requires users who need a global login to publish their public key in their WebID Profile (and certificates have traditionally done this using hex-encoded formats). The profile is described here

or the latest editor's draft

The editor's draft gives the following Turtle example:

@prefix cert: <http://www.w3.org/ns/auth/cert#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

<#me> a foaf:Person;
 foaf:name "Bob";
 foaf:knows <https://example.edu/p/Alois#MSc>;
 foaf:weblog <http://bob.example/blog>;
 cert:key [ a cert:RSAPublicKey;
   rdfs:label "made on 23 November 2011 on my laptop";
   cert:modulus "cb24ed85d64d794b69c701c186acc059501e856000f661c93204d8380e07191c5c8b368d2ac32a428acb970398664368dc2a867320220f755e99ca2eecdae62e8d15fb58e1b76ae59cb7ace8838394d59e7250b449176e51a494951a1c366c6217d8768d682dde78dd4d55e613f8839cf275d4c8403743e7862601f3c49a6366e12bb8f498262c3c77de19bce40b32f89ae62c3780f5b6275be337e2b3153ae2ba72a9975ae71ab724649497066b660fcf774b7543d980952d2e8586200eda4158b014e75465d91ecf93efc7ac170c11fc7246fc6ded79c37780000ac4e079f671fd4f207ad770809e0e2d7b0ef5493befe73544d8e1be3dddb52455c61391a1"^^xsd:hexBinary;
   cert:exponent 65537 ;
  ] .

But it seems quite likely that people will end up putting white space in there somewhere. Should parsers reject such values immediately? And if so, why?
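
To illustrate the two behaviours I am asking about, here is a rough Python sketch. It is only an illustration under my own assumptions: the function names parse_strict and parse_lenient are hypothetical, and the lenient variant is not something the spec describes. The strict parser rejects any white space left inside the value after collapsing; the lenient one removes all white space before decoding.

```python
import re

# Tab, line feed, carriage return and space, the XML Schema white space set.
_WS_RUN = re.compile(r"[\t\n\r ]+")
# Pairs of hex digits, per the hexBinary lexical space quoted earlier.
_HEX_PAIRS = re.compile(r"(?:[0-9a-fA-F]{2})*")


def parse_strict(lexical: str) -> bytes:
    """Collapse white space, then reject anything outside the lexical space."""
    collapsed = _WS_RUN.sub(" ", lexical).strip(" ")
    if _HEX_PAIRS.fullmatch(collapsed) is None:
        raise ValueError("not a valid xsd:hexBinary lexical form")
    return bytes.fromhex(collapsed)


def parse_lenient(lexical: str) -> bytes:
    """Hypothetical tolerant variant: drop all white space before validating."""
    stripped = _WS_RUN.sub("", lexical)
    if _HEX_PAIRS.fullmatch(stripped) is None:
        raise ValueError("not hexadecimal even after removing white space")
    return bytes.fromhex(stripped)


if __name__ == "__main__":
    pasted = "cb24ed85 d64d794b\n 69c701c1"  # as it might look after copy and paste
    print(parse_lenient(pasted).hex())
    try:
        parse_strict(pasted)
    except ValueError as err:
        print("strict parser rejects it:", err)
```

With the lenient variant the pasted value decodes to the same bytes as the unbroken string, while the strict variant raises an error.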

Social Web Architect
Received on Monday, 16 January 2012 17:16:06 UTC