- From: Henry Story <henry.story@bblfish.net>
- Date: Sat, 21 Jan 2012 22:06:21 +0100
- To: WebID XG <public-xg-webid@w3.org>, Liste SW-W3C <semantic-web@w3.org>
Here is a summary of the discussion on the XML Schema mailing list on the issue of spaces in xsd:hexBinary, which I think we are likely to end up finding in the real world, and indeed currently do find in Tim Berners-Lee's foaf file:

  $ curl http://www.w3.org/People/Berners-Lee/card
  ...
  @prefix cert: <http://www.w3.org/ns/auth/cert#> .
  ...
  <#i> cert:key [ a cert:RSAPublicKey;
      cert:modulus "d7 a0 e9 1e ... cc d1 e4 12 ab..."^^xsd:hexBinary

(which is probably just a simple oversight by timbl, as he quickly switched from the cert:hex datatype, which did allow the notation above)

(1) There were a few calls to start an investigation to see if anything really dramatically bad would happen if whitespace were allowed.

  + Henry Thompson argued for this [1]
  + Noah Mendelsohn asks that one should ask around to see if the change would have bad consequences [2]
  + Michael Kay is against [3]
  + Noah Mendelsohn seems to think it's too much work given the stage the spec is at [4]

(2) It is indeed not currently legal to put whitespace in a hexBinary, BUT... there is a bit of wiggle room.

How to interpret such strings if one sticks closely to the current specification of xml-schema-2 [6] was laid out very clearly by C. M. Sperberg-McQueen [7]. What he says requires a closer look, but perhaps the following gives a reason to look more carefully.

Depending on what type of processing you do on your XML, earlier layers of your XML processing could remove the whitespace. Such processing does happen: for example, the following is legal

  <cert:key>
    <rdfs:label>made on 23 November 2011 on my laptop</rdfs:label>
    <cert:modulus>0F<!--* hi, mom! *-->B7</cert:modulus>
    <cert:exponent> 65537 </cert:exponent>
  </cert:key>

The above is legal XML, but that type of processing does not happen in Turtle. So one could argue that there is a difference, and that, for example, Turtle should be thought of as incorporating certain steps that are not in XML.
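As a rough illustration of what such an extra step might look like, here is a minimal Python sketch. It assumes that the hexBinary lexical form is simply zero or more pairs of hex digits, and it contrasts a strict parse with a lenient one that strips XML whitespace first. The function names are hypothetical and the regex is an assumed reading of xml-schema-2 [6], not text quoted from it.

```python
import re

# Assumed reading of the xsd:hexBinary lexical form:
# zero or more pairs of hexadecimal digits, nothing else.
_HEX_BINARY = re.compile(r"(?:[0-9a-fA-F]{2})*")

# The four XML whitespace characters: space, tab, CR, LF.
_XML_WHITESPACE = re.compile(r"[ \t\r\n]+")


def parse_hex_binary_strict(lexical: str) -> bytes:
    """Hypothetical strict parser: reject anything outside the lexical form."""
    if _HEX_BINARY.fullmatch(lexical) is None:
        raise ValueError(f"not a legal hexBinary lexical form: {lexical!r}")
    return bytes.fromhex(lexical)


def parse_hex_binary_lenient(lexical: str) -> bytes:
    """Hypothetical preprocessing step: drop whitespace, then parse strictly.

    All octet groups are concatenated into one binary value.
    """
    collapsed = _XML_WHITESPACE.sub("", lexical)
    return parse_hex_binary_strict(collapsed)
```

With these definitions, parse_hex_binary_strict("d7 a0 e9 1e") raises ValueError, while parse_hex_binary_lenient("d7 a0 e9 1e") yields the four octets d7 a0 e9 1e. Whether a Turtle parser is entitled to apply the lenient step is exactly the open question.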
For example, Turtle processing could do the normalisation and remove all whitespace. Sperberg-McQueen was thinking that this had been ruled out in the RDF camp, but perhaps that was only the RDF/XML camp, and perhaps things have changed since then. The Saxon parser could even do such a preprocessing step to remove the whitespace, explains Michael Kay [8].

[1] Henry Thompson: http://lists.w3.org/Archives/Public/www-xml-schema-comments/2012JanMar/0019.html
[2] Noah Mendelsohn: http://lists.w3.org/Archives/Public/www-xml-schema-comments/2012JanMar/0022.html
[3] Michael Kay: http://lists.w3.org/Archives/Public/www-xml-schema-comments/2012JanMar/0024.html
[4] http://lists.w3.org/Archives/Public/www-xml-schema-comments/2012JanMar/0025.html
[6] http://www.w3.org/TR/xmlschema-2/
[7] C. M. Sperberg-McQueen's answer to how one should interpret such a binary: http://lists.w3.org/Archives/Public/www-xml-schema-comments/2012JanMar/0023.html
[8] http://lists.w3.org/Archives/Public/www-xml-schema-comments/2012JanMar/0017.html

On 16 Jan 2012, at 18:26, Henry Story wrote:

> On 16 Jan 2012, at 17:40, Dave Reynolds wrote:
>
>> That regex and the associated EBNF seems unambiguous to me, no spaces
>> between hexOctets. I see no wriggle room :)
>
> yes, I'd like to know why it is defined like that, and if that needs to be constraining on formats such as RDF that have other restrictions such as only allowing one binary to appear in an xsd:hexBinary string. After all the RDF version could say: we concatenate all binaries into one big binary, since that is the only interpretation that could be meant by someone who had entered white spaces.
>
> I wrote their group an e-mail to check
>
> http://lists.w3.org/Archives/Public/www-xml-schema-comments/2012JanMar/0011.html
>
> Henry
>
>>
>> Dave
>
> Social Web Architect
> http://bblfish.net/

Social Web Architect
http://bblfish.net/
Received on Saturday, 21 January 2012 21:07:07 UTC