- From: C. M. Sperberg-McQueen <cmsmcq@acm.org>
- Date: Wed, 01 Aug 2001 22:06:31 -0600
- To: Rich Salz <rsalz@zolera.com>
- Cc: www-xml-schema-comments@w3.org
At 2001-07-27 13:07, Rich Salz wrote:

>As for canonical form, I don't see why adding fourteen internal spaces
>per line is noticeably better than not doing so, but I don't care all
>that much.

No one seems to care that much. (The internal spaces are intended
to make life marginally easier for humans who end up having to
hand-check this stuff for some reason, but that's a fairly remote
prospect in real life. So -- no one much cares.)

>As far as the equal sign padding, I have a much stronger position. The
>padding is required. The RFC is quite clear, and padding is a very
>different subject from whitespace, where there is significant history of
>leniency.
>
>Among the packages with which I am familiar, Python, OpenSSL, and
>OpenLDAP (dating back to the first UMich distributions) all require the
>padding. If you make it optional, then you have supersetted the spec in
>a fairly powerful way, and it would be misleading to still call it
>base64.

Just to clarify: no one proposes to say that equals signs mean
anything different in base64Binary data than they do in data encoded
according to RFC 2045. The string "a=b" won't be correct base64
data, no matter what XML Schema does.

The question is solely this: if I put the string "a=b" into an
element declared as having datatype base64Binary, is it

  (a) an XML Schema type error (which a conforming XML Schema
      processor must detect and report)? or
  (b) an application error (the XML Schema processor having handed
      the data off to a base64 decoder, which actually does the
      barfing)?

The question arose -- well, why? I asked it, I think, because
neither the current XML Schema spec, nor RFC 2045, say that you have
to (or should, or even might) raise an error if an equals sign
appears before the end of the data. RFC 2045 says only "the
occurrence of any '=' characters may be taken as evidence that the
end of the data has been reached (without truncation in transit)."
It does NOT say "so if more characters in the base64 alphabet are
encountered, it might be appropriate to raise a warning" or anything
of the sort.

What do existing implementations do with a string like "abc=de=="?
Do they reject it, or do they treat it as identical to "abc=", i.e.
as an encoding of 01101001 10110111 00011100?

>There is an even stronger argument: what is the "canonical" form? I can
>easily deal with whitespace -- ignore it, as the spec says. But which
>of the following are legal base64 encodings of foo?
>    Zm9vCg
>    Zm9vCg=
>    Zm9vCg==
>    Zm9vCg===
>    Zm9vCg====== (6 ='s)
>
>If padding can be elided, why can't it be added?
>
>Keep it clear, follow the spec, don't break installed code: leave the
>padding as the RFC says.

I think I'm hearing you say that yes, you think it's worthwhile for
XML Schema processors to check that the equal signs are where they
ought to be in correct data, and nowhere else (so that "abc=de=="
raises a type error right away).

Thanks for the input.

-CMSMcQ
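
The difference between options (a) and (b) above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a description of what any XML Schema processor does. Both function names are hypothetical. `is_strict_base64` performs a purely lexical check (whitespace ignored, equals signs allowed only as one or two pad characters closing the final four-character group), which is roughly what option (a) would ask of a schema processor. `decoder_result` simply hands the string to Python's standard `base64` decoder, as in option (b). How that decoder treats characters after an embedded "=" may differ between Python versions, so no output is claimed for that case.

```python
import base64
import binascii
import re
from typing import Optional

# Lexical pattern (an assumption about what "padding in the right place"
# means): zero or more full groups of four alphabet characters, optionally
# followed by one final group that ends in "=" or "==".
_STRICT = re.compile(
    r"(?:[A-Za-z0-9+/]{4})*"
    r"(?:[A-Za-z0-9+/]{2}==|[A-Za-z0-9+/]{3}=)?"
)


def is_strict_base64(text: str) -> bool:
    """Option (a): reject misplaced, missing, or surplus '=' up front."""
    compact = re.sub(r"\s+", "", text)  # whitespace is not significant
    return _STRICT.fullmatch(compact) is not None


def decoder_result(text: str) -> Optional[bytes]:
    """Option (b): pass the data to a decoder and see whether it objects."""
    try:
        return base64.b64decode(text)
    except binascii.Error:
        return None


if __name__ == "__main__":
    samples = ["a=b", "abc=", "abc=de==",
               "Zm9vCg", "Zm9vCg=", "Zm9vCg==", "Zm9vCg===", "Zm9vCg======"]
    for s in samples:
        print(f"{s!r:16} strict={is_strict_base64(s)!s:5} "
              f"decoder={decoder_result(s)!r}")
    # "abc=" carries 18 bits of data, so the standard decoder returns
    # two octets for it: 0x69 0xB7.
    assert base64.b64decode("abc=") == b"i\xb7"
```

Under this lexical reading, only "Zm9vCg==" from the quoted list passes, and "a=b" and "abc=de==" are rejected before any decoder sees them. The sketch does not check that the unused low-order bits of the last character before the padding are zero, which a canonical-form check would also need to consider.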
Received on Thursday, 2 August 2001 00:05:29 UTC