- From: Stephen Farrell <stephen.farrell@cs.tcd.ie>
- Date: Fri, 17 Dec 2004 17:00:57 +0000
- To: Ed Simon <edsimon@xmlsec.com>
- Cc: www-xkms@w3.org
Ed,

I generally agree. What I meant was that base64 would break interop, as
would (the otherwise acceptable) reduction to a single internal space.

So how about a scheme with the properties:

- Encode as UTF-8 according to stringprep (and someone has to figure
  out details there maybe)
- Case-fold
- Reduce whitespace (to zero for interop I guess)

As you more-or-less said, I'd expect that that'd be ok with most of the
examples already processed and with most code.

So, two things to hopefully close this down:

- Any objections?
- Anyone volunteer to go through stringprep [1], make sure there're no
  gotchas, and write text?

(BTW: Formally, I guess we'd have to refer to the stringprep RFC [2]
and not the I-D, but we ought to check against the I-D just in case).

Cheers,
Stephen.

[1] http://www.ietf.org/internet-drafts/draft-hoffman-rfc3454bis-02.txt
[2] http://www.ietf.org/rfc/rfc3454.txt

Ed Simon wrote:

> It seems to me that requiring an XML processor (right?) is going to be
> particularly performance-consuming. Plus one has to deal with exactly what
> "All shared string values are encoded as XML" means. To me, it means that
> the pass phrase MUST be valid XML (eg.
>
> "<Pass_Phrase xmlns="http://example.com/secrets">my
> <Adjective>little</Adjective>
> <![CDATA[<]]>secret<![CDATA[>]]>!</Pass_Phrase>"
>
> ) or else it is NOT a valid pass phrase, AND, therefore, pass phrase tools
> must be full-fledged XML parsers capable of dealing with potential attacks
> like entity expansion. There is also a contradiction that if one requires
> conversion to lower-case, one invalidates XML such as that in my example
> because XML names are case-sensitive. It seems to me the constraints are
> contradictory.
>
> I think what was originally intended was something like "encode as UTF-8"; I
> expect requiring this would NOT break the interop cases done thus far
> because I would guess no one is trying to use pass phrases that are, in
> themselves, valid XML.
>
> Ed
> ========================================
> Ed Simon
> (613) 726-9645
> edsimon@xmlsec.com
> Interested in XML, Web Services, or Security? Visit "www.xmlsec.com".
> Now available! "Web Services Security" published by Osborne (ISBN#
> 0072224711)
>
>
> -----Original Message-----
> From: www-xkms-request@w3.org [mailto:www-xkms-request@w3.org] On Behalf Of
> Stephen Farrell
> Sent: December 17, 2004 11:01 AM
> To: Ed Simon
> Cc: www-xkms@w3.org
> Subject: Re: Again, confusing 8.1
>
>
> Hi Ed,
>
> I'd agree with both of those, except that both would break interop and it's
> getting late in the day to do that...
>
> Stephen.
>
> Ed Simon wrote:
>
>>As the use of XML-sensitive characters is a problem, then can we not,
>>and should we not anyway, require that pass phrases be base64-encoded
>>when used within XML. In fact, it would seem to be that this would be
>>good practice so the pass phrase does not get messed up by XML
>>processing whether it contains XML-ese or not.
>>
>>BTW, I also think trailing and leading whitespace MUST be removed and
>>internal whitespace reduced to one space character (not zero).
>>
>>Ed
>>========================================
>>Ed Simon
>>(613) 726-9645
>>edsimon@xmlsec.com
>>Interested in XML, Web Services, or Security? Visit "www.xmlsec.com".
>>Now available! "Web Services Security" published by Osborne (ISBN#
>>0072224711)
>>
>>
>>-----Original Message-----
>>From: Stephen Farrell [mailto:stephen.farrell@cs.tcd.ie]
>>Sent: December 17, 2004 10:42 AM
>>To: Ed Simon
>>Cc: www-xkms@w3.org
>>Subject: Re: Again, confusing 8.1
>>
>>
>>Hi Ed,
>>
>>I guess we're all in agreement, but none of us seems to know exactly
>>how to write down what we want!
>>
>>I agree that we have to support non Latin characters - if we didn't
>>then nearly everyone in certain parts of the world would use the same
>>key (since their entire string would be highly likely to be reduced to
>>nothing!)
>>which'd be a bit of a security flaw as well as an I18N-nasty.
>>
>>If we drop xml encoding (good, let's remove that degree of freedom),
>>we're then ok to directly use "<" characters in our strings?
>>
>>As for the canonical UTF-8, that's what the stringprep RFC does, and
>>apparently it involves ~30Kb of object code (from memory, so may be
>>wrong there), so it has been found to be complicated.
>>
>>Stephen.
>>
>>Ed Simon wrote:
>>
>>>I generally agree with Jose and Guillermo's recommendations EXCEPT for
>>>the one about filtering UTF-8 characters outside the ASCII32-127.
>>>Unless, there is a verifiable case to be made for disallowing
>>>non-Latin characters (eg. Korean pass phrases) I would not include
>>>that possibility.
>>>Ultimately, the pass phrase is just '1's and '0's and all we are doing
>>>is saying how a human-readable/writable phrase can be consistently
>>>converted into binary; that MAY not always mean the end device has to
>>>understand Unicode, just binary. (I say MAY because I'm not a mobile
>>>device expert, I just want someone who is to say non-ASCII is a
>>>problem before we try to accommodate it.)
>>>
>>>I would drop mention of "XML Encoding" and call it "UTF-8" encoding;
>>>not only do I think this is sensible from the outset but it also gets
>>>rid of trying to process XMLese like entities etc. I confess that I
>>>have one question which is I am not absolutely sure (eg. due to
>>>combining sequences) there is always one and only one binary
>>>representation for every unique UTF-8-encoded pass phrase; Jose, can
>>>you verify that with a W3C UTF-8 expert. A follow-up question would
>>>be whether we could use rules to canonicalise the UTF-8 (eg. do not
>>>use combining characters) if there is more than one binary representation.
>>>
>>>Regards, Ed
>>>========================================
>>>Ed Simon
>>>(613) 726-9645
>>>edsimon@xmlsec.com
>>>Interested in XML, Web Services, or Security? Visit "www.xmlsec.com".
>>>Now available! "Web Services Security" published by Osborne (ISBN#
>>>0072224711)
>>>
>>>
>>>-----Original Message-----
>>>From: www-xkms-request@w3.org [mailto:www-xkms-request@w3.org] On
>>>Behalf Of Stephen Farrell
>>>Sent: December 17, 2004 7:48 AM
>>>To: jose.kahan@w3.org
>>>Cc: www-xkms@w3.org
>>>Subject: Re: Again, confusing 8.1
>>>
>>>
>>>Jose,
>>>
>>>I agree that this bit needs more work. A few points.
>>>
>>>- Do we want to maintain interop with any existing implementations?
>>>  I believe we do. But I think it's fair to assume that most
>>>  existing code hasn't taken the corner cases into account so we
>>>  should be ok to make non-interoperable changes for corner cases.
>>>- Reducing the keyspace isn't a real issue. English has something
>>>  like 1.5 bits of entropy per character, so unless you're using
>>>  really long strings it makes no difference - the space is
>>>  searchable anyway.
>>>- Case folding (H->h) is IMO worthwhile simply to avoid the
>>>  CAPSLOCK problem.
>>>- Some whitespace shrinkage is needed, e.g. "^t" vs " ".
>>>  Most other specs shrink all consecutive whitespace characters
>>>  to one space, we currently eat 'em all which is a bit weird
>>>  but ok. (If Phill's listening maybe he had a reason for that?)
>>>- Punctuation character handling. Current spec is weird there.
>>>  They'd normally be included in the output.
>>>- We do have to determine how to handle XML encoding, e.g. of "&",
>>>  "%20", "<" etc. I've no clue how to properly do that.
>>>- We do have to determine how to define and handle control chars.
>>>  The latter is easy, the former I dunno how to do.
>>>- I don't think mobile devices etc is a real issue for us, since
>>>  input device limitations can be taken into account when the
>>>  strings are selected/generated.
>>>- We have to decide how to handle I18N. I think the current spec
>>>  is probably broken for countries which don't use Latin-1
>>>  characters at all.
>>>  Again I'm not sure of the right thing here,
>>>  but there has been some (non-trivial) work done on this for
>>>  DNS [1], which is being taken up in various security related
>>>  specs, (and for which there's source code available) so maybe
>>>  using that is a good idea.
>>>
>>>Stephen.
>>>
>>>[1]
>>>http://www.ietf.org/internet-drafts/draft-hoffman-rfc3454bis-02.txt
>>>
>>>Jose Kahan wrote:
>>>
>>>>Hi folks,
>>>>
>>>>Per last meeting's action item:
>>>>
>>>>we will include a test case validating the string2key algorithm in
>>>>section 8.1. AI: Guillermo and Jose to generate such test cases
>>>>
>>>>
>>>>After talking with Guillermo, we both found section 8.1 confusing.
>>>>This section uses terms that are known in the security field. What is
>>>>confusing is how they apply to XKMS. In particular:
>>>>
>>>>----------
>>>>- Is this algorithm meant to generate a one-time use pass phrase
>>>>that can be read over the phone?
>>>>
>>>>- All shared string values are encoded as XML
>>>>
>>>>What is a shared string here? Is it a limited-use string? [242]
>>>>proposes a user-generated authentication phrase for revoking a public
>>>>key: "Help I have revealed my key". However, when looking at section
>>>>C.2.1, we find that the 8.1 algorithm was used to convert it to
>>>>"helpih...".
>>>>
>>>>If this phrase was a shared string, shouldn't it have been converted
>>>>into XML, regardless of its content, and then the result converted into
>>>>hexa, without dropping spaces, punctuation, etc.?
>>>>
>>>>What is the meaning of "encoded as XML"? Accented characters and
>>>>"&'<> symbols encoded as entities (we would not be able to specify
>>>>the charset otherwise). Accented characters encoded as UTF-8?
>>>>
>>>>I couldn't find what the spec defines as "shared string" or why 8.1
>>>>has to be applied always, regardless of who generated the shared secret.
>>>>
>>>>- All punctuation, space and control characters are removed.
>>>>
>>>>I can understand why we remove control characters, but I can't
>>>>understand why we remove punctuation, spaces. We can read them on the
>>>>phone easily, I think.
>>>>
>>>>Moreover, by simplifying thus the pass phrase, aren't we making it
>>>>more vulnerable to oracle attacks?
>>>>
>>>>- All upper case characters in the Latin-1 alphabet (A-Z) are
>>>>converted to lower case.
>>>>No other characters, including accented characters are converted
>>>>
>>>>Why must uppercase be converted into lowercase? One can read them
>>>>easily on the phone I think :)
>>>>It's not clear what is done to the other characters or what was the
>>>>rationale. From reading this, it seems that if my name is spelled Jos
>>>>, it would be converted to jos .
>>>>
>>>>This conversion also reduces the keyspace, and imo makes it more
>>>>vulnerable to oracle attacks.
>>>>--------
>>>>
>>>>IMO, what we need to define is:
>>>>
>>>>- What is a limited-use shared secret
>>>>- When does the 8.1 algorithm need to be applied (make it an explicit
>>>>reference in concerned sections)
>>>>- Decide if all such secrets should be speakable on the phone or be
>>>>typed with a device that doesn't allow all those characters; use it as a
>>>>rationale for removing punctuation, etc.
>>>>- Remove the ambiguities of the algorithm in section 8.1.
>>>>- Decide if we need to define a minimum size for the shared secret
>>>>string. What is its relationship with entropy?
>>>>
>>>>In my opinion, what we are looking for is for an algorithm to
>>>>canonicalize shared-secret strings (that they be limited or not) that
>>>>produces an XML valid string. I would propose the following one:
>>>>
>>>>1. Remove all the control characters from the string
>>>>   --> reason: I feel that those characters could cause problems and
>>>>   they could not be typed all the time
>>>>2. Encode the string in UTF-8
>>>>   This will take into account accented characters
>>>>   --> reason: it's the only way to convert those characters into
>>>>   portable ASCII
>>>>3. Put the Hexa equivalent for each of those characters, using lowercase
>>>>   letters. Note that here we don't remove any punctuation symbols. We
>>>>   are just converting them.
>>>>
>>>>This would convert Jos&É into [4a] [6f] [73] [26] [c3] [89]
>>>>
>>>>XKMS could be used by mobile devices too. If for some reason, we
>>>>believe that it will be too much of overhead to make UTF-8
>>>>conversions, we can just suppress all the characters above 127 ASCII.
>>>>Another reason could be if the user has to type those characters in a
>>>>phone and he doesn't have the full character set available. It could
>>>>be that an operator at the other end cannot read a decoded
>>>>UTF-8 string if it's stored as such. This is some rationale as to why
>>>>reduce the strings to ASCII 32-127.
>>>>
>>>>I don't know what would be the rationale for converting the strings
>>>>to lower-case and suppressing all the punctuation symbols.
>>>>
>>>>Tommy had written:
>>>>
>>>>>Four implementors have independently implemented the "Limited Use
>>>>>Shared Secret" algorithm in a way that interoperates so I have not
>>>>>seen a break down yet. However, both the spec and the existing
>>>>>shared secret distribution points (at least my service) avoid cases
>>>>>that lead to ambiguous interpretation.
>>>>
>>>>This seems to imply that the places where the 8.1 algorithm applies are
>>>>already identified. I think it would be good to make this explicit in
>>>>the spec.
>>>>
>>>>>>Maybe change the spec to only allow a smaller subset of strings to
>>>>>>become keys
>>>>
>>>>>I'm in favor of this option, provided that the recommendations in
>>>>>Section 10.4 can still be followed.
>>>>
>>>>Ditto :)
>>>>
>>>>-jose
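
For anyone wanting to try the scheme proposed at the top of this message (UTF-8 per stringprep, case-fold, whitespace reduced to zero), here is a minimal Python sketch. It is only an approximation under stated assumptions: NFKC normalization (the normalization form RFC 3454 defines) plus Python's str.casefold() stand in for a real stringprep profile, and none of the RFC 3454 mapping or prohibited-character tables are applied. The function names canonicalize_passphrase and hex_form are made up for illustration. hex_form follows the three steps Jose proposed (drop control characters, encode as UTF-8, lowercase hex).

```python
import unicodedata


def canonicalize_passphrase(phrase: str) -> bytes:
    """Approximate the proposed scheme: normalize, case-fold, drop whitespace, UTF-8."""
    # Assumption: NFKC stands in for full stringprep; RFC 3454's mapping and
    # prohibited-character tables are NOT applied in this sketch.
    text = unicodedata.normalize("NFKC", phrase)
    # Case-fold, then normalize again because folding can denormalize text.
    text = unicodedata.normalize("NFKC", text.casefold())
    # Reduce whitespace to zero (tabs, spaces, newlines all removed).
    text = "".join(ch for ch in text if not ch.isspace())
    return text.encode("utf-8")


def hex_form(phrase: str) -> str:
    """Jose's three steps: drop control characters, UTF-8 encode, lowercase hex."""
    # Unicode general category "Cc" covers the C0/C1 control characters.
    no_controls = "".join(ch for ch in phrase if unicodedata.category(ch) != "Cc")
    return no_controls.encode("utf-8").hex()


if __name__ == "__main__":
    print(canonicalize_passphrase("Help I have revealed my key"))
    # Precomposed and combining-sequence spellings should give the same bytes.
    print(canonicalize_passphrase("Jos\u00e9") == canonicalize_passphrase("Jose\u0301"))
    print(hex_form("Jos&\u00c9"))
```

With this sketch, "Help I have revealed my key" comes out as the bytes of "helpihaverevealedmykey", which agrees with the "helpih..." form quoted from section C.2.1. The bytes Jose lists (4a 6f 73 26 c3 89) are what hex_form gives for "Jos&É", since c3 89 is the UTF-8 encoding of É. The second normalization pass is there to address Ed's combining-sequence question: both spellings of an accented letter end up as one byte sequence.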
Received on Friday, 17 December 2004 16:56:18 UTC