- From: Stephen Farrell <stephen.farrell@cs.tcd.ie>
- Date: Fri, 17 Dec 2004 16:00:47 +0000
- To: Ed Simon <edsimon@xmlsec.com>
- Cc: www-xkms@w3.org
Hi Ed,

I'd agree with both of those, except that both would break
interop and it's getting late in the day to do that...

Stephen.

Ed Simon wrote:

> As the use of XML-sensitive characters is a problem, then can we not, and
> should we not anyway, require that pass phrases be base64-encoded when used
> within XML? In fact, it would seem to me that this would be good practice
> so the pass phrase does not get messed up by XML processing whether it
> contains XML-ese or not.
>
> BTW, I also think trailing and leading whitespace MUST be removed and
> internal whitespace reduced to one space character (not zero).
>
> Ed
> ========================================
> Ed Simon
> (613) 726-9645
> edsimon@xmlsec.com
> Interested in XML, Web Services, or Security? Visit "www.xmlsec.com".
> Now available! "Web Services Security" published by Osborne (ISBN#
> 0072224711)
>
>
> -----Original Message-----
> From: Stephen Farrell [mailto:stephen.farrell@cs.tcd.ie]
> Sent: December 17, 2004 10:42 AM
> To: Ed Simon
> Cc: www-xkms@w3.org
> Subject: Re: Again, confusing 8.1
>
>
> Hi Ed,
>
> I guess we're all in agreement, but none of us seems to know exactly how to
> write down what we want!
>
> I agree that we have to support non-Latin characters - if we didn't then
> nearly everyone in certain parts of the world would use the same key (since
> their entire string would be highly likely to be reduced to nothing!)
> which'd be a bit of a security flaw as well as an I18N-nasty.
>
> If we drop xml encoding (good, let's remove that degree of freedom), we're
> then ok to directly use "<" characters in our strings?
>
> As for the canonical UTF-8, that's what the stringprep RFC does, and
> apparently it involves ~30Kb of object code (from memory, so may be wrong
> there), so it has been found to be complicated.
>
> Stephen.
>
> Ed Simon wrote:
>
>
>>I generally agree with Jose and Guillermo's recommendations EXCEPT for
>>the one about filtering UTF-8 characters outside the ASCII 32-127 range.
>>Unless there is a verifiable case to be made for disallowing non-Latin
>>characters (e.g. Korean pass phrases) I would not include that possibility.
>>Ultimately, the pass phrase is just '1's and '0's and all we are doing
>>is saying how a human-readable/writable phrase can be consistently
>>converted into binary; that MAY not always mean the end device has to
>>understand Unicode, just binary. (I say MAY because I'm not a mobile
>>device expert, I just want someone who is to say non-ASCII is a
>>problem before we try to accommodate it.)
>>
>>I would drop mention of "XML Encoding" and call it "UTF-8" encoding;
>>not only do I think this is sensible from the outset but it also gets
>>rid of trying to process XMLese like entities etc. I confess that I
>>have one question, which is that I am not absolutely sure (e.g. due to
>>combining sequences) there is always one and only one binary
>>representation for every unique UTF-8-encoded pass phrase; Jose, can
>>you verify that with a W3C UTF-8 expert? A follow-up question would
>>be whether we could use rules to canonicalise the UTF-8 (e.g. do not
>>use combining characters) if there is more than one binary representation.
>>
>>Regards, Ed
>>========================================
>>Ed Simon
>>(613) 726-9645
>>edsimon@xmlsec.com
>>Interested in XML, Web Services, or Security? Visit "www.xmlsec.com".
>>Now available! "Web Services Security" published by Osborne (ISBN#
>>0072224711)
>>
>>
>>-----Original Message-----
>>From: www-xkms-request@w3.org [mailto:www-xkms-request@w3.org] On
>>Behalf Of Stephen Farrell
>>Sent: December 17, 2004 7:48 AM
>>To: jose.kahan@w3.org
>>Cc: www-xkms@w3.org
>>Subject: Re: Again, confusing 8.1
>>
>>
>>Jose,
>>
>>I agree that this bit needs more work. A few points.
>>
>>- Do we want to maintain interop with any existing implementations?
>>  I believe we do.
>>  But I think it's fair to assume that most
>>  existing code hasn't taken the corner cases into account so we
>>  should be ok to make non-interoperable changes for corner cases.
>>- Reducing the keyspace isn't a real issue. English has something
>>  like 1.5 bits of entropy per character, so unless you're using
>>  really long strings it makes no difference - the space is
>>  searchable anyway.
>>- Case folding (H->h) is IMO worthwhile simply to avoid the
>>  CAPSLOCK problem.
>>- Some whitespace shrinkage is needed, e.g. "^t" vs " ".
>>  Most other specs shrink all consecutive whitespace characters
>>  to one space, we currently eat 'em all which is a bit weird
>>  but ok. (If Phill's listening maybe he had a reason for that?)
>>- Punctuation character handling. Current spec is weird there.
>>  They'd normally be included in the output.
>>- We do have to determine how to handle XML encoding, e.g. of "&",
>>  "%20", "<" etc. I've no clue how to properly do that.
>>- We do have to determine how to define and handle control chars.
>>  The latter is easy, the former I dunno how to do.
>>- I don't think mobile devices etc. are a real issue for us, since
>>  input device limitations can be taken into account when the
>>  strings are selected/generated.
>>- We have to decide how to handle I18N. I think the current spec
>>  is probably broken for countries which don't use Latin-1
>>  characters at all. Again I'm not sure of the right thing here,
>>  but there has been some (non-trivial) work done on this for
>>  DNS [1], which is being taken up in various security related
>>  specs (and for which there's source code available), so maybe
>>  using that is a good idea.
>>
>>Stephen.
>>
>>[1] http://www.ietf.org/internet-drafts/draft-hoffman-rfc3454bis-02.txt
>>
>>Jose Kahan wrote:
>>
>>
>>>Hi folks,
>>>
>>>Per last meeting's action item:
>>>
>>>  we will include a test case validating the string2key algorithm in
>>>  section 8.1.
>>>  AI: Guillermo and Jose to generate such test cases
>>>
>>>
>>>After talking with Guillermo, we both found section 8.1 confusing.
>>>This section uses terms that are known in the security field. What is
>>>confusing is how they apply to XKMS. In particular:
>>>
>>>----------
>>>- Is this algorithm meant to generate a one-time use pass phrase that
>>>  can be read over the phone?
>>>
>>>- All shared string values are encoded as XML
>>>
>>>  What is a shared string here? Is it a limited-use string? [242]
>>>  proposes a user-generated authentication phrase for revoking a public
>>>  key: "Help I have revealed my key". However, when looking at section
>>>  C.2.1, we find that the 8.1 algorithm was used to convert it to
>>>  "helpih...".
>>>
>>>  If this phrase was a shared string, shouldn't it have been converted
>>>  into XML, regardless of its content, and then the result converted into
>>>  hex, without dropping spaces, punctuation, etc.?
>>>
>>>  What is the meaning of "encoded as XML"? Accented characters and
>>>  "&'<> symbols encoded as entities (we would not be able to specify
>>>  the charset otherwise)? Accented characters encoded as UTF-8?
>>>
>>>  I couldn't find what the spec defines as "shared string" or why 8.1
>>>  has to be applied always, regardless of who generated the shared secret.
>>>
>>>- All punctuation, space and control characters are removed.
>>>
>>>  I can understand why we remove control characters, but I can't
>>>  understand why we remove punctuation and spaces. We can read them on
>>>  the phone easily, I think.
>>>  Moreover, by simplifying the pass phrase in this way, aren't we making
>>>  it more vulnerable to oracle attacks?
>>>
>>>- All upper case characters in the Latin-1 alphabet (A-Z) are
>>>  converted to lower case.
>>>  No other characters, including accented characters are converted
>>>
>>>  Why must uppercase be converted into lowercase?
>>>  One can read them easily on
>>>  the phone I think :)
>>>  It's not clear what is done to the other characters or what was the
>>>  rationale. From reading this, it seems that if my name is spelled Jos ,
>>>  it would be converted to jos .
>>>
>>>  This conversion also reduces the keyspace, and imo makes it more
>>>  vulnerable to oracle attacks.
>>>--------
>>>
>>>IMO, what we need to define is:
>>>
>>>- What is a limited-use shared secret
>>>- When does the 8.1 algorithm need to be applied (make it an explicit
>>>  reference in concerned sections)
>>>- Decide if all such secrets should be speakable on the phone or be
>>>  typed with a device that doesn't allow all those characters; use it as
>>>  a rationale for removing punctuation, etc.
>>>- Remove the ambiguities of the algorithm in section 8.1.
>>>- Decide if we need to define a minimum size for the shared secret string.
>>>  What is its relationship with entropy?
>>>
>>>In my opinion, what we are looking for is an algorithm to
>>>canonicalize shared-secret strings (whether they are limited or not) that
>>>produces an XML valid string. I would propose the following one:
>>>
>>>1. Remove all the control characters from the string
>>>   --> reason: I feel that those characters could cause problems and
>>>   they could not be typed all the time
>>>2. Encode the string in UTF-8
>>>   This will take into account accented characters
>>>   --> reason: it's the only way to convert those characters into
>>>   portable ASCII
>>>3. Put the hex equivalent for each of those characters, using lowercase
>>>   letters. Note that here we don't remove any punctuation symbols. We
>>>   are just converting them.
>>>
>>>This would convert Jos&É into [4a] [6f] [73] [26] [c3] [89]
>>>
>>>XKMS could be used by mobile devices too.
>>>If for some reason, we
>>>believe that it will be too much overhead to make UTF-8
>>>conversions, we can just suppress all the characters above ASCII 127.
>>>Another reason could be if the user has to type those characters in a
>>>phone and he doesn't have the full character set available. It could
>>>be that an operator at the other end cannot read a decoded
>>>UTF-8 string if it's stored as such. This is some rationale as to why
>>>we might reduce the strings to ASCII 32-127.
>>>
>>>I don't know what would be the rationale for converting the strings to
>>>lower-case and suppressing all the punctuation symbols.
>>>
>>>Tommy had written:
>>>
>>>>Four implementors have independently implemented the "Limited Use
>>>>Shared Secret" algorithm in a way that interoperates so I have not
>>>>seen a breakdown yet. However, both the spec and the existing
>>>>shared secret distribution points (at least my service) avoid cases
>>>>that lead to ambiguous interpretation.
>>>
>>>
>>>This seems to imply that the places where the 8.1 algorithm applies are
>>>already identified. I think it would be good to make this explicit in
>>>the spec.
>>>
>>>>>Maybe change the spec to only allow a smaller subset of strings to
>>>>>become keys
>>>
>>>
>>>>I'm in favor of this option, provided that the recommendations in
>>>>Section 10.4 can still be followed.
>>>
>>>
>>>Ditto :)
>>>
>>>-jose
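
For reference, the three-step proposal quoted above (remove control
characters, encode as UTF-8, emit lowercase hex) can be sketched in Python
roughly as follows. This is only an illustration, not text from the XKMS
spec: the function name canonicalize_secret is made up, "control characters"
is assumed to mean Unicode general category Cc, and the optional NFC
normalization is an assumption added to illustrate the combining-sequence
question raised in the thread, not something anyone in the thread proposed.

```python
import unicodedata


def canonicalize_secret(secret: str, nfc: bool = False) -> str:
    """Hypothetical sketch of the three-step proposal from the thread."""
    if nfc:
        # Assumption, not part of the proposal: compose combining
        # sequences so equivalent spellings yield the same bytes.
        secret = unicodedata.normalize("NFC", secret)
    # Step 1: drop control characters (assumed here to be category Cc).
    kept = "".join(ch for ch in secret if unicodedata.category(ch) != "Cc")
    # Step 2: encode as UTF-8. Step 3: lowercase hex of every byte.
    # Punctuation and spaces are converted, not removed.
    return kept.encode("utf-8").hex()


print(canonicalize_secret("Jos&\u00c9"))            # 4a6f7326c389
print(canonicalize_secret("E\u0301"))               # 45cc81
print(canonicalize_secret("E\u0301", nfc=True))     # c389
```

The last two calls show why the "one and only one binary representation"
question matters: "E" followed by a combining acute accent and the
precomposed "É" look the same to a user but produce different bytes unless
some normalization step is agreed on.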
Received on Friday, 17 December 2004 15:56:07 UTC