- From: Martin Duerst <duerst@w3.org>
- Date: Thu, 08 May 2003 17:41:08 -0400
- To: Simon Josefsson <jas@extundo.com>
- Cc: public-iri@w3.org
Hello Simon, Sorry for the delay in moving this issue forward. At 00:09 03/04/17 +0200, Simon Josefsson wrote: >Martin Duerst <duerst@w3.org> writes: > > > NFC is only required in the current draft when encoding something > > from e.g. the side of a bus, or when transcoding it from a non- > > Unicode encoding. This is only to provide a base level of > > predictability. > >Section 2.4 says "IRIs SHOULD be created using Normalization Form C >(NFC)." I don't interprete that to only apply to the scenarios you >mention, rather it seem to imply that whenever a IRI is created, by >whatever process, NFC should be used. Perhaps it can be clarified? I see what you mean. It definitely should be qualified. The intent is NOT to use NFC in all cases. That's why the SHOULD is there. If there is some underlying data that is not in NFC, then normalizing to NFC when creating the IRI is a bad idea. I haven't yet worded this out, but I think that's what the document should say. Would that remove your concerns? > > Still indeed this could lead to security problems with the > > mechanisms you describe above, if there are two users that have user > > names that only differ in normalization. But it would seem to me > > that in this case, the security issue comes from these mechanisms > > (or the actual use with these specific user names) rather than from > > IRIs. > >If this is so, I believe the IRI security considerations should >mention this so that people can be aware of it, and abstain from >deploying IRIs in systems that behave in this way, I think it's not a problem of systems, but a problem of the individual IRIs. I.e. if there are two user names that only differ by normalization, then there is no reason not to use IRIs for all the other users. >since doing so >would introduce problems. Perhaps some properly worded text could >be derived from the following strawman? > > Whenever fields of an IRI are normalized, the octet representation > is modified. While this is unavoidable if ambiguities are to be > resolved, it can raise security issues in some situations. In > particular, if a iuserinfo field is normalized, a security protocol > expecting a certain byte sequence as a username may receive a > different one. This can lead to interoperability failures, but > also more serious failures in systems, e.g. when the system > performs authorization based on the username. I'm currently not convinced we need this text (assuming the clarifications I outlined at the start of this mail). IRIs are not normalized at will. If they are normalized when transcoding from legacy encodings, then the problem is in the legacy encoding, because the legacy encoding isn't able to distinguish e.g. the two user names. In addition, as I have said, creating two user names that differ only by normalization is a security issue for whoever created these user names, not an issue for IRIs. Also, a system that performs authorization based on user name (alone) is a serious security issue, again not the fault of IRIs. Regards, Martin.
Received on Thursday, 8 May 2003 18:48:59 UTC