Re: Some issues with the IRI document [NFCsecurity-09] from Martin Duerst on 2003-05-08 (public-iri@w3.org from May 2003)

From: Martin Duerst <duerst@w3.org>
Date: Thu, 08 May 2003 17:41:08 -0400
To: Simon Josefsson <jas@extundo.com>
Cc: public-iri@w3.org
Message-Id: <4.2.0.58.J.20030506180318.060222b8@localhost>

Hello Simon,

Sorry for the delay in moving this issue forward.

At 00:09 03/04/17 +0200, Simon Josefsson wrote:

>Martin Duerst <duerst@w3.org> writes:
>
> > NFC is only required in the current draft when encoding something
> > from e.g. the side of a bus, or when transcoding it from a non-
> > Unicode encoding. This is only to provide a base level of
> > predictability.
>
>Section 2.4 says "IRIs SHOULD be created using Normalization Form C
>(NFC)."  I don't interprete that to only apply to the scenarios you
>mention, rather it seem to imply that whenever a IRI is created, by
>whatever process, NFC should be used.  Perhaps it can be clarified?

I see what you mean. It definitely should be qualified. The intent
is NOT to use NFC in all cases. That's why the SHOULD is there.
If there is some underlying data that is not in NFC, then
normalizing to NFC when creating the IRI is a bad idea.
I haven't yet worded this out, but I think that's what the
document should say. Would that remove your concerns?

> > Still indeed this could lead to security problems with the
> > mechanisms you describe above, if there are two users that have user
> > names that only differ in normalization. But it would seem to me
> > that in this case, the security issue comes from these mechanisms
> > (or the actual use with these specific user names) rather than from
> > IRIs.
>
>If this is so, I believe the IRI security considerations should
>mention this so that people can be aware of it, and abstain from
>deploying IRIs in systems that behave in this way,

I think it's not a problem of systems, but a problem of
the individual IRIs. I.e. if there are two user names that
only differ by normalization, then there is no reason not
to use IRIs for all the other users.

>since doing so
>would introduce problems.  Perhaps some properly worded text could
>be derived from the following strawman?
>
>    Whenever fields of an IRI are normalized, the octet representation
>    is modified.  While this is unavoidable if ambiguities are to be
>    resolved, it can raise security issues in some situations.  In
>    particular, if a iuserinfo field is normalized, a security protocol
>    expecting a certain byte sequence as a username may receive a
>    different one.  This can lead to interoperability failures, but
>    also more serious failures in systems, e.g. when the system
>    performs authorization based on the username.

I'm currently not convinced we need this text (assuming the
clarifications I outlined at the start of this mail).
IRIs are not normalized at will. If they are normalized
when transcoding from legacy encodings, then the problem
is in the legacy encoding, because the legacy encoding isn't
able to distinguish e.g. the two user names.
In addition, as I have said, creating two user names that
differ only by normalization is a security issue for whoever
created these user names, not an issue for IRIs.
Also, a system that performs authorization based on
user name (alone) is a serious security issue, again not
the fault of IRIs.

Regards,    Martin.

Received on Thursday, 8 May 2003 18:48:59 UTC