Re: Some issues with the IRI document [NFCsecurity-09] from Martin Duerst on 2004-03-21 (public-iri@w3.org from March 2004)

From: Martin Duerst <duerst@w3.org>
Date: Sun, 21 Mar 2004 13:50:57 -0500
To: Simon Josefsson <jas@extundo.com>
Cc: public-iri@w3.org
Message-Id: <4.2.0.58.J.20040321134951.05d31048@localhost>

Hello Simon,

I have definitively closed this issue. I assume this is okay with you.

Regards,    Martin.

>Date: Thu, 26 Jun 2003 16:44:37 -0400
>To: Simon Josefsson <jas@extundo.com>
>From: Martin Duerst <duerst@w3.org>
>Subject: Re: Some issues with the IRI document [NFCsecurity-09]
>Cc: public-iri@w3.org
>
>Hello Simon,
>
>I have looked at your issue again, and found a way to fit it into
>the current draft. In the security section, there currently is
>a paragraph saying:
>
> >>>>
>    Spoofing can occur for various reasons.  A first reason is that
>    normalization expectations of a user or actual normalization when
>    entering an IRI do not match the normalization used on the server
>    side.  Conceptually, this is no different from the problems
>    surrounding the use of case-insensitive web servers.  For example, a
>    popular web page with a mixed case name (http://big.site/
>    PopularPage.html) might be "spoofed" by someone who is able to create
>    http://big.site/popularpage.html.  However, the introduction of
>    character normalization, and of additional mappings for user
>    convenience, may increase the chance for spoofing.
> >>>>
>
>I have amended this to:
>
> >>>>
>    Spoofing can occur for various reasons.  A first reason is that
>    normalization expectations of a user or actual normalization when
>    entering an IRI, or when transcoding an IRI from a legacy encoding,
>    do not match the normalization used on the server side.
>    Conceptually, this is no different from the problems surrounding the
>    use of case-insensitive web servers.  For example, a popular web page
>    with a mixed case name (http://big.site/PopularPage.html) might be
>    "spoofed" by someone who is able to create http://big.site/
>    popularpage.html.  However, the introduction of character
>    normalization, and of additional mappings for user convenience, may
>    increase the chance for spoofing.  Protocols and servers that allow
>    the creation of resources with unnormalized names, and resources with
>    names that are not normalized, are particularly vulnerable to such
>    attacks.  This is an inherent security problem of the relevant
>    protocol, server, or resource, and not specific to IRIs, but
>    mentioned here for completeness.
> >>>>
>
>As I have explained earlier, I still consider this a security issue
>much more for the particular protocol or setup than for IRIs per se,
>and have said so.
>
>I hope this addresses your concerns. I am tentatively closing this
>issue, but any comments or suggestions for improvement are welcome.
>
>
>Regards,    Martin.
>
>
>At 00:09 03/04/17 +0200, Simon Josefsson wrote:
>
>>Martin Duerst <duerst@w3.org> writes:
>>
>> > NFC is only required in the current draft when encoding something
>> > from e.g. the side of a bus, or when transcoding it from a non-
>> > Unicode encoding. This is only to provide a base level of
>> > predictability.
>>
>>Section 2.4 says "IRIs SHOULD be created using Normalization Form C
>>(NFC)."  I don't interpret that to only apply to the scenarios you
>>mention, rather it seems to imply that whenever an IRI is created, by
>>whatever process, NFC should be used.  Perhaps it can be clarified?
>>
>> > Still indeed this could lead to security problems with the
>> > mechanisms you describe above, if there are two users that have user
>> > names that only differ in normalization. But it would seem to me
>> > that in this case, the security issue comes from these mechanisms
>> > (or the actual use with these specific user names) rather than from
>> > IRIs.
>>
>>If this is so, I believe the IRI security considerations should
>>mention this so that people can be aware of it, and abstain from
>>deploying IRIs in systems that behave in this way, since doing so
>>would introduce problems.  Perhaps some properly worded text could
>>be derived from the following strawman?
>>
>>    Whenever fields of an IRI are normalized, the octet representation
>>    is modified.  While this is unavoidable if ambiguities are to be
>>    resolved, it can raise security issues in some situations.  In
>>    particular, if an iuserinfo field is normalized, a security protocol
>>    expecting a certain byte sequence as a username may receive a
>>    different one.  This can lead to interoperability failures, but
>>    also more serious failures in systems, e.g. when the system
>>    performs authorization based on the username.
>>
>>Thanks.
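
Illustrative note on the strawman text above: the following is a minimal
Python sketch, not part of the draft, showing how NFC normalization can
change the octets of a username carried in the iuserinfo field. The
username "josé", the helper percent_encode, and the accounts table are
hypothetical and exist only for this example.

```python
import unicodedata
from urllib.parse import quote

# Hypothetical username, written in two canonically equivalent ways.
precomposed = "jos\u00e9"    # U+00E9 LATIN SMALL LETTER E WITH ACUTE
decomposed = "jose\u0301"    # "e" followed by U+0301 COMBINING ACUTE ACCENT


def percent_encode(text: str) -> str:
    """Percent-encode the UTF-8 octets of text, as in an IRI-to-URI mapping."""
    return quote(text.encode("utf-8"))


# The two spellings look the same when rendered but are different strings.
assert precomposed != decomposed

# NFC maps the decomposed spelling onto the precomposed one.
normalized = unicodedata.normalize("NFC", decomposed)
assert normalized == precomposed

print(percent_encode(decomposed))   # jose%CC%81
print(percent_encode(normalized))   # jos%C3%A9

# Hypothetical account table keyed on raw octets, standing in for a
# protocol that compares usernames byte for byte without normalizing.
accounts = {
    decomposed.encode("utf-8"): "account registered with decomposed name",
    precomposed.encode("utf-8"): "account registered with precomposed name",
}

# A client that applies NFC before sending reaches a different account
# than the one the user typed.
typed = accounts[decomposed.encode("utf-8")]
sent = accounts[normalized.encode("utf-8")]
assert typed != sent
```

If the account table held only the decomposed key, the normalized lookup
would fail outright, which corresponds to the interoperability failure
mentioned in the strawman; with both keys present, it silently selects
the other account.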
Received on Sunday, 21 March 2004 13:53:14 UTC