
phishing in IRIs (was: Re: Using Punicode for host names in IRI -> URI translation; phishing; comparison)

From: Martin J. Dürst <duerst@it.aoyama.ac.jp>
Date: Mon, 23 Nov 2009 21:11:16 +0900
Message-ID: <4B0A7BE4.4000301@it.aoyama.ac.jp>
To: Larry Masinter <masinter@adobe.com>
CC: "PUBLIC-IRI@W3.ORG" <PUBLIC-IRI@w3.org>, Pete Resnick <presnick@qualcomm.com>, Ted Hardie <ted.ietf@gmail.com>
On 2009/11/18 3:56, Larry Masinter wrote:
> These are "strawman" proposals in response to the IAB talk at the IETF meeting last week: knock down if you can.

>   1.  Spoofing
> Secondly, there are a number of concerns raised about spoofing. Of course, spoofing is an issue with just ASCII too,
> example.com vs example.corn  being difficult to distinguish, (never mind example.C0M).

I think that this part of the IAB presentation was mainly to make people 
aware of the issues. The IAB presentation did not contain any 
conclusions on this issue, neither in terms of actual directions for 
solutions nor even in terms of "specs have to address this".

> The observation is that there are many ways in which names can be formed for which there is NO visible distinction between what are separate unicode encodings.
> The main way I think of addressing these are:
>   1.  Visual validation of URIs and IRIs is basically *NOT EFFECTIVE*
 > and that user agents *SHOULD NOT USE* visual validation
 > as the primary way of preventing spoofing.

I'm not exactly sure what you mean by "visual validation". Isn't this 
something the user does, rather than the user agent?
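
A minimal sketch of the kind of invisible distinction at issue, assuming Python 3 and its built-in "idna" codec (which implements IDNA 2003, not the IDNAbis drafts). The spoofed host name and the helper name to_ascii are made up for illustration. The two names render almost identically, but only a comparison of code points, or of the ASCII (Punycode) form, tells them apart:

```python
# Two host names that look alike on screen but differ in code points.
latin_host = "example.com"        # all Latin letters
spoof_host = "\u0435xample.com"   # first letter is CYRILLIC SMALL LETTER IE (U+0435)

def to_ascii(host: str) -> bytes:
    """Convert a host name to its ASCII form with Python's IDNA 2003 codec."""
    return host.encode("idna")

def same_host(a: str, b: str) -> bool:
    """Compare hosts by their ASCII form rather than by appearance."""
    return to_ascii(a) == to_ascii(b)

if __name__ == "__main__":
    print(to_ascii(latin_host))   # plain ASCII, unchanged
    print(to_ascii(spoof_host))   # label gets an "xn--" prefix
    print(same_host(latin_host, spoof_host))
```

Whether a human reader or the user agent performs the check, it is the encoded form, not the rendered glyphs, that makes the difference detectable.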

 > Other methods for protecting against phishing *MUST* be used.
 > I think we can point to some of the techniques that browsers currently
 > already deploy as alternatives, without making them normative.

There are activities carried out with IRIs where it's absolutely 
inappropriate to require some protection against phishing. So the above 
MUST has to at least be carefully contained.

Also, it should be noted that the main attack vectors for 
phishing/spoofing are IDNs, not IRIs in general. True, there's some 
potential for attacks also in cases such as

It should also be mentioned that the IDNAbis WG has an explicit 
provision in their charter that phishing in general is out of scope
(from http://www.ietf.org/dyn/wg/charter/idnabis-charter.html):

There are a variety of generally unsolvable problems, notably the
problem of characters that are confusingly similar in appearance (often
known as the "phishing" problem) that are not specifically part of the
scope of the WG although some of the preliminary results of the design
team suggest that the improvements contemplated in the specifications
might mitigate some of the ways in which the current IDNA specifications
can be abused for phishing purposes.

This provision helped the WG to avoid getting into ratholes. I was 
thinking about a similar provision for our charter.

There is one open issue, 
http://www.w3.org/International/iri-edit/#transcodeNFC-103, which (in 
its wider sense) is about when and where to use Unicode normalization. 
The current tendency seems to be to get rid of the requirement to use NFC in 
certain cases, because this hasn't been implemented.
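
To illustrate why the normalization question matters, here is a small sketch using Python's standard unicodedata and urllib.parse modules. The path segment "café" and the helper names are assumptions chosen for the example and do not come from the issue text. The same visible string can be a precomposed or a decomposed sequence, and without NFC the two percent-encode to different URIs:

```python
import unicodedata
from urllib.parse import quote

composed = "caf\u00e9"      # 'é' as a single code point, U+00E9
decomposed = "cafe\u0301"   # 'e' followed by COMBINING ACUTE ACCENT, U+0301

def to_uri_segment(segment: str) -> str:
    """Percent-encode the UTF-8 bytes of a segment, with no normalization."""
    return quote(segment, safe="")

def to_uri_segment_nfc(segment: str) -> str:
    """Apply NFC first, then percent-encode."""
    return quote(unicodedata.normalize("NFC", segment), safe="")

if __name__ == "__main__":
    print(to_uri_segment(composed))        # caf%C3%A9
    print(to_uri_segment(decomposed))      # cafe%CC%81
    print(to_uri_segment_nfc(decomposed))  # caf%C3%A9
```

Without the NFC step the two inputs yield different URIs. With it they coincide, which is the behavior the requirement was meant to guarantee.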

Other than that, I think phishing/spoofing mainly belongs in the 
security section (where it already is, see 

>   2.  Anyone who prints an IRI on the side of a bus or a matchbook
 > cover has the responsibility of making sure that what they print
 > can be typed in a way that leads to an unambiguous result.
 > Currently this advice only applies to ASCII-only URIs, and the
 > extension to other non-ASCII URIs depends on infrastructure that is
 > NOT currently part of, or mandated by, or appropriate for, the IRI
 > specification.

I don't understand this. Where in RFC 3986 is there such a 
responsibility or advice? In what sense would that advice not apply to 
IRIs? The above point doesn't really have to be made overly explicit 
in the spec in order to be followed, because people who don't follow it 
will just hurt themselves (the sites they want to be reached won't be 

 > Unfortunately, the implementation advice on how to generate an
 > unambiguously-enterable IRI depends on technology deployment which

Can you tell us what you are referring to here? I have some ideas for 
what you might mean. In some cases, I may agree, in others, I may disagree.

> and nothing we can specify in the IETF will
 > make it happen sooner.  We can give some advice that will mitigate
 > a few of the problems, but so few that making that advice
 > normative isn't actually helpful.

Regards,   Martin.

#-# Martin J. Dürst, Professor, Aoyama Gakuin University
#-# http://www.sw.it.aoyama.ac.jp   mailto:duerst@it.aoyama.ac.jp
Received on Monday, 23 November 2009 12:12:21 GMT
