- From: Ian Hickson <ian@hixie.ch>
- Date: Wed, 6 Jun 2007 22:20:18 +0000 (UTC)
On Fri, 10 Mar 2006, Alexey Feldgendler wrote:
>
> Does the current version of the spec define what happens to elements
> with duplicate ID values?

No. It's something we should consider for fixes to DOM3 Core, though.

> The problem of duplicate ID isn't just another issue where it's nice to
> have some well-defined error recovery just for uniformity. There are
> cases when duplicate IDs should be viewed as a security concern.
>
> Consider a script which augments the HTML page after it has been parsed
> by attaching event listeners to elements in the DOM tree, inserting new
> nodes into the tree etc. This is common practice, for example, for many
> web-based WYSIWYG editors. In this scenario, any method the script uses
> for identification of the DOM nodes subject to augmentation is
> vulnerable to possible spoofing by user-supplied content present on the
> same page.
>
> For example, imagine a script which finds a button by ID and attaches an
> event listener to it. A possible markup looks like this:
>
>    <div>
>    ...blog entry body...
>    </div>
>    <button id="addtomemories">Add this entry to memories</button>
>    <script>
>    document.getElementById('addtomemories').addEventListener('click',
>        doSomeNiceAJAX);
>    </script>
>
> So, a malicious blog author can make the following entry:
>
>    I have found a <a href="#" id="addtomemories">cool website</a>.
>
> Depending on how the browser handles duplicate IDs, either or both of
> the following unwanted effects may occur:
> 1. Clicking the link in the blog entry adds the entry to memories list
>    of the reader.
> 2. Clicking the real "Add this entry to memories" button does nothing.
>
> One can think of other examples, possibly more dangerous. Other methods
> of identification (by tag name, by class, by CSS selector as proposed
> recently) are also vulnerable.
>
> This kind of attack is hard to circumvent through use of HTML cleaners
> because id="addtomemories" looks like an innocent attribute, like an
> anchor for navigation.

It's not that hard to avoid. You can whitelist which attribute values are
allowed (e.g. only id values consisting of "comment" followed by the
comment number followed by 1 to 10 characters in the range a-z).

> Preventing such attacks by a HTML cleaner would require either making a
> full list of all "forbidden" IDs, class names etc, or imposing Draconian
> rules upon user-supplied content, completely disallowing such useful
> attributes like id and class.

I'm not really convinced there's that much use in user-supplied IDs and
classes, but the rules needn't be that draconian. The server could
automatically prepend the commentN string to IDs and classes.

To be safe, a server's cleaning code must whitelist everything -- elements,
attribute names, attribute values, element contents, etc. It's not trivial,
but that's no excuse for not doing it.

> Necessary but not sufficient. Duplicate IDs aren't caught by a
> validating parser, so custom code is needed to enforce many of the
> requirements. For example, if one was trying to ensure that all IDs are
> unique, then the ID values within the user-supplied code would have to
> be checked for duplicates among them, too.

This is already the case, yes.

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'
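
The whitelist-and-prefix approach described in the reply could be sketched roughly as follows. This is only an illustration under stated assumptions, not code from the thread: the function names `namespace_id` and `clean_ids` are hypothetical, and the sketch assumes the server has already extracted the user-supplied id values from the comment markup. It accepts only 1 to 10 characters in the range a-z, prepends "comment" plus the comment number, and drops duplicates within the same comment.

```python
import re
from typing import List, Optional

# Whitelist for the user-chosen part of an ID: 1 to 10 characters, a-z only.
USER_ID_PATTERN = re.compile(r"[a-z]{1,10}")


def namespace_id(comment_number: int, user_id: str) -> Optional[str]:
    """Return "comment<N><user_id>", or None if user_id is not whitelisted.

    Hypothetical helper: the name and signature are assumptions made
    for this sketch.
    """
    if USER_ID_PATTERN.fullmatch(user_id) is None:
        return None
    return f"comment{comment_number}{user_id}"


def clean_ids(comment_number: int, user_ids: List[str]) -> List[str]:
    """Namespace every acceptable ID and drop rejected or duplicate ones."""
    seen = set()
    cleaned = []
    for user_id in user_ids:
        namespaced = namespace_id(comment_number, user_id)
        if namespaced is None or namespaced in seen:
            continue  # not on the whitelist, or a duplicate within this comment
        seen.add(namespaced)
        cleaned.append(namespaced)
    return cleaned
```

Because every surviving ID starts with "comment" followed by the comment number, it cannot collide with a page-level ID such as "addtomemories" as long as the site's own IDs never use that prefix. Under this particular whitelist the spoofed ID from the example would also be rejected outright, since it is longer than 10 characters.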
Received on Wednesday, 6 June 2007 15:20:18 UTC