[whatwg] input element's value should not be sanitized during parsing

On Fri, 11 Mar 2011, Jonas Sicking wrote:
> On Tue, Dec 28, 2010 at 11:46 PM, Ian Hickson <ian at hixie.ch> wrote:
> > On Mon, 20 Sep 2010, Mounir Lamouri wrote:
> >>
> >> With the current specification, these two elements will not have the
> >> same value:
> >> <input value="foo&#13;bar" type='hidden'>
> >> <input type='hidden' value="foo&#13;bar">
> >
> > Yes they will. The attribute order has no effect. Elements are created 
> > by the parser with their attributes already set:
> >
> > # When the steps below require the UA to create an element for a token in
> > # a particular namespace, the UA must create a node implementing the interface
> > # appropriate for the element type corresponding to the tag name of the
> > # token in the given namespace (as given in the specification that defines
> > # that element, e.g. for an a element in the HTML namespace, this
> > # specification defines it to be the HTMLAnchorElement interface), with
> > # the tag name being the name of that element, with the node being in the
> > # given namespace, and with the attributes on the node being those given
> > # in the given token.
> >  -- http://www.whatwg.org/specs/web-apps/current-work/complete.html#create-an-element-for-the-token
> Except that I don't think this is how any implementation actually works. 
> Nor do I have any desire to write the implementation this way since it 
> means duplicating a lot of code. I'd have to add code which implemented 
> attribute behavior both in some special code path triggered during 
> element creation, as well as code to react to attribute changes 
> made through setAttribute/removeAttribute.
> So far this hasn't been needed and the parsing code basically just calls 
> setAttribute. Unless there are really good reasons to change this I'd 
> like to avoid it. So far I haven't heard of any such reasons.

The spec is defined such that attribute setting during element creation is 
order-agnostic. I believe this is consistent with what authors expect (in 
part based on the confusion I've seen when authors run into cases where 
that isn't the case). How you implement that is somewhat orthogonal to how 
it is specced; if there are specific things that are hard to implement, 
I'm happy to discuss them specifically if you like.
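
To make the order-agnostic reading concrete, here is a minimal sketch in 
plain JavaScript. It is a toy model, not the parser algorithm or any 
browser's code: createElementForToken and sanitize are hypothetical names, 
and the rule "strip line breaks unless the type is hidden" is an assumption 
made for illustration. All attributes from the token are stored first, and 
the derived state is computed once afterwards:

```javascript
// Toy model only: not the HTML parser or any browser's implementation.
// Assumed rule: in the default (text) state, line breaks are stripped from
// the value; in the hidden state, the value is left untouched.
function sanitize(type, value) {
  return type === "hidden" ? value : value.replace(/[\r\n]/g, "");
}

// Hypothetical helper: build an element from a token. Every attribute is
// stored first; derived state (type, value) is computed once at the end.
function createElementForToken(token) {
  const attributes = new Map(token.attributes); // copy of [name, value] pairs
  const type = attributes.get("type") ?? "text";
  const value = sanitize(type, attributes.get("value") ?? "");
  return { tagName: token.tagName, attributes, type, value };
}

// "&#13;" in the markup is a carriage return, written "\r" here.
const a = createElementForToken({
  tagName: "input",
  attributes: [["value", "foo\rbar"], ["type", "hidden"]],
});
const b = createElementForToken({
  tagName: "input",
  attributes: [["type", "hidden"], ["value", "foo\rbar"]],
});

console.log(a.value === b.value); // true: attribute order has no effect
```

In this model the code that derives state from the attributes runs in one 
place, after all attributes are present, so it does not need to be 
duplicated per attribute.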

> > On Tue, 21 Sep 2010, Boris Zbarsky wrote:
> >>
> >> Where does it say that it's atomic? I don't see that anywhere (and 
> >> in fact, the "create an element" code in the Gecko parser is most 
> >> decidedly non-atomic). Now maybe the spec intends this to be an 
> >> atomic operation; if so it needs to say that.
> >
> > The operation it describes is a single operation: create a node. It 
> > describes various constraints on that operation, one of which is that 
> > the node have the various tokenised attributes set. I don't understand 
> > how creating a node could be anything other than atomic -- either it 
> > exists or it does not.
> You're expecting several operations to happen at the same time. We could 
> certainly manually insert the attributes and their value into the 
> data structure inside the element which stores the attribute name/value 
> pairs. However at some point we need to update all of the state that 
> these values drive. Things like sticking elements into id-hashes, 
> storing the calculated type of an input, calculating the effective URI 
> of an image, etc. This involves several separate pieces of state and so 
> can't happen "all at the same time".

Sure. When those things happen is defined by the spec too.

> > On Tue, 21 Sep 2010, Jonas Sicking wrote:
> >>
> >> Also, it would mean that the following two pieces of code behave differently:
> >>
> >> inp = document.createElement("input");
> >> inp.setAttribute("value", "foo\nbar");
> >> inp.setAttribute("type", "hidden");
> >>
> >> and
> >>
> >> inp = document.createElement("input");
> >> inp.setAttribute("type", "hidden");
> >> inp.setAttribute("value", "foo\nbar");
> >>
> >> This does not seem desirable.
> >
> > I can't argue that it's desirable, but it's how the Web works, as I 
> > understand it.
> Gecko doesn't exhibit this behavior and I don't know of any sites that 
> don't work in Gecko because of this.

On Wed, 30 Mar 2011, Mounir Lamouri wrote:
> FWIW, it does. The first inp.value is 'foobar' while the second is 'foo 
> bar'. See: 
> http://software.hixie.ch/utilities/js/live-dom-viewer/saved/900
> Though, I do not think this is related to the initial issue which is 
> about setting attributes while creating the element from the parser.

Right, the behaviour is different when the parser does it. This is per 
spec, and seems to match what Firefox does.
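
For contrast, the script case quoted above can be sketched as a toy model 
in plain JavaScript. ModelInput and sanitize are hypothetical names, and 
two things are assumed rather than taken from the spec text: that line 
breaks are stripped unless the type is hidden, and that the stored value 
is re-sanitized on every attribute change. Under those assumptions the 
order of the setAttribute calls is observable:

```javascript
// Toy model only: assumed sanitization rule, assumed "sanitize on every
// attribute change". Not the spec algorithm or any browser's code.
function sanitize(type, value) {
  return type === "hidden" ? value : value.replace(/[\r\n]/g, "");
}

class ModelInput {
  constructor() {
    this.type = "text"; // default state before any type attribute is set
    this.value = "";
  }
  setAttribute(name, newValue) {
    if (name === "type") this.type = newValue;
    if (name === "value") this.value = newValue;
    // Re-sanitize the stored value against whatever the type is right now.
    this.value = sanitize(this.type, this.value);
  }
}

// value first: the line break is stripped while the type is still text.
const first = new ModelInput();
first.setAttribute("value", "foo\nbar");
first.setAttribute("type", "hidden");

// type first: the element is already hidden when the value arrives.
const second = new ModelInput();
second.setAttribute("type", "hidden");
second.setAttribute("value", "foo\nbar");

console.log(JSON.stringify(first.value));  // "foobar"
console.log(JSON.stringify(second.value)); // "foo\nbar"
```

Once the line break has been stripped it cannot be recovered by a later 
type change, which is why the first sequence ends up with a different 
value in this model.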

Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'

Received on Tuesday, 14 June 2011 14:00:47 UTC