[whatwg] DOMTokenList feedback from Ian Hickson on 2009-07-31 (public-whatwg-archive@w3.org from July 2009)

From: Ian Hickson <ian@hixie.ch>
Date: Fri, 31 Jul 2009 01:38:44 +0000 (UTC)
Message-ID: <Pine.LNX.4.62.0907310129210.6420@hixie.dreamhostps.com>
On Mon, 20 Jul 2009, Sylvain Pasche wrote:
> 
> 1) What's the reason for preserving whitespace when a DOMTokenList 
> method is changing the attribute?

As a general rule, I try to make the APIs as minimally invasive as 
possible. Whenever we have failed to do this, we end up confusing authors 
-- for example, look at the confusion that has been caused by the .style 
attribute reserialising the underlying CSS instead of just preserving it.


> 2) If preserving whitespace is not important, what about normalizing 
> whitespace during mutation?
> 
> By normalizing whitespace, I mean splitting tokens (keeping unique 
> ones), doing the DOMTokenList add/remove/toggle operation, and joining 
> tokens together separated by a whitespace.

If you pre-split the tokens, I guess you could keep pointers into 
the underlying string around to make editing the string faster. In 
general, though, these strings are so short that I'd expect this to be 
more or less a wash either way.
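
For reference, the normalizing approach described in the question (split, 
keep unique tokens, mutate, re-join with single spaces) could be sketched 
roughly as follows. This is only an illustration, not spec text or any 
browser's code: the function names are hypothetical, and splitting with 
operator>> uses the C locale's notion of whitespace, which is slightly 
broader than HTML's set of space characters.

```cpp
#include <algorithm>
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Split an attribute value on whitespace, keeping only the first
// occurrence of each token (hypothetical helper, not from the spec).
std::vector<std::string> splitUnique(const std::string& value) {
    std::vector<std::string> tokens;
    std::istringstream in(value);
    std::string tok;
    while (in >> tok) {
        if (std::find(tokens.begin(), tokens.end(), tok) == tokens.end())
            tokens.push_back(tok);
    }
    return tokens;
}

// Join tokens with exactly one space between neighbours.
std::string joinTokens(const std::vector<std::string>& tokens) {
    std::string out;
    for (const std::string& t : tokens) {
        if (!out.empty()) out += ' ';
        out += t;
    }
    return out;
}

// Normalizing remove: the original spacing of the value is discarded.
std::string normalizingRemove(const std::string& value,
                              const std::string& token) {
    assert(!token.empty());
    std::vector<std::string> tokens = splitUnique(value);
    tokens.erase(std::remove(tokens.begin(), tokens.end(), token),
                 tokens.end());
    return joinTokens(tokens);
}

// Normalizing add: append the token only if it is not already present.
std::string normalizingAdd(const std::string& value,
                           const std::string& token) {
    assert(!token.empty());
    std::vector<std::string> tokens = splitUnique(value);
    if (std::find(tokens.begin(), tokens.end(), token) == tokens.end())
        tokens.push_back(token);
    return joinTokens(tokens);
}
```

With this sketch, normalizingRemove("  a  b   a c ", "b") yields "a c": 
the duplicate "a" and all of the original spacing are gone, which is 
exactly the invasiveness being weighed above.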


On Wed, 22 Jul 2009, Anne van Kesteren wrote:
> On Mon, 13 Jul 2009 05:16:19 +0200, Ian Hickson <ian at hixie.ch> wrote:
> > On Mon, 15 Jun 2009, Adam Roben wrote:
> >> Should methods of element.classList treat their arguments
> >> case-insensitively in quirks mode? I think they should. This should be
> >> mentioned in the spec.
> >
> > I've clarified that DOMTokenList is always case-sensitive. We don't 
> > want to be adding more quirk-mode behaviours. Using quirks mode is not 
> > conforming (i.e. it's not a supported behaviour).
> 
> Implementation-wise that does not seem nice if you want to use the same 
> optimized object when dealing with style sheets or 
> getElementsByClassName(). Alternatively we could require I suppose that 
> in quirks mode class names are normalized to be lowercase or some such 
> and keep getElementsByClassName() and classList case-sensitive...

I don't follow. Which object are you going to reuse?
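
To make the "always case-sensitive" behaviour concrete, here is a minimal 
sketch of a membership test over a class attribute value. The name 
containsToken is hypothetical, and splitting with operator>> is a 
simplification of the spec's space-character rules.

```cpp
#include <sstream>
#include <string>

// Returns true only when some whitespace-separated token in `value`
// is byte-for-byte equal to `token`; no case folding in any mode.
bool containsToken(const std::string& value, const std::string& token) {
    std::istringstream in(value);
    std::string current;
    while (in >> current) {
        if (current == token) return true;
    }
    return false;
}
```

Under this sketch, containsToken("Foo bar", "Foo") is true while 
containsToken("Foo bar", "foo") is false, regardless of document mode.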


On Mon, 27 Jul 2009, Jonas Sicking wrote:
> 
> It's certainly doable to do this at the time when the token-list is 
> parsed. However given how extremely rare duplicated classnames are (I 
> can't recall ever seeing it in a real page), I think any code spent on 
> dealing with it is a waste.

Yeah.


On Tue, 28 Jul 2009, Sylvain Pasche wrote:
> Jonas:
> >> The remove() algorithm is about 50 lines with whitespace and 
> >> comments. After all, that's not a big cost and I guess that 
> >> preserving whitespace may be closer to what DOMTokenList API 
> >> consumers would expect.
> >
> > The code would be 7 lines if we didn't need to preserve whitespace:
> >
> > nsAttrValue newAttr(aAttr);
> > newAttr->ResetMiscAtomOrString();
> > nsCOMPtr<nsIAtom> atom = do_GetAtom(aToken);
> > while (newAttr->GetAtomArrayValue().RemoveElement(atom));
> > nsAutoString newValue;
> > newAttr.ToString(newValue);
> > mElement->SetAttr(...);
> >
> > If you spent a few more lines of code you could even avoid serializing
> > the token-list and call SetAttrAndNotify instead of SetAttr.
> 
> That's an interesting comparison. Less code and much more readable than 
> my remove() implementation I have to say.

I'm somewhat reluctant to make the DOMTokenList API destructive.


On Mon, 27 Jul 2009, Jonas Sicking wrote:
> >
> > In general, I try to be as conservative as possible in making changes 
> > to the DOM. Are the algorithms really as complicated as you're making 
> > out? They seem pretty trivial to me.
> 
> At least in the gecko implementation it's significantly more complex 
> than not normalizing whitespace. The way that the implementation works 
> in gecko is:
> 
> When a class attribute is set, (during parsing or using setAttribute)
> we parse the classlist into a list of tokens. We additionally keep
> around the original string in order to preserve a correct DOM
> (actually, as an optimization, we only do this if needed).
> 
> This token-list is then used during Selector matching and during
> getElementsByClassName.
> 
> So far I would expect most implementations to match this.
> 
> It would be very nice if DOMTokenList could be implemented as simply
> exposing this internal token list. With the recent change to not
> remove duplicates, reading operations like .length and .item(n) map
> directly to reading from this token list. All very nice.
> 
> However writing operations such as .add and .remove requires operating 
> on the string rather than the internal token-list. The current spec 
> requires .remove to duplicate the tokenization process (granted, a 
> pretty simple task) and modify the string while tokenizing. It would be 
> significantly simpler if you could just modify the token-list as needed 
> and then regenerate the string from the token-list.

I've left it as is for now, but if other implementors agree that it would 
be significantly better to change it to normalise whitespace each time, I 
don't feel too strongly about it.

(We're agreed that removing would remove all duplicates, and that the 
order would be preserved, right?)
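
For comparison, a whitespace-preserving remove that edits the string 
while tokenizing might look roughly like the sketch below. It removes 
every occurrence of the token, keeps the remaining tokens in order, and 
leaves whitespace untouched everywhere except around a removed token. The 
exact rule used there (collapse to a single space, or to nothing at 
either end of the string) and the names isHtmlSpace and preservingRemove 
are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// HTML space characters: space, tab, LF, FF, CR.
bool isHtmlSpace(char c) {
    return c == ' ' || c == '\t' || c == '\n' || c == '\f' || c == '\r';
}

// Remove all occurrences of `token` from `input`, preserving the
// whitespace between tokens that are kept.
std::string preservingRemove(const std::string& input,
                             const std::string& token) {
    assert(!token.empty());
    std::string output;
    std::size_t pos = 0;
    while (pos < input.size()) {
        // Whitespace is copied through verbatim.
        if (isHtmlSpace(input[pos])) {
            output += input[pos++];
            continue;
        }
        // Collect one run of non-space characters.
        std::size_t start = pos;
        while (pos < input.size() && !isHtmlSpace(input[pos])) ++pos;
        if (input.compare(start, pos - start, token) != 0) {
            output.append(input, start, pos - start);  // keep this token
            continue;
        }
        // Matching token: drop it along with the whitespace on both
        // sides, then put back a single space if tokens remain on
        // both sides of the gap.
        while (pos < input.size() && isHtmlSpace(input[pos])) ++pos;
        while (!output.empty() && isHtmlSpace(output.back()))
            output.pop_back();
        if (pos < input.size() && !output.empty()) output += ' ';
    }
    return output;
}
```

Here preservingRemove("  x   y  b  z ", "b") gives "  x   y z ", and 
preservingRemove("b a b c b", "b") gives "a c": all duplicates are 
removed and the order of the survivors is unchanged.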

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'
Received on Thursday, 30 July 2009 18:38:44 UTC