Re: [whatwg] Sortable Tables

On Fri, 28 Dec 2012, Stuart Langridge wrote:
> > > 
> > > Sorttable also allows authors to specify "alternate content" for a 
> > > cell. <td sorttable_customkey="11">eleven</td>
> >
> > <td><data value="11">eleven</data></td>
> >
> > > The sorttable.js solution is to specify a "custom key", which 
> > > sorttable pretends was the cell content for the purposes of sorting, 
> > > so <td sorttable_customkey="20121107-100000">Wed 7th November, 
> > > 10.00am GMT</td> and then the script can sort it.
> >
> > <td><time datetime="2012-11-07T10:00Z">Wed 7th November, 10.00am 
> > GMT</time></td>
> 
> I can see using <data> for this, because it's deliberately semantically 
> meaningless (right?)

I wouldn't say it's semantically meaningless, but sure.


> but <time> is more of a problem if you have multiple things in one cell. 
> For example, one semi-common pattern is to put some data and an input 
> type=checkbox in a single cell, like
> <td>Wed 7th November, 10.00am GMT <input type="checkbox" name="whatever"></td>

Why can't the checkbox be in a separate cell?


> Using <data> to wrap the whole cell is OK, but using <time> to wrap a 
> bunch of non-time content isn't, really. In this situation would you 
> recommend
> <td><data value="2012-11-07T10:00Z"><time datetime="2012-11-07T10:00Z">Wed 7th November, 10.00am GMT</time></data></td>
> which seems rather redundant to me?

I would recommend using two cells, but you could do that too. It would 
mean the keys were compared as strings, though, rather than as datetimes. 
Things wouldn't work if you mixed <data> and <time> elements with those 
values (e.g. if some cells didn't have checkboxes and so you used just 
<time> in some cases), since strings sort after times.
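
A minimal sketch of that ordering problem, assuming a hypothetical key 
model in which a <time> cell produces a datetime key and a <data> cell 
produces a string key, and assuming (as described above) that every 
datetime key sorts before every string key. The names timeKey, stringKey 
and compareKeys are made up for illustration and are not part of any API:

```javascript
// Hypothetical key constructors: one per kind of cell content.
function timeKey(iso) { return { kind: 'time', value: Date.parse(iso) }; }
function stringKey(s) { return { kind: 'string', value: s }; }

// Assumed ordering: any datetime key sorts before any string key;
// keys of the same kind are compared with each other normally.
function compareKeys(a, b) {
  if (a.kind !== b.kind) return a.kind === 'time' ? -1 : 1;
  if (a.kind === 'time') return a.value - b.value;
  return a.value < b.value ? -1 : a.value > b.value ? 1 : 0;
}

const mixedKeys = [
  stringKey('2012-11-07T10:00Z'), // cell with a checkbox, wrapped in <data>
  timeKey('2012-11-09T10:00Z'),   // plain <time> cell
  stringKey('2012-11-08T10:00Z'), // cell with a checkbox, wrapped in <data>
  timeKey('2012-11-06T10:00Z'),   // plain <time> cell
];
const sorted = mixedKeys.slice().sort(compareKeys);
console.log(sorted.map(k => k.kind));
```

Under these assumptions both string keys land after both datetime keys, 
even though the dates they spell out fall between them, so a column that 
mixes the two forms does not end up in chronological order.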


> > > and this, like many other things on this list, suggests that some 
> > > sort of "here is the JavaScript function I want you to use to 
> > > produce sort keys for table cells in this column" function is a 
> > > useful idea. Sorttable allows this, and people use it a lot.)
> >
> > I tried to do this but couldn't figure out a sane way to do it. A 
> > comparator can totally destroy the table we're sorting, and I don't 
> > know what to do if that happens.
> 
> As in, you specify that there's a comparator function and then the
> sorter passes the comparator function two TD elements for comparing,
> and the comparator function looks like this?
> function comparator(td1, td2) { td1.parentNode.removeChild(td1); }

Right. Or worse (e.g. moving cells around on rows that have been 
compared before).

Also it totally destroys any ability to cache information per-row, which 
I think would be disastrous given how much work it takes to compare rows.
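
As a rough illustration of the caching in question (not the spec's 
algorithm), a sorter can compute each row's key once and then compare only 
the cached keys. The row objects, computeKey and sortWithCachedKeys are 
hypothetical names, and the third row is made-up sample data:

```javascript
let keyComputations = 0;

// Stand-in for the expensive step of deriving a sort key from a row.
function computeKey(row) {
  keyComputations++;
  return Number(row.age);
}

// Compute every key exactly once, sort on the cached keys, then unwrap.
function sortWithCachedKeys(rows) {
  const decorated = rows.map(row => ({ row, key: computeKey(row) }));
  decorated.sort((a, b) => a.key - b.key);
  return decorated.map(d => d.row);
}

const pirateRows = [
  { name: 'Sparrow', age: '32' },
  { name: 'Read', age: '25' },
  { name: 'Bonny', age: '28' }, // made-up row for illustration
];
const byAge = sortWithCachedKeys(pirateRows);
console.log(byAge.map(r => r.name), keyComputations);
```

A comparator that is handed the cells themselves, and may change them, 
means the cached keys can no longer be trusted between comparisons.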


> On the other hand, surely I could make the same argument about any 
> handler, right? If you put <script>document.body.innerHTML += 
> "hahaha!"</script> as a child of <body>, browsers used to crash (because 
> it's an infinite loop), and the implementor response boiled down to 
> "don't do that", at least at first.

Only at first, because it wasn't tenable. We had to eventually define what 
happens, exactly.


> It's hard to see how such a "malicious" script could get into a page 
> without author knowledge --

It might well be with author knowledge. More likely it's a bug in their 
code.


> of course, an author might include a third-party script which does this 
> to destroy a page, but the same third-party script could set 
> document.body.innerHTML to "0wned" which is even more destructive of 
> page content.

That's not really a problem. The problem is making sure that the algorithm 
is stable in the face of crazy comparators, because any lack of stability 
could lead to security bugs (e.g. if you make it crash somehow, and can 
use that to run arbitrary code).


> It would be reasonable, I think, for the sort process to halt 
> uncompleted if a comparator function destroys the things it's comparing, 
> although perhaps your concern is that it's hard to know *whether that 
> happened* (since it might just reparent them to a different table or 
> something)?

It's hard to detect cheaply, certainly.


> Maybe pass a cloneNode of each TD?

Too expensive (what if one of the nodes is a 24 MB image, or a plugin?).


> Or have the sorter work out the sortable *value* of the field (from the 
> content, or the <data value> wrapper) and then pass the values, not the 
> actual cells? Then the comparator can't destroy anything.

It seems to me like that doesn't give you anything that you couldn't do by 
just setting the keys manually on the table before the sort happens (which 
you can do easily onsort="" in the current model).


> > > 13. What happens if a table has multiple <tbody> elements? Do they 
> > > sort as independent units, or mingle together? Sorttable just sorts 
> > > the first one and ignores the rest, because multiple tbodies are 
> > > uncommon, but that's not really acceptable ;-)
> >
> > Independent.
> 
> Hm. They can sort independently, no problem, but how does a user command 
> a sort of one tbody and not the rest?

They can't.


> All the tbodies will identify the same thead tr as their highest one. 
> This suggests that if you've got multiple tbodies in a sortable table 
> and you want the user to be able to sort one tbody independently (and 
> not sort the rest), you should not have a thead at all. We are a long, 
> long way out into unusual-use-case world here, though, so maybe that's 
> OK. Also see the next point.

I meant sort independently as in within each tbody, the rows get sorted, 
but all the tbodies are sorted at the same time.
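
A small sketch of that behavior, modeling a table as an array of tbodies 
and each tbody as an array of row keys. This model and the function name 
sortBodiesIndependently are assumptions for illustration only:

```javascript
// Sort the rows of every tbody in one pass; rows never cross tbodies.
function sortBodiesIndependently(bodies, compare) {
  return bodies.map(body => body.slice().sort(compare));
}

const tableBodies = [[32, 25, 41], [19, 60], [7]];
const sortedBodies = sortBodiesIndependently(tableBodies, (a, b) => a - b);
console.log(JSON.stringify(sortedBodies));
```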


> > > 14. Fixed-position rows. Many authors have a "totals" row at the 
> > > bottom of their table which should remain at the bottom of the table 
> > > even after sorting, which is easily handled (that's what <tfoot> is 
> > > for), but some authors also have rows midway through the table which 
> > > are "headers": this especially shows up in long tables, where the 
> > > column headers from <thead> are repeated midway down the table and 
> > > should remain in position even when the table is sorted. In general 
> > > this means that they should remain the same number of rows away from 
> > > <thead>. This case is odd, and sorttable.js doesn't handle it, but 
> > > lots of people ask for it.
> >
> > <tfoot> is supported as suggested. Haven't done it for the mid-rows. 
> > Not sure how to make that work while sorting around them. I mean, 
> > you'd have to count the number of rows before each one so that you put 
> > back the right number of rows or something...
> 
> ...which is why sorttable.js doesn't do it, indeed :-) People ask for 
> it, I say "how should it work in the following situations?", they go "um 
> er dunno", and then the conversation ends. I *think* this is mostly 
> solved by having multiple tbodies, actually.

So long as they don't want rows to move from one tbody to another.


> Note that sorttable.js, if you don't specify a thead, creates a thead 
> and reparents the tbody's first row into the thead. If you don't do 
> that, then your "header row" (the first row in the tbody) is *part* of 
> the tbody and so will sort into a different location!

The spec is careful about figuring out which row is the header row and 
skipping it.


> I think the spec needs to be clear that if you choose a row in the tbody 
> as highest row, it's treated as though it's in thead and so doesn't 
> sort... but this is problematic in the case of multiple tbodies, because 
> you can only have one thead, not one per tbody. Perhaps the answer is, 
> as you suggest later, to assume that rows consisting entirely of <th>s 
> at the top of a tbody do not sort, regardless.

Something like that. I forget what the exact implications are of multiple 
tbodies, with ths in the tbodies, and no theads.


> > When is it a need, though? I'd love to study a table that has a column 
> > that it doesn't make sense to sort by.
> 
> Some examples:
> Tables with a column of checkboxes. (This might actually be useful if
> the sorter knew how to derive the sorting value of a cell from a child
> input's checked attribute, but that's not in the spec so far.)

Right now, if the cells in a column are all the same, sorting by that 
column does nothing (the sort is defined to be stable) unless you sort 
twice (in which case it just reverses the whole table). Is it a problem 
that these columns could theoretically appear sortable? It seems fine to 
me. Might even be useful (to reverse the table).
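
To make that concrete, here is a sketch that assumes the reversed 
direction can be modeled as reversing the result of a stable sort, which 
matches the behavior described above but is not taken from the spec text. 
The function sortRows and the row objects are hypothetical:

```javascript
// Stable sort on one column; optionally reverse the sorted result.
// Array.prototype.sort is guaranteed stable since ES2019.
function sortRows(rows, column, reversed) {
  const out = rows.slice().sort((a, b) =>
    a[column] < b[column] ? -1 : a[column] > b[column] ? 1 : 0);
  return reversed ? out.reverse() : out;
}

// Every cell in the "box" column has the same value.
const checkboxRows = [
  { id: 1, box: 'x' },
  { id: 2, box: 'x' },
  { id: 3, box: 'x' },
];
const firstSort = sortRows(checkboxRows, 'box', false);
const secondSort = sortRows(firstSort, 'box', true);
console.log(firstSort.map(r => r.id), secondSort.map(r => r.id));
```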


> Left-hand header columns (although as noted this is sorta-kinda 
> presentational and should be done with tr::before, but authors will do 
> it in the HTML for backwards-compatibility if nothing else).

Not sure I follow that one.


On Fri, 28 Dec 2012, Markus Ernst wrote:
> 
> I believe that "asc" and "desc" would be more intuitive to handle than 
> "" and "reversed"

The problem with "asc" and "desc" is you have to define what "" means, so 
really it means "" and "asc" and "desc", at which point you wonder why 
have synonyms for "asc", so it becomes "" and "desc", and now you're back 
to what the spec has (but with "reversed" instead of "desc" because you 
don't really know if it's truly ascending or descending, you just know 
that the order is the reverse of the default order).


> and I think that some kind of th.sortedState attribute would be handy, 
> to question the actual state of the table.

The actual state, unless you've just changed the attributes and your 
script hasn't exited yet (so the sort hasn't been committed), is the state 
given by the attributes.


> Given a basic table such as:
> 
> <table id="pirates">
>   <thead>
>     <tr>
>       <th sorted="1" id="last">Last name</th>
>       <th sorted="2" id="first">First name</th>
>       <th sorted="3" id="age">Age</th>
>       <th sorted="4" id="sex">Sex</th>
>     </tr>
>   </thead>
>   <tbody>
>     <tr>
>       <td>Read</td>
>       <td>Mary</td>
>       <td>25</td>
>       <td>f</td>
>     </tr>
>     <tr>
>       <td>Sparrow</td>
>       <td>Jack</td>
>       <td>32</td>
>       <td>m</td>
>     </tr>
>     ...
>   </tbody>
> </table>
> 
> 1. If the user clicks on the header "Age" (or does a respective 
> interaction provided by the UA's sorting UI), the table should be sorted 
> by the age column. If it is already sorted by this column, the sort 
> direction should be reversed.

Right.


> 2. Authors should be able to provide external links or buttons that can:
> - Sort by a column as described in 1.
> - Sort by a column, force ascending
> - Sort by a column, force descending

You can do that, just change the sorted="" attribute values accordingly.


> I believe that this could be achieved with the following additions:
> - a th.sortedstate attribute to question if the table is currently sorted by
> this column, and if yes in which direction

You can already do that. If the <th> has a "sorted" attribute, it's sorted 
by that column; if its value contains the keyword "reversed", it's 
reversed. (If it contains a number greater than 1, it's not the primary 
sort column, so you might want to look for another column too.)
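
A sketch of reading that state from script. It assumes the attribute 
value is a space-separated list holding an optional "reversed" keyword and 
an optional positive integer, and that a missing number means the column 
is the primary one. The helper parseSorted is hypothetical, not a proposed 
API:

```javascript
// Takes the raw attribute value (null when the attribute is absent).
function parseSorted(value) {
  if (value === null) {
    return { sorted: false, reversed: false, ordinality: null, primary: false };
  }
  const tokens = value.trim().split(/\s+/).filter(t => t !== '');
  let reversed = false;
  let ordinality = 1; // assumption: no number means primary sort column
  for (const token of tokens) {
    if (token.toLowerCase() === 'reversed') reversed = true;
    else if (/^[0-9]+$/.test(token)) ordinality = Number(token);
  }
  return { sorted: true, reversed, ordinality, primary: ordinality === 1 };
}

console.log(parseSorted('reversed 2'));
```

In a page this would be called as parseSorted(th.getAttribute('sorted')).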


> - th.sort() method would take an optional argument to indicate the desired
> sort direction

You can do that as follows:

   th.sort();
   th.sorted = 'reversed';

This will always make that <th> be the primary column, sorted in reverse.


> The algorithm for th.sort([String direction]) could then be extended somehow
> like the following (to be simple I just write "th" for the column header
> element that the method is applied to):
> - Temporarily set the column key ordinality of th to 0
> - If the direction argument is provided (and valid), temporarily set the
>   column sort direction to direction
> - Else if the sortedstate attribute of th is not null
>   - if it is "asc", temporarily set the column sort direction to "desc"
>   - else temporarily set the column sort direction to "asc"
> - Perform the table sorting steps
> - Set the sortedstate attributes of all column headers to null
> - Set the sortedstate attribute of th to the column sort direction
> - Reset the column sort direction and the column key ordinality of th to their
> initial values

Wouldn't that break the way that previous sort columns become secondary 
sort columns?
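
The behavior at stake can be modeled roughly like this, treating the sort 
keys as an ordered list of column ids with the primary column first. The 
list model and the name makePrimary are assumptions for illustration; the 
ids come from the example table above:

```javascript
// Move one column to the front; every other column keeps its relative
// order and so drops one step down in sort priority.
function makePrimary(sortKeys, columnId) {
  return [columnId, ...sortKeys.filter(id => id !== columnId)];
}

let sortKeys = ['last', 'first', 'age', 'sex'];
sortKeys = makePrimary(sortKeys, 'age'); // age first, then last, first, sex
sortKeys = makePrimary(sortKeys, 'sex'); // sex first, age now secondary
console.log(sortKeys);
```

Resetting the ordinality to its initial value after each sort, as in the 
proposed steps, would discard this history.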


> Furthermore, a table.sort() method would be handy. It could take a comma
> separated string as an argument, with each token being the ID of a th, and
> optionally the direction, such as:
> 
> <button
>   onclick="document.getElementById('pirates').sort('sex asc, age')">
>   Order pirates by age, women first. Click again for descending age.
> </button>

You can do that today without much script already:

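<script>
  // The header cells are looked up by their id ("sex", "age").
  var headers = document.getElementById('pirates').tHead.rows[0].cells;
  // Despite its name, this flag is true once the previous click sorted
  // ascending, meaning the next click should reverse the age column.
  var ascending = false;
  function sort() {
    headers.sex.sort();  // sex becomes the primary column...
    headers.age.sort();  // ...then age takes over, leaving sex secondary
    if (ascending)
      headers.age.sorted = 'reversed';
    ascending = !ascending;
  }
</script>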
<button onclick="sort()">
  Order pirates by age, women first. Click again for descending age.
</button>

I don't think a utility method to do this is much of a win. Maybe once 
this sorting algorithm is widely implemented and we see what people do a 
lot, we can consider adding things like this.

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'

Received on Thursday, 18 July 2013 22:48:26 UTC