- From: Stuart Langridge <sil@kryogenix.org>
- Date: Wed, 7 Nov 2012 10:34:54 +0000
- To: whatwg@lists.whatwg.org
I'm the author of http://www.kryogenix.org/code/browser/sorttable/, a moderately popular JavaScript table sorting script. As such, I have about nine years worth of anecdata about how authors want their HTML tables to be sorted, the sorts of things they request, and issues that may be worth taking into consideration. These are not particularly in order; they're just things that I think are relevant. Sorttable.js, my script, has the guiding principle of not needing configuration in most cases. Therefore, it attempts to guess the type of a table column: if a column looks like it contains numbers, sorttable will use numeric sort (1 before 2 before 100) rather than alphanumeric sort (1 before 100 before 2); if a column looks like it contains date information, then sorttable will sort by date (for formats DD/MM/YYYY and MM/DD/YYYY). The algorithm used for this guessing is pretty naive (check the first cell in a column; if it's blank, check the next one; etc). I think that this, by itself, has accounted for sorttable's popularity, because in most cases, it Just Works; you add a <script> element pointing to the script, and class="sortable" to the <table>, and do *nothing else*, and your table is sortable without any configuration. Everything else below here is configuration-based: something you'd have to do explicitly as an author. The above point is the critical one; guessing column types to make table sorting be zero-config. Some alternative scripts require you to explicitly tag date or numeric columns, and I think that authors see that as annoying. Anecdata, of course. Sorttable also allows authors to specify "alternate content" for a cell. That is (ignore the invalid HTML attribute here; I didn't know any better, and we didn't have data-* attributes when I wrote this stuff) <td sorttable_customkey="11">eleven</td> This is basically useful for when you have table data which has a definite order but it can't be autoguessed, or (more usefully still) when it could be autoguessed but that would be hard. The canonical example of this is dates: it would be exceedingly annoying, given <td>Wed 7th November, 10.00am GMT</td> to have to parse that cell content in JavaScript to turn it back into a Date() so it can be placed in sort order with other dates. The sorttable.js solution is to specify a "custom key", which sorttable pretends was the cell content for the purposes of sorting, so <td sorttable_customkey="20121107-100000">Wed 7th November, 10.00am GMT</td> and then the script can sort it. This feature is basically the get-out clause, an author hook for saying "I know what I want, but your fancy sorting thing can't handle it; how do I override that?" They can specify custom keys for all their TDs and then sorting will work fine. (Obviously, dates are less of a problem in theory today with <date> elements, but... how does the script know to use the datetime attribute of the <date> in <td><date>...</date></td>?) In roughly descending order of popularity, here is what I've been asked questions about, over the last decade or so: 1. Sorting tables inserted after page load. This is obviously not a problem (sorting a table created with JS rather than in the base HTML), and sorttable should handle it without explicit action from the author to "mark" a table as sortable, but it doesn't because of laziness from me. I include it for completeness because sorttable not handling it generates probably a third of all the sorttable complaint email I receive; a properly specced sortable tables implementation in browsers would obviously handle this and wouldn't need to even have it specified. 2. Sorting a table on page load. That is: a table in HTML containing unsorted data should be sorted by the browser when the page loads, without user action. Sorttable doesn't do this because I think it's wrong (if you want sorted data when the page loads, serve it as sorted in the HTML), but lots of people ask for it. 3. Multiple header rows. Many authors have two or more <tr>s in the <thead>, one of which contains rowspanned <th>s, to group columns together. If this happens, which <th>s are clickable to sort the table? Which are not? This is hard to autodiagnose (and indeed sorttable punts on it and picks the first one, which is almost certainly wrong; even naively picking the last <tr> inside <thead> would be better, but still imperfect). 4. Handling colspans and rowspans in the table. Sorttable.js basically punts on this, because what's expected to happen when you sort a column which contains only half a cell (because the other half's in another column, with rowspan=2) is wildly author-specific. But a properly specced solution doesn't get to punt and say "unsupported". This will need some thought. 5. Numeric sort handling exponented numbers such as 1.5e6 (which do not match a naive "is this a number" regexp such as /^[0-9]+$/ ) 6. Specifying how to display that a column is sorted. This would likely be done in this specification by leaving it to CSS and th::sorted-forward { after: content("v"); } or some such thing (I have no policy suggestions here), but authors want to be able to specify this, along with different styles for a sorted column. This is mildly more awkward because there's no real concept of a column in the DOM of an HTML table, but perhaps all the TDs could grow a pseudo ::sorted-forward or something (handwaving here like mad, obviously). 7. Case sensitivity in alphannumeric sorting. Some people like it, some people don't; it's good to have some sort of author-controllable switch. (Obviously solveable with <td sorttable_customkey="INSENSITIVE">Insensitive</td> in the limit case, and this, like many other things on this list, suggests that some sort of "here is the JavaScript function I want you to use to produce sort keys for table cells in this column" function is a useful idea. Sorttable allows this, and people use it a lot.) 8. Mark a column as not sortable. Note: this does not mean that clicking on that column doesn't sort it; it means that that column does not get sorted *even when the rest of the table does*. This gets requested for a sort of "left-hand header" concept, where the first column contains numbers, 1, 2, 3, 4 etc, one per row, to show which is row 1, row 2, row 3 etc of the table. Obviously this column should not be sorted when the rest of the table is. I'm not sure there's any good markup for this in HTML (<ol>s do it, but there's no <ol> concept for <tr>s). 9. A commonly requested type of things to know how to automatically sort is IP addresses. (I solve this by forwarding people the email explaining how to add a new sort type function to sorttable, because I've never got around to adding it to the script.) 10. Zebra-striped tables are a problem. Well, they're not a problem if you're striping with CSS (#mytable tr:nth-child(2n) td { background: #eee; }) but an awful lot of people bake the stripes into their HTML (<tr class="even">), and this gets screwed up if you sort the table. The solution here obviously might be to poke authors to do presentational stuff with CSS instead and then their problems go away, but *lots* of people complain about this. 11. Authors like the idea of having script callbacks before and after a user action to sort, so they can do things to the table, show progress or an hourglass, etc. This would presumably be neatly handled by firing a "sort" event on the table or similar. 12. Stable sort: I recommend that the sort that's implemented be specified as being a stable sort, because people who care really want it and write me annoyed emails that it's not there, and no-one explicitly wants unstable sort. :) 13. What happens if a table has multiple <tbody> elements? Do they sort as independent units, or mingle together? Sorttable just sorts the first one and ignores the rest, because multiple tbodies are uncommon, but that's not really acceptable ;-) 14. Fixed-position rows. Many authors have a "totals" row at the bottom of their table which should remain at the bottom of the table even after sorting, which is easily handled (that's what <tfoot> is for), but some authors also have rows midway through the table which are "headers": this especially shows up in long tables, where the column headers from <thead> are repeated midway down the table and should remain in position even when the table is sorted. In general this means that they should remain the same number of rows away from <thead>. This case is odd, and sorttable.js doesn't handle it, but lots of people ask for it. I hope that's useful. Happy to answer questions. sil -- New Year's Day -- everything is in blossom! I feel about average. -- Kobayashi Issa
Received on Wednesday, 7 November 2012 10:35:48 UTC