- From: Ian Hickson <ian@hixie.ch>
- Date: Fri, 28 Dec 2012 02:04:25 +0000 (UTC)
- To: whatwg@whatwg.org
- Message-ID: <Pine.LNX.4.64.1212270141200.16292@ps20323.dreamhostps.com>
I've added a feature to HTML to enable users (and authors) to sort tables.

The basic design of the feature is that if a column's <th> has a sorted=""
attribute, the UA will sort the table every time the mutation observers
would fire (before they fire). A table can have a sortable="" attribute,
which lets the user tell the user agent to add sorted="" attributes to
columns to sort them.


On Tue, 6 Nov 2012, Ojan Vafai wrote:
> On Tue, Nov 6, 2012 at 11:25 AM, Ian Hickson <ian@hixie.ch> wrote:
> > On Thu, 1 Jul 2010, Christoph Päper wrote:
> > >
> > > For starters, only rows inside tbodys shall be reordered. For now
> > > columns don't have to be reordered, ie. only vertical, no horizontal
> > > sorting.

Done.

> > > Nevertheless the design should make it possible to add the other
> > > direction later.

Well I guess nothing would stop us supporting sorted="" on <th>s at the
front of a row, but boy, that would be a lot more complicated to do. You'd
have to be moving cells around all over the place.

> > > Not every table has content that makes sense to be sorted in a
> > > different order. So sortable tables should be marked as such. Note
> > > that col and colgroup elements are hardly supported.

<table sortable>.

> > > Not every column has content that makes sense to be sorted in a
> > > different order. So non-sortable columns inside sortable tables
> > > should be marked as such.

Any column with a <th> is sortable, for now. We can add a "nosort" column
or something later if this becomes a problem.

> > > There are different ways to sort, eg. numeric, temporal or
> > > alphabetic and ascending or descending. Therefore columns should
> > > bear information how they should be sorted, ie. what kind of content
> > > their cells have.

Ascending/descending is supported (sorted="reversed"). Any temporal syntax
supported by <time> can be used by putting <time> as the only child of the
cells to sort.
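
A rough sketch of the behaviour described above, purely for illustration. The row/cell objects and the cellKey() and sortRows() names are made up and are not anything in the spec. Keys are compared here as plain strings, which only orders dates correctly when every datetime="" value uses the same format; the real algorithm is still to be specified.

```javascript
// Hypothetical model: a row is an array of cells, and a cell is
// { text, datetime? }, where datetime stands in for the datetime=""
// attribute of a <time> element that is the cell's only child.

// Pick the string a cell is compared by: the <time> key if present,
// otherwise the cell's text.
function cellKey(cell) {
  return cell.datetime !== undefined ? cell.datetime : cell.text;
}

// Sort rows by one column. sortedAttr models the header's sorted=""
// value: "" means ascending, "reversed" means descending.
function sortRows(rows, columnIndex, sortedAttr) {
  const reversed = sortedAttr.trim().toLowerCase() === "reversed";
  const compare = (a, b) => {
    const ka = cellKey(a[columnIndex]);
    const kb = cellKey(b[columnIndex]);
    const order = ka < kb ? -1 : ka > kb ? 1 : 0;
    return reversed ? -order : order;
  };
  // Work on a copy so the caller's array is left untouched.
  return rows.slice().sort(compare);
}

// Two rows keyed by <time datetime="">, listed out of order.
const exampleRows = [
  [{ text: "11 November 2012", datetime: "2012-11-11" }],
  [{ text: "7 November 2012", datetime: "2012-11-07" }],
];
const ascending = sortRows(exampleRows, 0, "");
const descending = sortRows(exampleRows, 0, "reversed");
```

Here `ascending` starts with the 7 November row and `descending` with the 11 November row, even though the visible text would sort the other way round.
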
I intend to spec some sort of algorithm for doing numeric/string
comparison, but haven't yet come up with a good solution. If you have any
suggestions, this is the bug tracking this issue:

   https://www.w3.org/Bugs/Public/show_bug.cgi?id=20524

> > > Several columns may be used for sorting by some kind of priority.

You can set sorted="" on multiple columns' headers, and give a sort key
cardinality in each, as in sorted="1", sorted="2", etc.

> > > The original order must be restorable.

This I have not supported. I don't see how to support it sanely.

> > > Cell content may not consist of the string that should be used
> > > verbatim for sorting purposes, eg. leading articles or similar
> > > numbers with different units (g, kg, t ). Cells should have
> > > an optional attribute indicating their sort key. The time element
> > > already provides the necessary metadata features for temporal
> > > sorting maybe there should be more of such elements instead.

I've used <data> for this, alongside <time>.

> > > There may be columns that shall remain stable, eg. rank numbers.

I haven't supported this. I've no idea how to do this sanely, especially
given cells with column and row spans.

> 1. Would sorting actually reorder the DOM nodes or just change their
> visual order? It's not clear to me which one is better. I think the
> former is what you'd want most of the time.

I've gone with reordering the DOM nodes. Things like :nth-child styling
become nigh on impossible without doing it at the DOM level, not to
mention the confusion that would reign from having such a dramatic
disconnect between rendering and DOM (e.g. with abs pos, etc).

> 2. What values should the sort property allow. One idea is that it takes
> a JS function similar to what JavaScript's sort function takes. If you
> leave it out then it just does alphanumeric sort.

I was going to have a comparator function, but I couldn't see a sane way
to make it work in the face of hostile functions that mutate the DOM, so I
dropped it. You can do custom sort orders by giving a key in the <data>
element's value="" attribute, though.

> 3. What elements does it go on? I don't see what it would do on a td. I
> could see putting it on a th though. Also, it's not clear to me what
> would get sorted. For example, in some tables, you would group trs
> inside tbodys and want to sort those.

sorted="" goes on a column-heading <th>, ideally in a <thead> but you can
also put it on the first row of your <tbody> if you don't have a <thead>.

Rows are sorted on a per-group basis. Rows that span each other are
treated as one row for sorting.


On Tue, 6 Nov 2012, Boris Zbarsky wrote:
>
> Another obvious question: how does (or should) sorting interact with
> rowspans?

The sort algorithm groups rows that span each other together and treats
them as one (using the data in their top row for sorting).


On Wed, 7 Nov 2012, Silvia Pfeiffer wrote:
>
> http://tympanus.net/codrops/2009/10/03/33-javascript-solutions-for-sorting-tables/

Interesting, thanks.

> Also, a sortable table's header needed some indication of the sortability,
> so some default CSS like this:
> th.sortable {
>   &:after { content: " ▲▼"}
>   &.current{
>     &[data-direction="asc"]:after { content: " ▼"}
>     &[data-direction="desc"]:after { content: " ▲"}
>   }
> }

I haven't defined the styling in detail, pending both user agent
implementation experience and the addition of :sorted to CSS.


On Wed, 7 Nov 2012, Silvia Pfeiffer wrote:
> On Wed, Nov 7, 2012 at 8:37 PM, Jirka Kosek <jirka@kosek.cz> wrote:
> >
> > It would be very difficult to support sorting on dates and numbers as
> > in HTML they are usually present formatted using specific locale. So
> > there should be additional attribute added to td/th which can hold
> > sort key which will override cell contents, something like
> >
> > <td sortas="2012-11-07">11. listopadu 2012</td>

   <td><time datetime="2012-11-07">11. listopadu 2012</time>


On Wed, 7 Nov 2012, Stuart Langridge wrote:
>
> I'm the author of http://www.kryogenix.org/code/browser/sorttable/, a
> moderately popular JavaScript table sorting script. As such, I have
> about nine years worth of anecdata about how authors want their HTML
> tables to be sorted, the sorts of things they request, and issues that
> may be worth taking into consideration. These are not particularly in
> order; they're just things that I think are relevant.

Thank you very much for your input, it was invaluable.

> Sorttable.js, my script, has the guiding principle of not needing
> configuration in most cases. Therefore, it attempts to guess the type of
> a table column: if a column looks like it contains numbers, sorttable
> will use numeric sort (1 before 2 before 100) rather than alphanumeric
> sort (1 before 100 before 2); if a column looks like it contains date
> information, then sorttable will sort by date (for formats DD/MM/YYYY
> and MM/DD/YYYY). The algorithm used for this guessing is pretty naive
> (check the first cell in a column; if it's blank, check the next one;
> etc). I think that this, by itself, has accounted for sorttable's
> popularity, because in most cases, it Just Works; you add a <script>
> element pointing to the script, and class="sortable" to the <table>, and
> do *nothing else*, and your table is sortable without any configuration.

I intend to do something along those lines for HTML's sorting algorithm
also, though that is still up in the air (see above).

> Everything else below here is configuration-based: something you'd have
> to do explicitly as an author. The above point is the critical one;
> guessing column types to make table sorting be zero-config. Some
> alternative scripts require you to explicitly tag date or numeric
> columns, and I think that authors see that as annoying. Anecdata, of
> course.
>
> Sorttable also allows authors to specify "alternate content" for a cell.
> That is (ignore the invalid HTML attribute here; I didn't know any
> better, and we didn't have data-* attributes when I wrote this stuff)
>
> <td sorttable_customkey="11">eleven</td>

   <td><data value="11">eleven</data></td>

> This is basically useful for when you have table data which has a
> definite order but it can't be autoguessed, or (more usefully still)
> when it could be autoguessed but that would be hard. The canonical
> example of this is dates: it would be exceedingly annoying, given
> <td>Wed 7th November, 10.00am GMT</td> to have to parse that cell
> content in JavaScript to turn it back into a Date() so it can be placed
> in sort order with other dates. The sorttable.js solution is to specify
> a "custom key", which sorttable pretends was the cell content for the
> purposes of sorting, so <td sorttable_customkey="20121107-100000">Wed
> 7th November, 10.00am GMT</td> and then the script can sort it.

   <td><time datetime="2012-11-07T10:00Z">Wed 7th November, 10.00am GMT</time></td>

> This feature is basically the get-out clause, an author hook for saying
> "I know what I want, but your fancy sorting thing can't handle it; how
> do I override that?" They can specify custom keys for all their TDs and
> then sorting will work fine. (Obviously, dates are less of a problem in
> theory today with <date> elements, but... how does the script know to
> use the datetime attribute of the <date> in <td><date>...</date></td>?)

In the case of the spec, if the <td> element's only child is a <time> or a
<data>, it knows to use the datetime="" or value="" attributes
respectively.

> In roughly descending order of popularity, here is what I've been asked
> questions about, over the last decade or so:
>
> 1. Sorting tables inserted after page load. This is obviously not a
> problem (sorting a table created with JS rather than in the base HTML),
> and sorttable should handle it without explicit action from the author
> to "mark" a table as sortable, but it doesn't because of laziness from
> me. I include it for completeness because sorttable not handling it
> generates probably a third of all the sorttable complaint email I
> receive; a properly specced sortable tables implementation in browsers
> would obviously handle this and wouldn't need to even have it specified.

Supported.

> 2. Sorting a table on page load. That is: a table in HTML containing
> unsorted data should be sorted by the browser when the page loads,
> without user action. Sorttable doesn't do this because I think it's
> wrong (if you want sorted data when the page loads, serve it as sorted
> in the HTML), but lots of people ask for it.

Supported, though I'm not sure how good an idea this will end up being.

> 3. Multiple header rows. Many authors have two or more <tr>s in the
> <thead>, one of which contains rowspanned <th>s, to group columns
> together. If this happens, which <th>s are clickable to sort the table?
> Which are not? This is hard to autodiagnose (and indeed sorttable punts
> on it and picks the first one, which is almost certainly wrong; even
> naively picking the last <tr> inside <thead> would be better, but still
> imperfect).

The spec picks the highest non-spanning <th> in a column, if there's a
<thead>. (If there's not, it uses the top row's <th>, if it doesn't span
columns.)

> 4. Handling colspans and rowspans in the table. Sorttable.js basically
> punts on this, because what's expected to happen when you sort a column
> which contains only half a cell (because the other half's in another
> column, with rowspan=2) is wildly author-specific. But a properly
> specced solution doesn't get to punt and say "unsupported". This will
> need some thought.

For column spanning, the spec's model basically just acts as if the cell
isn't spanning, but is in each column it spans. So e.g.
<td colspan=2>X</td> is treated as <td>X</td><td>X</td>, for the purposes
of sorting.

> 5. Numeric sort handling exponented numbers such as 1.5e6 (which do not
> match a naive "is this a number" regexp such as /^[0-9]+$/ )

I'd like to support this as part of the algorithm mentioned before:

   https://www.w3.org/Bugs/Public/show_bug.cgi?id=20524

> 6. Specifying how to display that a column is sorted. This would likely
> be done in this specification by leaving it to CSS and
> th::sorted-forward { after: content("v"); } or some such thing (I have
> no policy suggestions here), but authors want to be able to specify
> this, along with different styles for a sorted column. This is mildly
> more awkward because there's no real concept of a column in the DOM of
> an HTML table, but perhaps all the TDs could grow a pseudo
> ::sorted-forward or something (handwaving here like mad, obviously).

I haven't specced this yet but once CSS has the :sorted pseudo (bug 20522)
I expect we'll be able to do something like:

   th:sorted(ascending)::after { content: "v"; }

> 7. Case sensitivity in alphannumeric sorting. Some people like it, some
> people don't; it's good to have some sort of author-controllable switch.
> (Obviously solveable with <td
> sorttable_customkey="INSENSITIVE">Insensitive</td> in the limit case,

I intend to only support insensitive comparisons initially, but if that's
a problem we can definitely revisit it somehow. (It can't be worked around
easily, unlike the other way around.)

> and this, like many other things on this list, suggests that some sort
> of "here is the JavaScript function I want you to use to produce sort
> keys for table cells in this column" function is a useful idea.
> Sorttable allows this, and people use it a lot.)

I tried to do this but couldn't figure out a sane way to do it. A
comparator can totally destroy the table we're sorting, and I don't know
what to do if that happens.

> 8. Mark a column as not sortable. Note: this does not mean that clicking
> on that column doesn't sort it; it means that that column does not get
> sorted *even when the rest of the table does*. This gets requested for a
> sort of "left-hand header" concept, where the first column contains
> numbers, 1, 2, 3, 4 etc, one per row, to show which is row 1, row 2, row
> 3 etc of the table. Obviously this column should not be sorted when the
> rest of the table is. I'm not sure there's any good markup for this in
> HTML (<ol>s do it, but there's no <ol> concept for <tr>s).

I haven't supported this. To some extent, it's presentational, and thus
can be done using something like:

   tr::before { display: table-cell; content: counter(row); }

...or some such.

> 9. A commonly requested type of things to know how to automatically sort
> is IP addresses. (I solve this by forwarding people the email explaining
> how to add a new sort type function to sorttable, because I've never got
> around to adding it to the script.)

This is something that should end up supported by the sorting algorithm
automatically.

> 10. Zebra-striped tables are a problem. Well, they're not a problem if
> you're striping with CSS (#mytable tr:nth-child(2n) td { background:
> #eee; }) but an awful lot of people bake the stripes into their HTML
> (<tr class="even">), and this gets screwed up if you sort the table. The
> solution here obviously might be to poke authors to do presentational
> stuff with CSS instead and then their problems go away, but *lots* of
> people complain about this.

:nth-child() is more widely supported than this feature, so I think it
makes sense to rely on the former if you're relying on the latter.

> 11. Authors like the idea of having script callbacks before and after a
> user action to sort, so they can do things to the table, show progress
> or an hourglass, etc. This would presumably be neatly handled by firing
> a "sort" event on the table or similar.

I've made 'sort' get fired at the table before the sort starts. Nothing is
fired after currently.

> 12. Stable sort: I recommend that the sort that's implemented be
> specified as being a stable sort, because people who care really want it
> and write me annoyed emails that it's not there, and no-one explicitly
> wants unstable sort. :)

Done.

> 13. What happens if a table has multiple <tbody> elements? Do they sort
> as independent units, or mingle together? Sorttable just sorts the first
> one and ignores the rest, because multiple tbodies are uncommon, but
> that's not really acceptable ;-)

Independent.

> 14. Fixed-position rows. Many authors have a "totals" row at the bottom
> of their table which should remain at the bottom of the table even after
> sorting, which is easily handled (that's what <tfoot> is for), but some
> authors also have rows midway through the table which are "headers":
> this especially shows up in long tables, where the column headers from
> <thead> are repeated midway down the table and should remain in position
> even when the table is sorted. In general this means that they should
> remain the same number of rows away from <thead>. This case is odd, and
> sorttable.js doesn't handle it, but lots of people ask for it.

<tfoot> is supported as suggested. Haven't done it for the mid-rows. Not
sure how to make that work while sorting around them. I mean, you'd have
to count the number of rows before each one so that you put back the right
number of rows or something...


On Thu, 8 Nov 2012, Cameron Jones wrote:
> > <time> exists, and <data> exists for non-time machine-readable data;
> > maybe they can be utilized in some way?
>
> I have done some investigation in this area too and having concrete
> datatypes would make this more utilizable, ie from the proposal for
> <data type="" value=""/>
>
> http://www.w3.org/wiki/User:Cjones/ISSUE-184
>
> The other area of integration would be with BCP-47 language tags and the
> CLDR which include i18n collation information, for example british
> numeric collation:
>
> en-GB-*u-kn-true*
>
> The significant benefit with this is that this standard is already
> universal across server\client and is of course fully internationalized.
>
> The other aspect of this is that there is a distinction between server
> pagination including sort ordering defining the content of a page and
> the client-based sorting which would be more of a presentational
> customization and outside the scope of pagination. As such, it may be
> better for the HTML to markup the structure of the content with sorting
> and collation but for this to be configurable through CSS without the
> structural DOM changes.
>
> This could also apply to HTML lists: <ul> <ol>, <dl>.

I haven't added this. I'm curious as to the use cases and how much
implementation interest there is (I guess this would primarily be for
validators?).


On Thu, 8 Nov 2012, Alex Russell wrote:
>
> I'm much more inclined to solve this from the data axis. Asking the
> table itself to do the sorting is weird. Instead, you most often want to
> have some data source return you rows in sorted order (or indicate row
> order). If you do something like MDV, sorting the table is applying a
> sort to the template that stamped out the view. That works with
> DOM-table backed tables as well as server or JS-backed tables.

I'm happy to strip out the current text in the spec and add in something
more like this model if there's implementation and author interest, but I
don't really understand what you are proposing. Can you elaborate?
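
As an aside on the BCP 47 tag Cameron quotes above (en-GB-u-kn-true, without the emphasis asterisks): a tag like that can be handed to a collator from script. A minimal sketch, assuming an environment that implements the ECMAScript Internationalization API (Intl.Collator). The sortNumerically() helper and the sample strings are made up for illustration, and this is not what the spec currently does.

```javascript
// The "kn-true" Unicode extension requests numeric collation, so runs of
// digits compare by numeric value ("2" before "10").
const collator = new Intl.Collator("en-GB-u-kn-true");

// Hypothetical helper: return a sorted copy of a list of cell strings.
function sortNumerically(values) {
  return values.slice().sort(collator.compare);
}

// The default sort compares UTF-16 code units, so "row 10" lands before
// "row 2"; the collator puts it last.
const plainOrder = ["row 10", "row 2", "row 1"].sort();
const numericOrder = sortNumerically(["row 10", "row 2", "row 1"]);
```

With this, `plainOrder` is "row 1", "row 10", "row 2", while `numericOrder` is "row 1", "row 2", "row 10".
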

On Wed, 7 Nov 2012, Christoph Päper wrote:
>
> >> Note that ‘col’ and ‘colgroup’ elements are hardly supported.
>
> But they’re essential for assigning sort properties.
>
> <col key=…>
> <colgroup key=…>

I ended up using <th> for this instead.

> To support this, cells must be splittable!
>
> td {color: green;}
> #split {color: red;}
>
> <tr><td>3 <td id=split colspan=2> red
> <tr><td>1
> <tr><td>2 <td> green
>
> after sorting by the first column should look like
>
> <tr><td>1 <td id=split> red
> <tr><td>2 <td> green
> <tr><td>3 <td id=split> red
>
> would if duplicate IDs were legal. The DOM tree, however, would not
> change! The value of the cell at position (1,1), i.e. second row and
> column since we count from zero, is always undefined, but the value of
> the slot at (1,1) changes from “red” to “green”.

That's an interesting idea, but I don't think it's the right approach.
Some elements are not elements you want to clone (e.g. <audio>, <embed>,
<input>). And it's not clear how you remerge them.


On Fri, 9 Nov 2012, Pierre Dubois wrote:
>
> My opinion is that depends of the real scope of the "th" element.
>
> If the "th" is an empty cell or used for "layout", the sorting
> functionality would not be available.
> If the "th" is an "group header", the sorting functionality would be
> applied to the header cell along with their data fixed. Where the
> header cell is a
> subgroup header or/and an header that represent one or more row or column.
> If the "th" is an "header", the sorting functionality could be applied
> to the data cell associated and by default the sorting action would be
> extended to the other axis [row|col].

That's an interesting idea. I'm dubious about overloading the logic like
this, though, lest it make authors set invalid scope values just to get
sorting enabled/disabled. I'd rather just add an attribute that says "this
can't be a sort column", if that's really a need. When is it a need,
though? I'd love to study a table that has a column that it doesn't make
sense to sort by.

> Use case: A data table that have row headers and column headers.
> Row and column that is in the scope of an rowspans and colspans data
> cell (td) would be fixed.

Not sure what you mean, but for what it's worth, the spec as written will
skip over any rows at the top of <tbody>s that consist of only <th>s.

> Use case: A data table that only have row headers.
> Row that is in the scope of an rowspans data cell (td) would be fixed.

If a data table only has row headers, I'm not sure how to sort it.

> Use case: A data table that only have column headers.
> Column that is in the scope of a colspans data cell (td) would be fixed.

Not sure what this means.

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'
Received on Friday, 28 December 2012 02:04:50 UTC