- From: Robin Berjon <robin@w3.org>
- Date: Tue, 11 Dec 2012 13:03:41 +0100
- To: public-html-testsuite@w3.org
Hi all!

We've been talking about test suite reorganisation for a while and I thought it might be helpful if I put a concrete proposal out there so that we can all see what sticks and what doesn't, and hopefully come to some conclusion shortly. Unfortunately I won't be able to make it to the meeting today due to speaking at a meetup at the same time (notably about getting people involved in testing, cue irony music); so I'll try to make these notes self-sufficient.

Before I dig into it, a few things to know:

• I made all the changes in a dedicated branch. Nothing is broken, nothing has been destroyed.
• This is just a proposal and of course everything is open to changes. I haven't done any of the complicated parts (such as actually moving the tests around) so I really am not committed to the layout I built.

You can see the results in the temp/robin branch on GitHub:

https://github.com/w3c/html-testsuite/tree/temp/robin

I called this branch "temp" to make it clear that I reserve the right to delete it. So don't build anything atop it please, or you might end up with a broken repo. If you wish to make similar proposals, I encourage you to use the same scheme.

What I have done includes:

• I've moved the tests/harness and tests/reporting directories to the root to get them out of the way. I'm unsure what to do with those eventually; we need to figure out where best to place them.
• Similar thinking needs to be applied to common, images, fonts, etc. directories. If they are shared across all sub-suites, they'll need to be at the root (or straight under tests); if they are specific to sub-suites, then they probably should go deeper inside the tree.
• Inside tests, I made five directories: html5, html51, canvas2d, canvas2d2, microdata to reflect the various specs. Technically there might be a microdata2 as well, but there doesn't seem to be much motion there for now so that can wait.
• For each of those sub-suites, I used the relevant specification to generate a directory tree.

The rules I used for that are simple. The name of each subdirectory comes from the ID of the relevant section in the spec. I know that James was worried that those would not be very readable, but in looking at the result I find it to be rather easy to understand (YMMV, feedback welcome). The only sanitisation that the IDs seem to have required is replacing / with _. I'm interested in knowing whether the result works fine on all file systems (I get no errors on OS X; I'm thinking of Windows in particular as a likely source of divergence here). Overall though, the IDs are quite regular.

Producing directories to the full depth of the HTML5 spec would in some cases lead to a rather deep hierarchy, so after a quick chat on IRC I stopped at three levels. Where a section had subsections below the point at which I stopped, I generated a small contains.json file there that captures the subtree. I'm unsure whether it will be useful (I guess it could be used for a simpler mapping to the ToC, perhaps in tools like PLH's) but we're getting it for free anyway.

You'll note that there are .gitkeep files in every directory. You can ignore them: they're there because git does not take empty directories into account, and that's the conventional file to include to make sure the tree is there (they can be nuked as content is added). Note that even after content is added, we can still use an automated process to add sections to the tree if and when needed.

I think that covers everything about the directory structure; comments are dearly welcome as always.

Another big topic is how to handle submissions and approved tests. There are several options:

A) Use approved and submissions/Foo subdirectories
B) Use pull requests
C) Use a file that lists what's approved and what isn't

I think we should rule option (A) out outright. Tests should be moved around as little as possible, ideally never.

Option (B) is interesting, but my concern is getting a view of the entire set of submissions + approved tests. We *could* use the GH API to obtain the full list of pending PRs and extract the content accordingly, but that introduces reliance on the GH API beyond just git (which may not be a problem given http://gitlabhq.com/).

Option (C) is the simplest, though it runs the risk of someone forgetting to update the file listing the approved tests (this could however be made more obvious by listing content on both sides clearly, and possibly spamming this group with "pending submissions" every week).

Overall I have a slight preference for (C), but I could be convinced to go with (B), especially if integration with epic (or whatever) is particularly good there.

If we do go with (C), I would however suggest that we use JSON as the format rather than text, even if it's just for a dumb array of strings. The reason here is that I've seen how our text-based manifests get used, and in those and their processors I've seen:

• Unicode errors;
• BOM problems;
• EOL Win vs Unix problems;
• Lack of EOL on the last record which caused that record to be ignored; or conversely EOL on the last record that caused an empty record after it to be read.

In other words, pretty much every single classic error in handling text that can be made, will be made. You can screw up JSON too, but with libraries in any language doing it right for you, if that happens you probably should get shot.

Anyway, that's it for today's brain dump!

--
Robin Berjon - http://berjon.com/ - @robinberjon
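
To make option (C) concrete, here is a minimal Python sketch of what a JSON manifest could look like in practice, assuming it really is just a dumb array of test paths relative to the repo root. The function names, the "stale" category, and any file name such as approved.json are assumptions made up for this sketch, not part of the proposal or the branch.

```python
import json


def load_approved(manifest_path):
    """Read a manifest that is a plain JSON array of test paths.

    The layout (one flat array of strings) is an assumption for this sketch.
    """
    with open(manifest_path, encoding="utf-8") as fh:
        approved = json.load(fh)
    # Fail loudly on anything other than a flat list of strings.
    if not isinstance(approved, list) or not all(
        isinstance(path, str) for path in approved
    ):
        raise ValueError("manifest must be a JSON array of strings")
    return set(approved)


def split_tests(all_tests, approved):
    """Partition the tests found in the tree against the manifest.

    approved: in the tree and listed in the manifest
    pending:  in the tree but not listed (submissions awaiting review)
    stale:    listed in the manifest but no longer in the tree
    """
    all_tests = set(all_tests)
    return {
        "approved": sorted(all_tests & approved),
        "pending": sorted(all_tests - approved),
        "stale": sorted(approved - all_tests),
    }
```

Anything reported under "pending" would be a candidate for the weekly "pending submissions" mail, and anything under "stale" points at a manifest entry whose file has been moved or removed. Because the JSON parser deals with whitespace and the final newline, a trailing EOL (or the lack of one) does not produce a phantom or missing record.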
Received on Tuesday, 11 December 2012 12:03:54 UTC