- From: Robin Berjon <robin@w3.org>
- Date: Tue, 11 Dec 2012 13:03:41 +0100
- To: public-html-testsuite@w3.org
Hi all!
We've been talking about test suite reorganisation for a while and I
thought it might be helpful if I put a concrete proposal out there so
that we can all see what sticks and what doesn't, and hopefully come to
some conclusion shortly.
Unfortunately I won't be able to make it to the meeting today due to
speaking at a meetup at the same time (notably about getting people
involved in testing — cue irony music); so I'll try to make these notes
self-sufficient.
Before I dig into it, a few things to know:
• I made all the changes in a dedicated branch. Nothing is broken,
nothing has been destroyed.
• This is just a proposal and of course everything is open to changes. I
haven't done any of the complicated parts (such as actually moving the
tests around) so I really am not committed to the layout I built.
You can see the results in the temp/robin branch on GitHub:
https://github.com/w3c/html-testsuite/tree/temp/robin
I called this branch "temp" to make it clear that I reserve the right to
delete it. So don't build anything atop it please, or you might end up
with a broken repo. If you wish to make similar proposals, I encourage
you to use the same scheme.
What I have done includes:
• I've moved the tests/harness and tests/reporting directories to the
root to get them out of the way. I'm unsure what to do with those
eventually; we need to figure out where best to place them.
• Similar thinking needs to be applied to the common, images, fonts, etc.
directories. If they are shared across all sub-suites, they'll need to
be at the root (or straight under tests); if they are specific to
sub-suites, then they probably should go deeper inside the tree.
• Inside tests, I made five directories: html5, html51, canvas2d,
canvas2d2, microdata to reflect the various specs. Technically there
might be a microdata2 as well, but there doesn't seem to be much motion
there for now so that can wait.
• For each of those subsuites, I used the relevant specification to
generate a directory tree. The rules I used for that are simple. The
name of each subdirectory comes from the ID of the relevant section in
the spec. I know that James was worried that those would not be very
readable, but in looking at the result I find it to be rather easy to
understand (YMMV, feedback welcome). The only sanitisation that the IDs
seem to have required is replacing / with _. I'm interested in
knowing if the result works fine on all file systems (I get no errors on
OSX; I'm thinking of Windows in particular as a likely source of
divergence here). Overall though, the IDs are quite regular.
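As a rough sketch of the naming rule above: only the / to _ replacement comes from the proposal. The portability check, the function names, and the example IDs are assumptions added for illustration, based on the file-name restrictions Windows is known to impose.

```python
# Characters Windows rejects in file names ("/" is also the POSIX separator).
WINDOWS_FORBIDDEN = set('<>:"/\\|?*')
# Device names Windows reserves regardless of case or extension.
WINDOWS_RESERVED = (
    {"CON", "PRN", "AUX", "NUL"}
    | {"COM%d" % i for i in range(1, 10)}
    | {"LPT%d" % i for i in range(1, 10)}
)


def section_id_to_dirname(section_id):
    """Apply the one sanitisation rule from the proposal: "/" becomes "_"."""
    return section_id.replace("/", "_")


def portability_problems(name):
    """Return a list of reasons `name` may fail as a directory name on Windows."""
    problems = []
    bad = sorted(set(name) & WINDOWS_FORBIDDEN)
    if bad:
        problems.append("forbidden characters: " + "".join(bad))
    if name.split(".")[0].upper() in WINDOWS_RESERVED:
        problems.append("reserved device name on Windows")
    if name.endswith((" ", ".")):
        problems.append("trailing space or dot")
    return problems
```

Running portability_problems over every generated name would give a quick answer to the Windows question without needing a Windows checkout, though it only covers naming rules, not path-length limits.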
Producing directories to the full depth of the HTML5 spec would in some
cases lead to a rather deep hierarchy, so after a quick chat on IRC I
stopped at three levels. When there were subsections and I stopped, I
generated a small contains.json file there that captures the subtree.
I'm unsure if it would be useful (I guess it could be used for a simpler
mapping to the ToC perhaps in tools like PLH's) but we're getting it
for free anyway. You'll note that there are .gitkeep files in every
directory. You can ignore them: they're there because git does not take
empty directories into account, and that's the conventional file to
include to make sure the tree is there (they can be nuked as content is
added).
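The generation process described above can be sketched as follows. This is not the script used for the branch: the input shape (nested dicts with "id" and "children" keys), the build_tree name, and the exact content written to contains.json are all assumptions. Only the three-level cut-off, the contains.json file name, the .gitkeep placeholders and the / to _ rule come from the text.

```python
import json
import os

MAX_DEPTH = 3  # from the proposal: stop generating directories at three levels


def build_tree(root, sections, depth=1):
    """Create one directory per spec section under `root`.

    `sections` is a list of {"id": str, "children": [...]} dicts, an assumed
    representation of the spec's table of contents.
    """
    for section in sections:
        path = os.path.join(root, section["id"].replace("/", "_"))
        os.makedirs(path, exist_ok=True)
        # git does not track empty directories, so add a placeholder file.
        open(os.path.join(path, ".gitkeep"), "a").close()
        children = section.get("children", [])
        if not children:
            continue
        if depth < MAX_DEPTH:
            build_tree(path, children, depth + 1)
        else:
            # Too deep for more directories: record the remaining subtree.
            target = os.path.join(path, "contains.json")
            with open(target, "w", encoding="utf-8") as fh:
                json.dump(children, fh, indent=2)
```

Because os.makedirs is called with exist_ok=True, the same function can be re-run after tests have been added, which matches the idea of adding sections to the tree later as needed.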
Note that even after content is added, we can still use an automated
process to add sections in the tree as and if needed.
I think that covers everything about the directory structure; comments
are dearly welcome as always.
Another big topic is how to handle submissions and approved tests. There
are several options:
A) Use approved and submissions/Foo subdirectories
B) Use pull requests
C) Use a file that lists what's approved and what isn't
I think we should rule option (A) out outright. Tests should be moved
around as little as possible, ideally never.
Option (B) is interesting, but my concern is getting a view of the
entire set of submissions + approved tests. We *could* use the GH API to
obtain the full list of pending PRs and extract the content accordingly,
but that introduces reliance on the GH API beyond just git (which may
not be a problem given http://gitlabhq.com/).
Option (C) is the simplest, though it runs the risk of someone
forgetting to update the file listing the approved tests (this could
however be made more obvious by listing content on both sides clearly,
and possibly spamming this group with "pending submissions" every week).
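To make "listing content on both sides clearly" concrete, here is a minimal sketch of the bookkeeping for option (C). The function name, the returned keys and the idea of flagging approved entries that no longer exist in the tree are assumptions, not part of the proposal.

```python
def split_by_status(all_tests, approved):
    """Partition test paths into approved, pending and stale entries.

    `all_tests` is every test path found in the tree; `approved` is the
    content of the approval list. Both are plain iterables of strings.
    """
    known = set(all_tests)
    approved_set = set(approved)
    return {
        "approved": sorted(known & approved_set),
        # In the tree but not yet listed as approved.
        "pending": sorted(known - approved_set),
        # Listed as approved but no longer present in the tree.
        "missing": sorted(approved_set - known),
    }
```

The "pending" list is what a weekly reminder to this group could be built from.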
Overall I have a slight preference for (C), but I could be convinced to
go with (B), especially if integration with epic (or whatever) is
particularly good there.
If we do go with (C), I would however suggest that we use JSON as the
format rather than text, even if it's just for a dumb array of strings.
The reason here is that I've seen how our text-based manifests get used,
and in those and their processors I've seen:
• Unicode errors;
• BOM problems;
• Windows vs Unix EOL problems;
• Lack of EOL on the last record which caused that record to be ignored;
or conversely EOL on the last record that caused an empty record after
it to be read.
In other words, pretty much every single classic error in handling text
that can be made, will be made. You can screw up JSON too, but with
libraries in any language doing it right for you, if that happens you
probably should get shot.
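A small illustration of the difference, using only Python's standard json module. The manifest layout (a bare JSON array of test paths) follows the "dumb array of strings" idea; the function names and file names are made up for the example, and the line-based reader is deliberately naive.

```python
import json


def load_approved(raw_bytes):
    """Parse a manifest given as raw bytes holding a JSON array of strings.

    json.loads accepts bytes and detects the encoding, including a UTF-8
    BOM, and treats CR, LF and trailing whitespace as insignificant.
    """
    entries = json.loads(raw_bytes)
    if not isinstance(entries, list) or not all(isinstance(e, str) for e in entries):
        raise ValueError("manifest must be a JSON array of strings")
    return entries


def naive_text_manifest(text):
    """A deliberately naive one-path-per-line reader, to show the pitfalls."""
    return text.split("\n")


# Windows line endings plus a final newline: the naive reader keeps a stray
# "\r" on the first record and produces an empty record at the end.
print(naive_text_manifest("a.html\r\nb.html\n"))

# The same two paths as JSON, with a BOM and CRLF line endings thrown in.
print(load_approved(b'\xef\xbb\xbf["a.html",\r\n "b.html"]\r\n'))
```

A careful text reader can of course handle all of these cases; the point is that with JSON the parser already does.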
Anyway, that's it for today's brain dump!
--
Robin Berjon - http://berjon.com/ - @robinberjon
Received on Tuesday, 11 December 2012 12:03:54 UTC