- From: James Graham <james@hoppipolla.co.uk>
- Date: Thu, 01 May 2014 23:20:36 +0100
- To: Andy Hickman <andy.hickman@digitaltv-labs.com>, public-test-infra@w3.org, "public-web-and-tv@w3.org" <public-web-and-tv@w3.org>
On 01/05/14 21:57, Andy Hickman wrote:
> On 30/04/2014 15:07, James Graham wrote:
>> On 30/04/14 14:24, Robin Berjon wrote:
>>> I *can* however think of ways in which the IDs could be maintained
>>> automatically in a third-party system. IIRC testharness expressed
>>> unhappiness when two test cases inside a given file have the same test
>>> name. This means that at a given commit, the {file name, test name}
>>> tuple is unique: an ID can be assigned to it. A database tracking the
>>> repository can then:
>>>
>>> 1) Track git moves (as well as removals and additions) in order to
>>> maintain the identifier when the file part changes.
>>> 2) Track addition and removal of test names per file with on-commit
>>> runs. (There is infrastructure that should make it possible to extract
>>> all the test names easily, including generated ones — we can look at the
>>> details if you decide to go down that path.)
>>
>> So, FWIW we have not dissimilar requirements; we want to track which
>> tests we are expected to pass, which we are expected to fail, and
>> which have some other behaviour. At the moment the way we do that is
>> to identify each test with a (test_url, test_name) tuple, much like
>> Robin suggested. Then we generate a set of files with the expected
>> results corresponding to each test. These are checked in to the source
>> tree so they can be versioned with the code, and when people fix bugs
>> they are expected to update the expected results (hence the use of a
>> plain text format rather than a database or something).
>>
>> When we import a new snapshot of the test database (which is expected
>> to be as often as possible), we regenerate the metadata using a build
>> of the browser that got the "expected" results on the old snapshot. In
>> principle it warns when a result changed without the file having
>> changed between snapshots. Obviously there are ways that this system
>> could fail and there would be ways to track more metadata that could
>> make it more robust; for example we could deal with renames rather
>> than mapping renames to new tests. However in the spirit of YAGNI
>> those things will be fixed if they become pain points.
>>
>> (apologies for the slightly mixed tense; this system is in the process
>> of being finished and deployed).
>>
> Apologies if I'm missing something but the tuple tracking suggestion
> seems a pretty complex and potentially brittle solution to something
> that could be fairly trivially solved (if there wasn't a huge legacy of
> test cases...).

At least for the use cases I am interested in, I don't think we can do better than the tuple suggestion. I will try to explain why below.

> In RDBMS terms, let's take the example of trying to be able to reliably
> identify a record in a table over time. Sure you could use two columns
> whose values can change (e.g. to correct typos) and form an ID out of
> the tuple of the two column values, track changes to those tuple values
> over time, and then separately hold a map of generated ID to current
> tuple elsewhere.... Or you could just have a column which contains a
> unique, unchanging ID for that record.
>
> My mental analogy is that we're designing a database table to store
> people details and you guys are suggesting using a "forename",
> "surname", "date of birth" tuple plus some clever mechanisms to ensure
> that this info remains unique and that changes are tracked, whereas the
> usual RDBMS design pattern would be to have a unique ID index column on
> the original table.
> My analogy is probably wrong, but I'd be grateful if you could explain
> why!

In a database, you typically have very different constraints. For example, until you are working at huge scale and need to care about sharding, the canonical solution to unique ids is a "large integer field set to autoincrement". The success of that design relies on the fact that each insert operation is isolated, so it's always clear what a valid unused id is.

In the case of the test system, it's extremely unclear what a valid unused id for a new test would be; we have multiple simultaneous contributions from a large number of parties and no good way of coordinating them. It would be totally impractical to have to go through each pull request and add a unique number to every test, for example. Clearly an autoincrementing integer isn't going to cut it.

So there are two ways of dealing with this: we either try for globally unique ids, or we allow for the possibility of collisions but make them easy to deal with.

If we wanted globally unique ids, it would probably mean using something like a random uuid. For example we could give each test a name like aec2cf60-d17a-11e3-80c1-cbadd29e6cd4. If we did that for both filenames and the names of tests within files, we would have a way of identifying every test that wasn't prone to collisions. This is quite similar to the way that git and other version control systems work under the hood. I hope it's obvious that this setup would be awful in practice; people would resent the overhead of generating these names and refuse to submit tests to the testsuite, any attempt to communicate using the ids would be a nightmare, and people who did bother to submit tests would likely copy ids between files rather than generating new ones each time. For files that generate multiple tests from data it would be even worse; it's not clear at all how to generate a unique, but stable, id for each test in that case.

The other option is to allow the possibility of name clashes, but make them easy to resolve. One way to do this would be to require one test per file and use the path to the file as the unique test identifier. It is possible that two people could simultaneously submit different tests with the same name, but if that happened the conflict would be rather easy to resolve. However this system has one showstopper-level disadvantage: the lack of multiple tests per file makes test authors dramatically less productive, so we lose significant coverage. That's not an acceptable tradeoff.

So finally we come to the solution where we allow multiple tests per file, and give each test a human-readable name. This only requires local coordination (we need to ensure that each file name is unique and that each test name within a file is unique, but we don't need to compare to any global state), doesn't require human-unfriendly identifiers like uuids, and allows test authors to be productive by supporting many tests in a file. It also has some further side benefits: by requiring that each test has a unique title we get some metadata about what is being tested for "free", which dramatically reduces the need for a separate description of the test intent. Clearly this solution is a tradeoff, but it's one that works well.
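To make the "local coordination only" point a bit more concrete, here is a rough sketch of the idea in Python. The names and data shapes are invented for illustration; this is not the actual testharness or runner code.

# Rough sketch (invented names, not real harness code): derive globally
# unique test ids from purely local checks. File paths are unique because
# the filesystem enforces it, and test names only have to be unique within
# their own file, which testharness already complains about at runtime.

def test_ids(tests_by_file):
    """tests_by_file maps a test file path to the list of test names it declares."""
    ids = set()
    for path, names in tests_by_file.items():
        seen = set()
        for name in names:
            if name in seen:
                # The only coordination needed is within a single file.
                raise ValueError("duplicate test name %r in %s" % (name, path))
            seen.add(name)
            ids.add((path, name))  # the (file name, test name) tuple is the id
    return ids

ids = test_ids({
    "/dom/nodes/Node-cloneNode.html":
        ["cloneNode with deep set to true",
         "cloneNode with deep set to false"],
    "/html/semantics/scripting-1/the-script-element/async.html":
        ["async script execution order"],
})

Nothing in that sketch needs to consult any other file or any global registry of ids, which is the property that makes the scheme workable with lots of uncoordinated contributors.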
> Would it be fair to say that supporting unique test IDs wasn't a design
> requirement when the harness/runner framework was put together and now
> we are where we are it's easier to use the suggested approach than to
> assign unique test IDs and have to retrofit them to thousands of test
> cases?

No, having unique test ids is absolutely a requirement. As I said before, we run all the tests and need to keep track of what results we got for each test on previous runs, so that we know whether any changed unexpectedly. That depends on knowing which result corresponds to which test in a way that is stable across runs. It's just that there are other requirements that shape the form those ids can take. I have also previously worked on a system that stored these test results in a database, so I know that's possible too.

> One other thing: it wasn't clear to me how your proposal would work if a
> test name is changed?

A test name being changed is like deleting the old test and adding a new one. But there just aren't many cases where people come in and change a whole load of test names without also making substantive changes to the tests, so I don't think this is a huge problem.
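As a similarly rough sketch (again with invented data shapes rather than the real runner output), this is why renames don't need any special handling when results are keyed by the (test_url, test_name) tuple: the old id simply disappears and a new one appears.

# Rough sketch, invented data shapes: compare two runs whose results are
# keyed by the (test_url, test_name) tuple. A renamed test shows up as one
# removed id plus one added id, so no separate rename tracking is needed.

def compare_runs(expected, actual):
    """Both arguments map (test_url, test_name) -> a result string like "PASS"."""
    old_ids, new_ids = set(expected), set(actual)
    removed = old_ids - new_ids   # the old name disappears...
    added = new_ids - old_ids     # ...and the new name looks like a new test
    changed = {test_id: (expected[test_id], actual[test_id])
               for test_id in old_ids & new_ids
               if expected[test_id] != actual[test_id]}
    return added, removed, changed

added, removed, changed = compare_runs(
    {("/dom/nodes/Node-cloneNode.html", "cloneNode with deep set to true"): "PASS"},
    {("/dom/nodes/Node-cloneNode.html", "cloneNode copies the node deeply"): "PASS"},
)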
Received on Thursday, 1 May 2014 22:21:02 UTC