Re: Testing parse-html

What is the goal of these tests?

In the W3C days, when people asked that question, the answer was twofold:

* W3C's goal was to demonstrate interoperability of implementations
* The goal of implementors investing in creating the tests was to share the cost of developing test material for their products, costs that they would otherwise have had to incur individually

In many ways I think that remains true now.

Yes, we primarily need to test the API and that means showing that we are getting the correct XDM output for a representative range of HTML input. We can't test all edge cases; I suggested that about 1000 documents would probably give reasonable coverage. For parse-json I think we have something like 400, but JSON is a lot simpler. A lot of the JSON tests are actually error tests, and it's not yet clear to me whether parse-html() ever fails.

Is it to test the parse-html API and XDM mapping? -- If so, we should just need enough tests to cover the different XDM mappings and API settings.

Is it to test that the parse-html implementations support all HTML constructs and parsing quirks? -- If so, we would need to build and maintain (i.e. evolve with the HTML LS spec) a set of HTML parser and tree construction tests (with input and expected XML output). I think this is out of scope of the proposal, as implementers should be given some leeway in how they handle e.g. HTML4.


The situation where implementations have leeway is exactly the situation where the first objective (interoperability) doesn't apply, but the second (pooling effort among developers to test their products) does. And it's not just that pooling effort reduces costs; it also improves the quality of testing, because it's always good to test your code against tests written by a third party.

Michael Kay
Saxonica

Received on Friday, 23 December 2022 11:38:17 UTC