Re: Testing parse-html

I have added misc/HtmlTestSuite with about 1300 parse-html() tests culled from the HTML5 test suite. Saxon is not yet passing them, so I have little idea whether they are correct; please review them as best you can. Please avoid fixing any tests "by hand"; if they are wrong, we should try to fix the generator tools (also committed) and regenerate them.

They're currently using deep-equal() to test results, which of course gives very poor diagnostics for a test failure. I might try to introduce a better comparison function -- or perhaps we should improve deep-equal.
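
As a rough illustration of the kind of diagnostic a better comparison function could give, here is a minimal Python sketch that reports the first point at which two trees differ instead of a bare true/false. It is not part of the test suite or of Saxon: the function name `first_difference` and the path notation are hypothetical, and it uses Python's ElementTree model (elements, attributes, text) rather than XDM, so it ignores comments, processing instructions, and type annotations.

```python
import xml.etree.ElementTree as ET


def first_difference(a, b, path="/*"):
    """Describe the first mismatch between elements a and b, or return None.

    `path` is a simple positional path to the element being compared,
    e.g. "/*/*[2]" for the second child element of the root.
    """
    if a.tag != b.tag:
        return f"{path}: element name {a.tag!r} != {b.tag!r}"
    if a.attrib != b.attrib:
        return f"{path}: attributes {a.attrib!r} != {b.attrib!r}"
    if (a.text or "") != (b.text or ""):
        return f"{path}: text {a.text!r} != {b.text!r}"
    if len(a) != len(b):
        return f"{path}: {len(a)} child elements != {len(b)}"
    for i, (child_a, child_b) in enumerate(zip(a, b), start=1):
        child_path = f"{path}/*[{i}]"
        diff = first_difference(child_a, child_b, child_path)
        if diff is not None:
            return diff
        # Text that follows a child element is stored as its "tail".
        if (child_a.tail or "") != (child_b.tail or ""):
            return f"{child_path}: following text {child_a.tail!r} != {child_b.tail!r}"
    return None


expected = ET.fromstring("<p><b>x</b><i>y</i></p>")
actual = ET.fromstring("<p><b>x</b><i>z</i></p>")
print(first_difference(expected, actual))
```

A test harness could print this message on failure and still use deep-equal() for the pass/fail decision.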

Michael Kay
Saxonica

On 28 Dec 2022, at 17:18, Reece Dunn <msclrhd@googlemail.com> wrote:

On Wed, 28 Dec 2022 at 16:06, Michael Kay <mike@saxonica.com> wrote:

You need to use the `Document.outerHtml()` method in conjunction with the `Document.outputSettings()` properties to get it to serialize XHTML output. So you could do something like:

Thanks. That helps, but it doesn't output a namespace declaration for the XHTML namespace.

I'm tweaking the HtmlDocument -> DomDocument conversion to try and get namespaces right.

One test has

<script id='script' href='testScripts/externalScript1.js'
          xlink:href='testScripts/externalScript2.js'></script>

Any idea what the expected XDM representation is? Do we recognize xlink as a magic prefix?

Section 13.1.2.3 Attributes of the HTML spec (https://html.spec.whatwg.org/#attributes-2) allows the xlink:actuate, xlink:arcrole, xlink:href, xlink:role, xlink:show, xlink:title, xlink:type, xml:lang, xml:space, xmlns, and xmlns:xlink namespaced attributes on foreign (MathML and SVG) elements. The namespaces themselves are defined in section 8. Namespaces of the Infra spec (https://infra.spec.whatwg.org/#namespaces), where xlink is "http://www.w3.org/1999/xlink". Note also that MathML and SVG elements will be in their respective namespaces.
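
To make the mapping concrete, here is a small Python sketch of how an attribute name on a foreign element could be resolved to a (prefix, local name, namespace) triple. The table follows the attribute list above and the "adjust foreign attributes" step of the HTML spec; the function name `adjust_attribute` and the tuple layout are hypothetical, chosen only for illustration.

```python
XLINK_NS = "http://www.w3.org/1999/xlink"
XML_NS = "http://www.w3.org/XML/1998/namespace"
XMLNS_NS = "http://www.w3.org/2000/xmlns/"
HTML_NS = "http://www.w3.org/1999/xhtml"
SVG_NS = "http://www.w3.org/2000/svg"
MATHML_NS = "http://www.w3.org/1998/Math/MathML"

# Attribute name as written in the source -> (prefix, local name, namespace).
ADJUSTED = {
    f"xlink:{local}": ("xlink", local, XLINK_NS)
    for local in ("actuate", "arcrole", "href", "role", "show", "title", "type")
}
ADJUSTED.update({
    "xml:lang": ("xml", "lang", XML_NS),
    "xml:space": ("xml", "space", XML_NS),
    "xmlns": (None, "xmlns", XMLNS_NS),
    "xmlns:xlink": ("xmlns", "xlink", XMLNS_NS),
})


def adjust_attribute(name, element_ns):
    """Return (prefix, local name, namespace) for an attribute.

    Only attributes on foreign (non-HTML) elements are adjusted; anything
    else stays a no-namespace attribute whose local name is the full name.
    """
    if element_ns != HTML_NS and name in ADJUSTED:
        return ADJUSTED[name]
    return (None, name, None)


print(adjust_attribute("xlink:href", SVG_NS))
print(adjust_attribute("xlink:href", HTML_NS))
```

On this reading, xlink is only "magic" inside MathML and SVG content; on an HTML element such as script, the name is left as a single no-namespace local name.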

Regarding the use of xlink on other (non-MathML/SVG) elements, there is the following example in https://html.spec.whatwg.org/#coercing-an-html-dom-into-an-infoset (13.2.9 Coercing an HTML DOM into an infoset):

> As another example, consider the attribute xlink:href. Used on a MathML element, it becomes, after being adjusted<https://html.spec.whatwg.org/#adjust-foreign-attributes>, an attribute with a prefix "xlink" and a local name "href". However, used on an HTML element, it becomes an attribute with no prefix and the local name "xlink:href", which is not a valid NCName, and thus might not be accepted by an XML API. It could thus get converted, becoming "xlinkU00003Ahref".
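
The conversion in that example replaces each character that is not allowed in an NCName with "U" followed by the six-digit uppercase hexadecimal code point. A minimal Python sketch of that escaping, assuming the XML 1.0 (Fifth Edition) name character ranges with the colon excluded; the function name `coerce_local_name` is hypothetical, and the spec describes this mapping as something a tool may do, not something it must do:

```python
# NameStartChar ranges from XML 1.0 (Fifth Edition), without ":".
_START_RANGES = [
    (0x41, 0x5A), (0x5F, 0x5F), (0x61, 0x7A),
    (0xC0, 0xD6), (0xD8, 0xF6), (0xF8, 0x2FF),
    (0x370, 0x37D), (0x37F, 0x1FFF), (0x200C, 0x200D),
    (0x2070, 0x218F), (0x2C00, 0x2FEF), (0x3001, 0xD7FF),
    (0xF900, 0xFDCF), (0xFDF0, 0xFFFD), (0x10000, 0xEFFFF),
]
# Additional characters allowed after the first position: "-", ".", digits, etc.
_EXTRA_RANGES = [
    (0x2D, 0x2E), (0x30, 0x39), (0xB7, 0xB7),
    (0x300, 0x36F), (0x203F, 0x2040),
]


def _in_ranges(code_point, ranges):
    return any(low <= code_point <= high for low, high in ranges)


def coerce_local_name(name):
    """Escape every character of `name` that is not valid in an NCName."""
    out = []
    for index, char in enumerate(name):
        code_point = ord(char)
        allowed = _in_ranges(code_point, _START_RANGES) or (
            index > 0 and _in_ranges(code_point, _EXTRA_RANGES)
        )
        # "U" plus six uppercase hex digits, e.g. ":" (U+003A) -> "U00003A".
        out.append(char if allowed else "U%06X" % code_point)
    return "".join(out)


print(coerce_local_name("xlink:href"))  # xlinkU00003Ahref
```

Note that the mapping is not reversible in general, since a name that already contains text like "U00003A" is left unchanged.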

- Reece

Received on Thursday, 29 December 2022 16:29:58 UTC