- From: Jirka Kosek <jirka@kosek.cz>
- Date: Fri, 23 Dec 2022 23:31:00 +0100
- To: Michael Kay <mike@saxonica.com>, "public-xslt-40@w3.org" <public-xslt-40@w3.org>
- Message-ID: <47528d4c-a4b6-2642-294c-cd282590a257@kosek.cz>
On 22.12.2022 1:06, Michael Kay wrote:
> I've just been running a few new tests on our existing parse-html() function on SaxonJ (built on TagSoup) and SaxonCS (built on HtmlAgilityPack) and realising how different they are. I suspect that getting a good level of interoperability (and tests to prove it) for fn:parse-html is going to be challenging!

Hi,

I think it would be good to have parsing consistent with web browsers, which means implementing the HTML5 parsing algorithm.

I have been using the following parser whenever I needed to process HTML5 input with XSLT:

https://about.validator.nu/htmlparser/

Perhaps switching from TagSoup to this parser would give better results, provided that an HTML5-compliant parser is used in the .NET product as well.

Jirka

--
------------------------------------------------------------------
Jirka Kosek e-mail: jirka@kosek.cz http://xmlguru.cz
------------------------------------------------------------------
Professional XML and Web consulting and training services
DocBook/DITA customization, custom XSLT/XSL-FO document processing
------------------------------------------------------------------
Bringing you XML Prague conference http://xmlprague.cz
------------------------------------------------------------------
Received on Friday, 23 December 2022 22:31:16 UTC