Re: Testing parse-html

On 22.12.2022 1:06, Michael Kay wrote:
> I've just been running a few new tests on our existing parse-html() function on SaxonJ (built on TagSoup) and SaxonCS (built on HtmlAgilityPack) and realising how different they are. I suspect that getting a good level of interoperability (and tests to prove it) for fn:parse-html is going to be challenging!
Hi,

I think it would be good to have parsing consistent with web browsers,
which means implementing the HTML5 parsing algorithm. I have been using the
following parser when I needed to process HTML5 input with XSLT:

https://about.validator.nu/htmlparser/


Perhaps switching from TagSoup to this parser would give better results,
provided that an HTML5-compliant parser is used in the .NET product as well.

    Jirka

-- 
------------------------------------------------------------------
   Jirka Kosek      e-mail: jirka@kosek.cz      http://xmlguru.cz

------------------------------------------------------------------
      Professional XML and Web consulting and training services
DocBook/DITA customization, custom XSLT/XSL-FO document processing
------------------------------------------------------------------
     Bringing you XML Prague conference    http://xmlprague.cz

------------------------------------------------------------------

Received on Friday, 23 December 2022 22:31:16 UTC