This tool used to be really useful:

https://www.w3.org/2003/12/semantic-extractor.html

Now it always reports that the markup in the document preceding the root element must be well-formed.

Is it because it doesn't work with HTML5?

Check for yourself.

It reports the same error for all of these sites:

www.RecuperoDati299euro.it 
www.RecuperoDatiRAIDFAsTec.it 
www.Recupero-Dati-NAS-RAID5.it 

The code is pure HTML5, but you always get this message:


Using org.apache.xerces.parsers.SAXParser
Exception net.sf.saxon.trans.XPathException: org.xml.sax.SAXParseException; systemId: http://services.w3.org/tidy/tidy?docAddr=http%3A%2F%2Fwww.Recupero-Dati-NAS-RAID5.it&passThroughXHTML=1; lineNumber: 1; columnNumber: 3; The markup in the document preceding the root element must be well-formed. 
org.xml.sax.SAXParseException; systemId: http://services.w3.org/tidy/tidy?docAddr=http%3A%2F%2Fwww.Recupero-Dati-NAS-RAID5.it&passThroughXHTML=1; lineNumber: 1; columnNumber: 3; The markup in the document preceding the root element must be well-formed.
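
For anyone who wants to reproduce this locally, here is a minimal sketch of a well-formedness check. It assumes, as the trace above suggests, that the extractor hands the tidy output to an XML parser. Python's built-in expat-based SAX parser stands in for Xerces here, so the wording of the message and the reported column will differ from the trace; the function name `first_wellformedness_error` and the sample strings are hypothetical and are not taken from the sites listed above.

```python
import xml.sax


def first_wellformedness_error(markup: str):
    """Return None if `markup` parses as well-formed XML,
    otherwise a short description of the first parse error."""
    try:
        # A bare ContentHandler ignores all events; we only care
        # whether the parser raises.
        xml.sax.parseString(markup.encode("utf-8"), xml.sax.ContentHandler())
    except xml.sax.SAXParseException as exc:
        return "line {}, column {}: {}".format(
            exc.getLineNumber(), exc.getColumnNumber(), exc.getMessage()
        )
    return None


# Hypothetical samples, not the real pages.
xml_style = "<!DOCTYPE html><html><head><title>t</title></head><body><p>ok</p></body></html>"
html_style = "<!doctype html><html><head><title>t</title></head><body><p>ok</p></body></html>"

print(first_wellformedness_error(xml_style))   # well-formed as XML
print(first_wellformedness_error(html_style))  # rejected before the root element
```

A lowercase `<!doctype html>` is valid HTML5 but not valid XML, so an XML parser rejects it before it reaches the root element. Whether that is what happens with the pages above is only a guess; it would need to be confirmed against the actual tidy output.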


What has gone wrong with the semantic extractor?

Thank you

Roberto