Re: & really?

On Thursday, 24 July 2008 at 04:34 -0400, Jon Diamond wrote:
> Please don't mistake this as a suggestion for the markup validation
> tool. Although I would love to be able to feed multiple pages for
> evaluation... but for the
> http://www.w3.org/2003/12/semantic-extractor.html tool... why can't
> you just disregard invalid character entities?

The semantic extractor tool is mostly an XSLT style sheet, so it relies
on its input being well-formed XML. Since the content it processes is
already supposed to be XML (XHTML is an application of XML), the
semantic extractor doesn't try to convert it into XML beforehand, and
thus fails on this well-formedness error.
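
To illustrate why an XML-based tool cannot simply skip the problem, here
is a minimal sketch in Python. It assumes the error in question is a
bare "&" in the markup; the sample strings and the is_well_formed helper
are made up for illustration and are not part of the extractor. Any
conforming XML parser has to stop at the first well-formedness error, so
the style sheet never gets a document to work on.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup: str) -> bool:
    """Return True if the markup parses as XML, False otherwise."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        # A conforming XML parser must reject the document here;
        # it is not allowed to guess and carry on.
        return False


# Hypothetical samples: a bare "&" versus the escaped "&amp;" form.
bad = '<p><a href="page?a=1&b=2">Tom & Jerry</a></p>'
good = '<p><a href="page?a=1&amp;b=2">Tom &amp; Jerry</a></p>'

print(is_well_formed(bad))
print(is_well_formed(good))
```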

You can see what you would get with well-formed content at:
http://www.w3.org/2005/08/online_xslt/xslt?xmlfile=http%3A%2F%2Fcgi.w3.org%2Fcgi-bin%2Ftidy%3FdocAddr%3Dhttp%253A%252F%252Fwww.imageworksstudio.com%252F&xslfile=http%3A%2F%2Fwww.w3.org%2F2002%2F08%2Fextract-semantic.xsl
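
That address chains two services: the page is first passed through the
tidy service, and the tidied result is then given to the online XSLT
service together with the extractor style sheet. A rough sketch of how
such an address can be put together in Python follows; the function name
is hypothetical, and the service URLs and parameter names are simply the
ones visible in the address above. The page address ends up
percent-encoded twice because it is nested inside another parameter.

```python
from urllib.parse import quote

# Service endpoints and style sheet, as they appear in the address above.
TIDY = "http://cgi.w3.org/cgi-bin/tidy"
XSLT = "http://www.w3.org/2005/08/online_xslt/xslt"
STYLESHEET = "http://www.w3.org/2002/08/extract-semantic.xsl"


def extractor_url_via_tidy(page: str) -> str:
    """Build an address that tidies a page, then runs the style sheet on it."""
    # First level: the page address becomes the docAddr parameter of tidy.
    tidy_url = TIDY + "?docAddr=" + quote(page, safe="")
    # Second level: the whole tidy address becomes the xmlfile parameter,
    # so the page address inside it is encoded a second time.
    return (
        XSLT
        + "?xmlfile=" + quote(tidy_url, safe="")
        + "&xslfile=" + quote(STYLESHEET, safe="")
    )


print(extractor_url_via_tidy("http://www.imageworksstudio.com/"))
```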

HTH,

Dom

Received on Thursday, 24 July 2008 12:15:35 UTC