Re: after installing personal validator, cannot validate xhtml

On Jun 18, 2010, at 8:27 AM, john gale wrote:

> On Jun 18, 2010, at 8:13 AM, Andreas Prilop wrote:
> 
>> On Thu, 17 Jun 2010, john gale wrote:
>> 
>>> it cannot seem to validate any XHTML (1.0 strict, transitional, 1.1)
> ...
> In my previous email I gave an example of what I had entered by source that failed.
> 
>> <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
> 
> In the w3.org validator this gives one or two errors about not having any body content, but in my personal validator it throws many numbers of errors containing invalid line numbers.
> 
> I don't have the examples that I'm using currently, because it's a validator for an internal network, validating pages that are not exposed to the external internet.  But, as an example of something else it fails to validate, http://calbach.org is the closest I can see.
> 
> I'm trying to determine if it's the perl modules that are failing somehow.  It was a hack to install OpenSP and the Parser::SGML::OpenSP module onto this machine given new versions with crooked dependencies, so maybe that's throwing the validator.


There are a few things here that seem odd:

• this site validates plain HTML (2, 3, and 4) fine

• the first error is a non-SGML character error with a line number that makes no sense (strictly speaking the line exists, but it is blank). The validator reports the exact same line number whether I validate a real web page (calbach.org) or paste the bare DOCTYPE line as raw source:

> Line 195, Column 23: reference to non-SGML character


• after it displays tons of errors with similarly implausible line numbers, it eventually starts reporting bogus errors against real line numbers:

Line 864, Column 26: omitted tag minimization parameter can be omitted only if OMITTAG NO is specified
Line 867, Column 10: character ":" invalid: only "CDATA", "ENTITIES", "ENTITY", "ID", "IDREF", "IDREFS", "NAME", "NAMES", "NMTOKEN", "NMTOKENS", "NOTATION", "NUMBER", "NUMBERS", "NUTOKEN", "NUTOKENS" and parameter separators allowed
Line 3, Column 13: there is no attribute "XMLNS"
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">


• the source listing, when displayed, shows correctly with line numbers that look fine.  I cannot see where it thinks a "line 867" is coming from.

• when I list my CPAN modules with `perldoc perllocal`, it shows a missing version for my SGMLParser, although my OpenSP seems fine:

Fri Jun 18 12:05:38 2010: "Module" SGMLParser
"installed into: /Library/Perl/5.10.0"
"LINKTYPE: dynamic"
"VERSION: "

Wed Jun  2 17:07:50 2010: "Module" SGML::Parser::OpenSP
"installed into: /Library/Perl/5.10.0"
"LINKTYPE: dynamic"
"VERSION: 0.994"

• running `cpan test <module>` for SGML::Parser::OpenSP and XML::LibXML passes fine, although no tests are defined for SGML::Parser


I still suspect a problem with my OpenSP installation, since everything suggests that it is not parsing the SGML correctly.  Other than running the cpan unit tests, which seem to pass, is there a way to verify that the validator is parsing XML (and XHTML) documents correctly?  Is there more debug output that I can enable in the check.pl script?  Are there any suggestions for the best places to add my own debug output to the check script?  Since it uses so many modules that I'm not familiar with, it's rather hard to read.
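
As a starting point for my own debugging, here is a minimal sketch (plain Python, not part of the validator) of the check I have been doing by hand: take the reported errors and separate those whose line number exists in the submitted source from those whose line number does not. The function name `split_errors` and the three-line sample page are made up for illustration; the error messages are the ones quoted above.

```python
import re

# Matches validator output such as
# "Line 195, Column 23: reference to non-SGML character"
ERROR_RE = re.compile(r"^Line (\d+), Column (\d+): (.*)$")


def split_errors(report, source):
    """Split reported errors by whether their line number exists in source.

    Returns (inside, outside), each a list of (line, column, message).
    """
    n_lines = len(source.splitlines())
    inside, outside = [], []
    for raw in report.splitlines():
        m = ERROR_RE.match(raw.strip())
        if not m:
            continue  # skip source excerpts and blank lines
        entry = (int(m.group(1)), int(m.group(2)), m.group(3))
        if 1 <= entry[0] <= n_lines:
            inside.append(entry)
        else:
            outside.append(entry)
    return inside, outside


# Hypothetical three-line page, only for illustration.
source = (
    '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" '
    '"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">\n'
    "\n"
    '<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">\n'
)

# Messages copied from the validator output quoted above.
report = '''Line 195, Column 23: reference to non-SGML character
Line 864, Column 26: omitted tag minimization parameter can be omitted only if OMITTAG NO is specified
Line 3, Column 13: there is no attribute "XMLNS"
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
'''

inside, outside = split_errors(report, source)
print("line exists in source:", [e[0] for e in inside])
print("line not in source:   ", [e[0] for e in outside])
```

This only sorts the messages; it does not explain where the out-of-range line numbers come from, which is the part I am still stuck on.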

Thanks for any help you can give,

	~ john

Received on Friday, 18 June 2010 19:29:18 UTC