Re: XHTML validation bug (false pass)

From: David Brownell <david-b@pacbell.net>
Date: Sun, 20 Feb 2000 11:22:54 -0500 (EST)
Message-ID: <38B014DA.F9DFF181@pacbell.net>
To: Terje Bless <link@tss.no>
Cc: W3C Validator <www-validator@w3.org>

Terje Bless wrote:
> 
> On 19.02.00 at 20:10, David Brownell <david-b@pacbell.net> wrote:
> 
> >Basically, it accepts blank lines before XML/text declarations, which are
> >explicitly not permitted in the grammar:  no whitespace before either of
> >those, and the same syntax elsewhere in the document body (e.g. after the
> >newline) counts as an illegal processing instruction
> 
> I'm not really an SGML/XML wizard; could you elaborate? All the actual
> validation is passed on to SP, but it's possible we can either force SP to
> catch it or do it internally in the pre-processing pass. I'm just not clear
> on what it is that is getting passed through that shouldn't be. Is it blank
> lines before the "<?xml?>" bit? Whitespace inside multiline processing
> instructions?

I don't know what you mean by "passed through" since that'd imply I knew
more about the code for that validator.  Same bug, different words ... the
following isn't reported as a fatal error:

	Line 1:		
	Line 2:		<?xml version="1.0"?>
	Lines 3-N:	irrelevant

That input is malformed, yet no error is reported, which is the problem.
I didn't try any related inputs to characterize the bug any further.
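
To make the failing case concrete, here is a minimal sketch in Python of
how a conformant parser treats that input.  It uses the standard library's
expat binding, not SP or the validator's own code, so it only illustrates
the expected behaviour.  The helper names (check_well_formed,
declaration_misplaced) are made up for this example, and the second one is
just a guess at what a check in a pre-processing pass might look like.

```python
import xml.parsers.expat


def check_well_formed(document: bytes):
    """Return None if expat accepts the document, else an error string."""
    parser = xml.parsers.expat.ParserCreate()
    try:
        parser.Parse(document, True)  # True marks this as the final chunk
    except xml.parsers.expat.ExpatError as err:
        reason = xml.parsers.expat.ErrorString(err.code)
        return f"line {err.lineno}: {reason}"
    return None


def declaration_misplaced(document: bytes) -> bool:
    """Hypothetical cheap pre-check: '<?xml ' found anywhere but offset 0.

    Rough on purpose: it does not account for a byte order mark, for
    encodings that are not ASCII-compatible, or for whitespace other than
    a space after the target name.
    """
    return document.find(b"<?xml ") > 0


good = b'<?xml version="1.0"?>\n<root/>'
bad = b'\n<?xml version="1.0"?>\n<root/>'  # blank line 1, declaration on line 2

for name, doc in (("good", good), ("bad", bad)):
    print(name, check_well_formed(doc), declaration_misplaced(doc))
```

The parser-based check is the one to trust; the string search is only
there to show that the blank-line case is easy to catch before the
document ever reaches the parser.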

It does make me wonder how many other illegal XML constructs are passing
through there.  Has anyone run the OASIS/NIST XML test cases against the
processor you're using?  I've got an updated copy, which a few folk have
sanity checked.  (Last month's overdue publication of XML errata caused
a few of the original cases to need updating.  It now lists three validity
constraints that weren't in the original XML 1.0 spec.)


> >which _must_ cause a fatal error.
> 
> Well, in general, throwing fatal exceptions isn't really useful behaviour
> for a validation tool. Is there some reason this should be changed in this
> case?

To report the error?  Absolutely -- it's telling folk that seriously
broken XML is valid, when it's not even well formed.

It should at least be telling them that any conformant XML tool will
refuse to even _read_ the document.  As it is, it's giving them a W3C
stamp of approval -- wrong answer!


> I'm aware that the XML spec demands that you terminate processing if the
> document is not well formed, but should this even be extended to tools
> whose aim is to help you _make_ the document well formed? Wouldn't that be
> counter-productive?

The XML spec actually demands that, after a fatal error, you stop passing
anything but error reports to the application ... which allows for what
you're implying.

Of course, after the first non-recoverable error, any other error report
will be questionable ... though I don't think it'll be like in some
computer languages I've worked with, where the _real_ error might be
the fifth one to get reported in lexical order!!  ;-)

- Dave
Received on Sunday, 20 February 2000 19:31:57 GMT
