
Re: WWW-Validator Bug (response to private mail on other topic)

From: Earl Hood <ehood@hydra.acs.uci.edu>
Date: Thu, 17 Jun 1999 13:11:55 -0700
Message-Id: <199906172011.NAA06118@geneva.acs.uci.edu>
To: www-validator@w3.org
On June 17, 1999 at 00:13, Claus Färber wrote:

> Earl Hood <ehood@hydra.acs.uci.edu> wrote:
> > (I'm unclear why the document is split into an array).  If the data is
> > passed in as a single string, a comment stripping regex:
> >
> > 	s/<!--([^-]|-[^-])*--\s*>//go;
> 
> This is not true either. It would only be valid if there were no elements
> that could contain CDATA.

Since we are restricting ourselves to HTML and XML, there are no
CDATA elements.  If I remember correctly, XML does not support CDATA
elements.  Now, CDATA marked sections would be a more valid argument.
However, since the goal of the code in question is just to find the
doctype declarations, CDATA marked sections are a non-issue.

BTW, within the context of just trying to find the doctype declaration,
any CDATA elements would not matter either.

> And it won't catch legal comment syntax:
> 
> <!-- comment 1 --
>   -- comment 2 -->

That is what I get when I cut-n-haste from some code w/o checking the
context in which the regex was being used.  Here is probably a more
appropriate regex:

    s/<!(?:--(?:[^-]|-[^-])*--\s*)+>//go
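
As a rough illustration of how the revised regex might be used to find the
doctype declaration, here is a minimal sketch.  The helper names
strip_comments and find_doctype and the sample document are hypothetical,
not taken from the validator code, and the doctype pattern is a
simplification that assumes the declaration has no internal subset
containing ">".

```perl
use strict;
use warnings;

# Hypothetical helper: strip SGML comment declarations from a document
# held in a single string, using the revised regex from this message.
sub strip_comments {
    my ($doc) = @_;
    $doc =~ s/<!(?:--(?:[^-]|-[^-])*--\s*)+>//g;
    return $doc;
}

# Hypothetical helper: return the first doctype declaration left after
# comment stripping, or undef if there is none.  Simplified: assumes the
# declaration contains no ">" before its closing delimiter.
sub find_doctype {
    my ($doc) = @_;
    my $stripped = strip_comments($doc);
    return $stripped =~ /(<!DOCTYPE\b[^>]*>)/i ? $1 : undef;
}

# Sample input: a two-comment declaration, a commented-out doctype,
# and then the real doctype.
my $sample = <<'END';
<!-- comment 1 --
  -- comment 2 -->
<!-- <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN"> -->
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0//EN">
<html><head><title>t</title></head><body></body></html>
END

my $doctype = find_doctype($sample);
print defined $doctype ? "$doctype\n" : "no doctype found\n";
```

With the sample above, both comment declarations are removed before the
search, so the commented-out 3.2 doctype is skipped and the 4.0 one is
reported.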

--ewh
Received on Thursday, 17 June 1999 16:12:03 GMT
