- From: Earl Hood <ehood@hydra.acs.uci.edu>
- Date: Thu, 17 Jun 1999 13:11:55 -0700
- To: www-validator@w3.org
On June 17, 1999 at 00:13, Claus Färber wrote:
> Earl Hood <ehood@hydra.acs.uci.edu> wrote:
> > (I'm unclear why the document is split into an array). If the data is
> > passed in as a single string, a comment stripping regex:
> >
> > s/<!--([^-]|-[^-])*--\s*>//go;
>
> This is not true either. It would only be valid if there were no elements
> that could contain CDATA.
Since we are restricting ourselves to HTML and XML, there are no
CDATA elements. If I remember correctly, XML does not support CDATA
elements. Now, CDATA marked sections would be a more valid argument.
However, since the goal of the code in question is just to find the
doctype declaration, CDATA marked sections are a non-issue.
BTW, within the context of just trying to find the doctype declaration,
any CDATA elements would not matter either.
> And it won't catch legal comment syntax:
>
> <!-- comment 1 --
> -- comment 2 -->
That is what I get when I cut-n-haste from some code w/o checking the
context in which the regex was being used. Here is a probably more
appropriate regex:
s/<!(?:--(?:[^-]|-[^-])*--\s*)+>//go
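To see the difference between the two patterns, here is a rough sketch that ports both to Python's re module (the thread itself is about Perl, so the port is only for illustration). The names SINGLE, MULTI and strip_comments are made up for this example, as is the sample document; it only assumes the data arrives as a single string.

```python
import re

# First pattern from this thread: exactly one "-- ... --" comment per declaration.
SINGLE = re.compile(r"<!--([^-]|-[^-])*--\s*>")

# Revised pattern: one or more "-- ... --" comments inside a single "<! ... >".
MULTI = re.compile(r"<!(?:--(?:[^-]|-[^-])*--\s*)+>")


def strip_comments(text, pattern=MULTI):
    """Return text with every match of pattern removed (hypothetical helper)."""
    return pattern.sub("", text)


# The two-comment declaration quoted above, followed by a doctype declaration.
doc = (
    "<!-- comment 1 --\n"
    "-- comment 2 -->"
    '<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0//EN">'
)

stripped_multi = strip_comments(doc)           # comment declaration removed
stripped_single = strip_comments(doc, SINGLE)  # text comes back unchanged
```

With the revised pattern only the doctype declaration is left in the string, while the first pattern finds nothing to remove because it expects ">" right after the first closing "--".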
--ewh
Received on Thursday, 17 June 1999 16:12:03 UTC