- From: Earl Hood <ehood@hydra.acs.uci.edu>
- Date: Thu, 17 Jun 1999 13:11:55 -0700
- To: www-validator@w3.org
On June 17, 1999 at 00:13, Claus Färber wrote:

> Earl Hood <ehood@hydra.acs.uci.edu> schrieb/wrote:
> > (I'm unclear why the document is split into an array). If the data is
> > passed in as a single string, a comment stripping regex:
> >
> >     s/<!--([^-]|-[^-])*--\s*>//go;
>
> This not true either. It would only be valid if there were no elements
> that could contain CDATA.

Since we are restricting ourselves to HTML and XML, there are no CDATA
elements. If I remember correctly, XML does not support CDATA elements.

Now, CDATA marked sections would be a more valid argument. However,
since the goal of the code in question is just to find the doctype
declarations, CDATA marked sections are a non-issue.

BTW, within the context of just trying to find the doctype declaration,
any CDATA elements would not matter either.

> And it won't catch legal comment syntax:
>
>     <!-- comment 1 --
>          -- comment 2 -->

That is what I get when I cut-n-haste from some code w/o checking
the context in which the regex was being used. Here is probably a more
appropriate regex:

    s/<!(?:--(?:[^-]|-[^-])*--\s*)+>//go

--ewh
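For illustration, the difference between the two expressions can be sketched outside Perl. The snippet below is a Python transliteration, which is an assumption of this example (the originals are Perl substitutions), and the names SINGLE_COMMENT, MULTI_COMMENT and strip_comments are hypothetical. It only shows how the patterns treat a comment declaration holding two comments; it is not a full SGML parser.

```python
import re

# First pattern from the thread: exactly one "-- ... --" comment
# per "<! ... >" declaration.
SINGLE_COMMENT = re.compile(r"<!--([^-]|-[^-])*--\s*>")

# Revised pattern: one or more "-- ... --" comments, each optionally
# followed by whitespace, inside a single "<! ... >" declaration.
MULTI_COMMENT = re.compile(r"<!(?:--(?:[^-]|-[^-])*--\s*)+>")


def strip_comments(text, pattern=MULTI_COMMENT):
    """Return text with every match of the comment pattern removed."""
    return pattern.sub("", text)


# The two-comment declaration quoted in the message.
two_part = "<!-- comment 1 --\n     -- comment 2 -->"

if __name__ == "__main__":
    # The first pattern finds nothing to strip here; the revised one does.
    print(SINGLE_COMMENT.search(two_part) is None)
    print(repr(strip_comments(two_part + "<!DOCTYPE html>")))
```

Since "<!DOCTYPE" is not followed by "--", the revised pattern leaves the doctype declaration alone, which is all the doctype search needs.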
Received on Thursday, 17 June 1999 16:12:03 UTC