
Re: Problem with SSI in form elements

From: Sebastian Lange <lange@cyperfection.de>
Date: Fri, 25 Aug 2000 09:40:49 +0200
To: html-tidy@w3.org
An shtml file does not need to be "syntactically legal" according to the DTD,
as long as the parsed OUTPUT of that file is, right?

I think what Chris asked for was some basic support for Server Side 
Includes, similar to Tidy's support for PHP and *cough* ASP...

Chris... as the only workaround for the time being, I suggest that you pre- and 
post-process the files parsed by Tidy.

I did not test the following code, and there certainly are better ways... 
but I think it should work. Perl5, by the way:

In PREPROCESS, you have to do something like:

$to_be_tidied_data = "file content sent to tidy";
# replace all <!--#...--> blocks with a placeholder string
my @SSIStack;
my $i = 0;
# each pass swaps the first remaining SSI block for a numbered placeholder
while ($to_be_tidied_data =~ s/(<!--\#.*?-->)/SSIStackPlaceHolder\[$i\]/s) {
   $SSIStack[$i] = $1;   # remember the original block for POSTPROCESS
   $i++;
}

In POSTPROCESS, do this then:

$tidied_data = "output coming from tidy";
# replace all placeholder strings with their appropriate <!--#...--> blocks
for ($i=0; $i < scalar @SSIStack; $i++) {
   $tidied_data =~ s/SSIStackPlaceHolder\[$i\]/$SSIStack[$i]/s;
}
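
To see the idea end to end, here is a small self-contained sketch that runs 
the two steps on the <input> element from the original question. The helper 
names hide_ssi and restore_ssi are made up for this example, and Tidy itself 
is not called: the string in $hidden is simply what you would hand to Tidy.

```perl
use strict;
use warnings;

# Swap every <!--#...--> directive for a numbered placeholder and
# keep the original text so it can be restored afterwards.
# (hide_ssi is a hypothetical helper name, not part of Tidy.)
sub hide_ssi {
    my ($html) = @_;
    my @stack;
    $html =~ s{(<!--\#.*?-->)}{
        push @stack, $1;
        'SSIStackPlaceHolder[' . $#stack . ']';
    }gse;
    return ($html, \@stack);
}

# Put the saved directives back in place of their placeholders.
# (restore_ssi is likewise a hypothetical helper name.)
sub restore_ssi {
    my ($html, $stack) = @_;
    $html =~ s/SSIStackPlaceHolder\[(\d+)\]/$stack->[$1]/ge;
    return $html;
}

my $input = '<input type="text" name="referer" '
          . 'value="<!--#echo var="HTTP_REFERER"-->" />';

my ($hidden, $stack) = hide_ssi($input);
# $hidden is what Tidy would be given; Tidy is not run in this sketch.
my $restored = restore_ssi($hidden, $stack);

print "$hidden\n$restored\n";
```

The point of the placeholder is that the attribute becomes an ordinary quoted 
string while Tidy looks at it, so the nested quotation marks never reach the 
parser. This assumes Tidy leaves the placeholder text itself untouched.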

At 15:19 25.08.2000 +1200, Richard A. O'Keefe wrote:
>         If you use SSI in form elements like this:
>         <input type="text" name="referer"
>                value="<!--#echo var="HTTP_REFERER"-->" />
>From the /> at the end, I see this is XHTML.
>Looking in the XML 1.0 spec (actually the latest revised draft),
>we find in section 2.3 that
>[10] AttValue ::= '"' ([^<&"] | Reference)* '"'
>                |  "'" ([^<&'] | Reference)* "'"
>That is, within an attribute value,
>     ampersand is reserved for introducing character/entity references,
>     the quotation mark you started with is forbidden, and
>     the less-than sign is forbidden outright.
>I don't really see the point in this restriction.  After all, we're
>talking about the inside of a string, and there is nothing else that
>less than could mean there.  Perhaps it was precisely so that people
>*could* write stuff like this.
>What's more, if we look at it, what we see is
>         [value] [=] ["<!--#echo var="] [HTTP_REFERER] ["-->"]
>which would not be syntactically legal XHTML even if XML *did*
>allow less than signs inside attribute values, because it would
>be two attribute bindings, one of them missing an equal sign.
>Would it be possible to change this to
>         value='<!--#echo var="HTTP_REFERER"-->'
>where the quotation marks are different?
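
For anyone who wants to try candidate values against production [10] quoted 
above, the following is a rough Perl sketch of that rule as a regular 
expression. The name is_att_value is made up for this example, and the Name 
part of an entity reference is approximated with ASCII characters only, which 
is narrower than what XML 1.0 actually allows.

```perl
use strict;
use warnings;

# Reference ::= EntityRef | CharRef
# Assumption: Name is approximated with ASCII characters only.
my $reference = qr/&(?:[A-Za-z_:][-A-Za-z0-9._:]*|\#[0-9]+|\#x[0-9a-fA-F]+);/;

# [10] AttValue ::= '"' ([^<&"] | Reference)* '"'
#                |  "'" ([^<&'] | Reference)* "'"
my $att_value = qr/\A(?:"(?:[^<&"]|$reference)*"|'(?:[^<&']|$reference)*')\z/;

# Returns 1 if the quoted string matches the production, 0 otherwise.
# (is_att_value is a hypothetical helper name.)
sub is_att_value {
    my ($candidate) = @_;
    return $candidate =~ $att_value ? 1 : 0;
}

for my $candidate (
    q{"plain text"},
    q{"a &amp; b"},
    q{"<!--#echo var="},
    q{'<!--#echo var="HTTP_REFERER"-->'},
) {
    printf "%-40s %s\n", $candidate,
        is_att_value($candidate) ? "matches" : "does not match";
}
```

Under this rule the single-quoted variant tokenizes as one attribute, but it 
still does not match AttValue because of the less-than sign. So the unparsed 
file stays outside the grammar either way until the server has expanded the 
directive.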

Sebastian Lange
Maybe the first chat site that validates as HTML
4.0 even though user input may contain HTML codes.

Courtesy to Dave Raggett's HTML Tidy:

Tidy your documents ONLINE:
Received on Friday, 25 August 2000 03:45:11 UTC
