W3C home > Mailing lists > Public > html-tidy@w3.org > October to December 2000

HTML Tidy: Wishes & Bug

From: Axel Beckert <abe@cs.uni-sb.de>
Date: Thu, 21 Dec 2000 20:54:36 -0500 (EST)
To: Dave Raggett <html-tidy@w3.org>
Message-ID: <20001222025423.C12305@fsinfo.cs.uni-sb.de>
Hi!

Due to heavily using server side scripting, I've got several wishes
regarding Tidy:

+ Tidy should have a possiblity to ignore all SSI (server-side
  include) commands inside HTML documents, especially, if they
  occur in some unsual positions like parameter values.

  SSI Syntax: <!--#command [param|param=value]* -->

  See http://httpd.apache.org/docs/mod/mod_include.html for details.

  Example: Tidy argues (better to say, it tries to argue) about
  <A HREF="foobar<!--#if expr="x$QUERY_STRING != x" -->?<!--#echo var="QUERY_STRING" --><!--#endif -->">foobar</A>

  But instead of saying, where the error exactly is, it crashes. The
  out after trying to parse a line like above looks like this:

line 310 column 16 - Warning: <a> unknown attribute value "Segmentation fault

  (OK, for an HTML parser this line looks really awkward. :-) If you
  need any more details on this bug, just tell me, what you exactly
  need...

+ Tidy should have a possiblity to ignore all Embperl code inside HTML
  documents.

  Embperl Syntax: [- commands -], [+ commands +], [* commands *], 
                  [! commands !], [$ meta-commands $] and [# comments #]

  See http://perl.apache.org/embperl/Embperl.pod.cont.html for details.

  Problem: Tidy modifies PERLs <=, <, >, >=, <=> operators as well as
  the <> operator, used for reading from input streams, e.g. <STDIN>.

  Due to Embperl can be configured to reconstruct such things as &lt;
  to valid PERL code, this is not so important. But coding and
  debugging would be easier, if Tidy would not screw up the Embperl
  code, especially if you develope and test code not being tiddied,
  but the server executes the tiddied version of that code.

+ Tidy should have an option for not stripping of characters with
  ascii code lower than 32. This is useful, if you use embedded PERL
  code (doesn't matter if ePerl or EmbPerl) in HTML documents, because
  of PERL having single character variable names being control
  characters, e.g. $^O for the operating system or $^X for the file
  name of the PERL binary.

  See the perlvar man page for details.

  Due to there being the work-around with "use English;", this is not
  so important...

P.S.: Thanks for providing such useful tool as Tidy. :-)

		Regards, Axel
-- 
Axel Beckert - abe@cs.uni-sb.de - http://abe.home.pages.de/
Student of Computer Science, University of Saarland (Germany)
Artificial Intelligence Laboratory (AI Lab), Prof. Dr. W. Wahlster;
WWW-Administrator IBFI Schloss Dagstuhl; Students Representative
Received on Saturday, 23 December 2000 03:10:45 GMT

This archive was generated by hypermail 2.2.0+W3C-0.50 : Tuesday, 3 April 2012 06:13:45 GMT