Re: PHP Script - Validator Soap Request Producing Endless Slope?

On 10/26/07, Grunzwanzling@gmx.li <Grunzwanzling@gmx.li> wrote:
>
> Hi,
>
> I am new to this mailing list and have not subscribed yet, as I just want
> to ask a question and did not know where else to look for help.
>
> Currently, I want to build a PHP plugin for Wordpress that simply
> checks on loading of each page if it is XHTML 1.0 valid.

This sounds like a noble effort, but I really think you should
reconsider this implementation. The contents of a page should not
change on each load, so checking validity *once*, when a new post is
published, should be sufficient. We should all be careful when using
a shared resource such as the W3C's HTML validation service.
Re-validating the same page on every load is excessive and
unnecessary if the content has not changed.

There must be some way to run a plugin only when a new post is added
and to store the result of the validation.
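
As a rough illustration of "validate once, store the result", here is a
minimal sketch. The helper name cached_validity() and the stand-in
validator are my own assumptions, not part of WordPress or the W3C API;
the array stands in for whatever persistent storage the plugin would use.

```php
<?php
// Hypothetical helper (an assumption for illustration only):
// run the expensive check once per distinct content and remember the result.
function cached_validity($content, callable $validate, array &$cache)
{
    $key = md5($content);                 // key changes only when the content changes
    if (!array_key_exists($key, $cache)) {
        // The only place the (slow, shared) validator is actually called
        $cache[$key] = (bool) $validate($content);
    }
    return $cache[$key];
}

// Stand-in validator that counts how often it is called.
$calls = 0;
$fake_validator = function ($content) use (&$calls) {
    $calls++;
    return strpos($content, '<html') !== false;
};

$cache  = array();
$first  = cached_validity('<html></html>', $fake_validator, $cache);
$second = cached_validity('<html></html>', $fake_validator, $cache);
// $calls is still 1 here: the second lookup was served from the cache.
```

In a WordPress plugin I would expect the check to be triggered from an
action such as publish_post and the result kept with the post, but I have
not tried that myself.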

> Therefore, I
> call a function from the sidebar that establishes a connection to the
> validator's SOAP API via the Snoopy() class (somewhat a "virtual
> browser" class). If the result via SOAP returns true, a button will be
> shown, if not, there will be a text that the page is invalid.
>
> The script works SOMETIMES, but mostly it times out. If you look at
> the code below, you will find a $validation_time. Normally (without
> the script implemented), it takes about 3 secs for the validator to
> check the code on my site www.saphod.net. With the PHP-Script
> activated, it takes AGES - and by calling the php script within my
> page it takes even longer.
>
> Here is what I believed so far:
> As the function calls the validator ON LOADING, the validator will
> eventually call itself, thus producing an endless loop.
>
> For this reason, I implemented an if-condition that gets the
> IP-address of the remote host  - in case of the validator, this should
> be 128.30.52.49, right?

I think this is incorrect: the W3C maintains multiple servers running
the validator service. I do not know all the IP addresses, but in this
case you could check the user agent string in
$_SERVER['HTTP_USER_AGENT'] for "W3C_Validator".
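
A minimal sketch of that check, assuming the validator's user agent
string contains "W3C_Validator" (the function name is_w3c_validator() and
the $should_validate flag are made up for this example):

```php
<?php
// Hypothetical helper: true when the User-Agent header looks like the W3C validator.
function is_w3c_validator($user_agent)
{
    // Case-insensitive substring match, so any version suffix is accepted
    return stripos((string) $user_agent, 'W3C_Validator') !== false;
}

// The header may be missing (e.g. on the command line), so fall back to ''.
$user_agent = isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : '';

// Only contact the validator when the visitor is NOT the validator itself;
// otherwise each validation request would trigger another one.
$should_validate = !is_w3c_validator($user_agent);
```

The important part is to make this decision before contacting the
validator, not after the fetch has already happened.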

>
> Well, it made no difference. Still, the script takes AGES.
>
> Plus:
> reviewing my access.log, I realize that I get LOTS of GET-Requests
> from the validator at the same time. This sounds to me as if more than
> one instance tries to validate the code.
>
> I do not know where to look next.
> So, I printed the code of my script below.
> I hope somebody can help me out.
>
> Please keep in mind that I am not (yet) subscribed, so please drop me
> a copy of your email.
>
> Thanks so much in advance!
>
> Best wishes,
> Marco Luthe
> www.saphod.net
>
>
> ******** Script code follows ******************
> <?php
>
> /*
> Show XHTML validity
> most of the code is from the batch validator plugin by
> http://wordpress.designpraxis.at/plugins/batch-validator/
> */
>
> function XHTML_validate_page($timeout_in_sec=NULL){
>
>        // include the Snoopy class which simulates a web browser
>        require_once (ABSPATH . WPINC . '/class-snoopy.php');
>
>        // create a new Snoopy object (our "virtual browser");
>        // this must exist before its timeouts can be set
>        $client = new Snoopy();
>
>        // when given, change the timeout; otherwise keep the default
>        if (!empty($timeout_in_sec)) {
>                $client->read_timeout = $timeout_in_sec;
>                $client->_fp_timeout = $timeout_in_sec;
>        }
>
>        // Define URI variables
>        $w3validator = "http://validator.w3.org/check?uri=";
>        $uri = $w3validator . urlencode($_SERVER['HTTP_HOST'] .
> $_SERVER['REQUEST_URI']);
>        $uri_soap = $uri . "&output=soap12";
>
>        // start time of validation
>        $validation_start = microtime(true);
>
>        // Validate the URI with validator.w3.org using the SOAP API (see
> http://validator.w3.org/docs/api.html for more info)
>        // But: if it is the validator itself, then do nothing! Otherwise,
> there might be an endless loop and a timeout.
>        // The check comes first so that no fetch is made for the validator.
>        if (!($_SERVER['REMOTE_ADDR'] == "128.30.52.49" ||
> (isset($_SERVER['REMOTE_HOST']) && $_SERVER['REMOTE_HOST'] ==
> "validator.w3.org")) && @$client->fetch($uri_soap)) {
>
>                // Save the results
>                $data = $client->results;
>
>                // Just for debugging reasons
>                /*
>                @$client->fetchtext($uri_soap);
>                $data_text = $client->results;
>                */
>
>                // Search for "m:validity" and "m:doctype" in the results (again,
> see the validator API)
>                $data = explode("\n",$data);
>
>                // Can I optimize these two foreach loops into one?
>                foreach ($data as $buffer) {
>                        if (eregi("m:doctype",$buffer)) {
>                                $doctype = trim(strip_tags($buffer));
>                                break;
>                        }
>                }
>                foreach ($data as $buffer) {
>                        if (eregi("m:validity",$buffer)) {
>                                $validity = trim(strip_tags($buffer));
>                                break;
>                        }
>                }
>
>                // again: for debugging reasons
>                /*
>                echo "Results:\n";
>                echo "Remote Adress: " . $_SERVER['REMOTE_ADDR'];
>                echo "Validity: " . $validity;
>                echo "Doctype: " . $doctype;
>                echo eregi("XHTML 1.0",$doctype);
>                echo "Text: \n" . print($data_text);
>                */
>
>                if (!($client->timed_out)) {
>                        $validation_end = microtime(true);
>                        if($validity == "true" && eregi("XHTML 1.0",$doctype)) {
>                                // Page is valid XHTML 1.0 transitional when $validity is true and
> doctype is like "XHTML 1.0", so use the XHTML 1.0 icon as shown on
> w3.org
>                                echo "<a href=\"" . $uri . "&amp;ss=1\" target=\"_blank\"><img
> src=\"http://www.w3.org/Icons/valid-xhtml10\" alt=\"Valid XHTML 1.0
> Transitional\" height=\"31\" width=\"88\" /></a>";
>                        } else {
>                                // Page is not valid XHTML 1.0, print a message
>                                $error_message="Page is not valid XHTML 1.0<br />Automatic
> validation failed.<br />[<a href=\"" . $uri . "&amp;ss=1\"
> target=\"_blank\">check manually</a>]";
>                                echo $error_message;
>                        }
>                        $validation_time = round($validation_end - $validation_start);
>                        echo "<br />(" . $validation_time . " sec)";
>                } else {
>                        $timeout_message="Is this page valid XHTML 1.0?<br />Automatic
> validation timed out.<br />[<a href=\"" . $uri . "&amp;ss=1\"
> target=\"_blank\">check manually</a>]";
>                        echo "<small>" . $timeout_message . "</small>";
>                }
>        } else {
>                echo "An error occurred while<br />trying to automatically<br
> />validate this site.<br />[<a href=\"" . $uri . "&amp;ss=1\"
> target=\"_blank\">check manually</a>]";
>        }
> }
>
> ?>
>
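
On the question in the script about merging the two foreach loops: a
single pass over the lines is enough. This is only a sketch; the function
name extract_validator_fields() and the sample response are my own
assumptions, and it keeps the script's approach of matching "m:validity"
and "m:doctype" line by line rather than parsing the XML properly.

```php
<?php
// Hypothetical helper: pull m:validity and m:doctype out of the SOAP
// response in one pass over its lines.
function extract_validator_fields($soap)
{
    $fields = array('validity' => null, 'doctype' => null);
    foreach (explode("\n", $soap) as $line) {
        foreach (array('validity', 'doctype') as $name) {
            // Keep the first matching line for each field only
            if ($fields[$name] === null && stripos($line, 'm:' . $name) !== false) {
                $fields[$name] = trim(strip_tags($line));
            }
        }
        // Stop reading as soon as both values have been found
        if ($fields['validity'] !== null && $fields['doctype'] !== null) {
            break;
        }
    }
    return $fields;
}

// Made-up fragment in the shape the script expects, for illustration only.
$sample = "<m:doctype>-//W3C//DTD XHTML 1.0 Transitional//EN</m:doctype>\n"
        . "<m:validity>true</m:validity>\n";
$fields = extract_validator_fields($sample);
```

stripos() does a plain case-insensitive substring search, which is all
the eregi() calls in the script were being used for.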

You might also look into the PEAR package Services_W3C_HTMLValidator,
which you can use with something like:

require_once 'Services/W3C/HTMLValidator.php';
$validator = new Services_W3C_HTMLValidator();
$result = $validator->validate($uri);  // $uri: address of the page to check
if ($result->isValid()) echo 'The page is valid.';

http://pear.php.net/package/Services_W3C_HTMLValidator/


-- 
-Brett Bieber

http://saltybeagle.com aim:ianswerq

Received on Sunday, 28 October 2007 16:28:54 UTC