Good Morning,

I'm trying to validate my JSP pages with the W3C Log Validator (logprocess.pl).

With a log file in the standard Apache combined log format, none of the records whose URL has a query string (for example ?myparam=myparamvalue) are validated.

Example: a log file with two lines:

127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] "GET /default1.jsp?test=test HTTP/1.0" 200 2326 "http://www.example.com/start.html" "Mozilla/4.08 [en] (Win98; I ;Nav)"
127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] "GET /default.jsp HTTP/1.0" 200 2326 "http://www.example.com/start.html" "Mozilla/4.08 [en] (Win98; I ;Nav)"

The first line (default1.jsp) has a query string parameter, and the robot doesn't try to validate that page. I can see this clearly in debug mode.
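
To make the difference between the two lines concrete, here is a minimal Python sketch (not the LogValidator code, which is Perl). It parses both sample lines and compares two ways of checking the extension. It rests on an unverified assumption: that the extension filter might be applied to the whole request target, query string included. The names COMBINED, AUTHORIZED_EXTENSIONS, request_target, matches_full_target and matches_path_only are made up for this illustration, and the extension list is only a subset of the configured one.

```python
import re
from urllib.parse import urlsplit

# Apache combined log format:
# host ident user [time] "method target protocol" status bytes "referer" "agent"
COMBINED = re.compile(
    r'^(\S+) (\S+) (\S+) \[([^\]]+)\] "(\S+) (\S+) (\S+)" '
    r'(\d{3}) (\S+) "([^"]*)" "([^"]*)"$'
)

# Illustrative subset of AuthorizedExtensions (an assumption, not the full config).
AUTHORIZED_EXTENSIONS = (".html", ".htm", ".jsp", "/")

LINES = [
    '127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] '
    '"GET /default1.jsp?test=test HTTP/1.0" 200 2326 '
    '"http://www.example.com/start.html" "Mozilla/4.08 [en] (Win98; I ;Nav)"',
    '127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] '
    '"GET /default.jsp HTTP/1.0" 200 2326 '
    '"http://www.example.com/start.html" "Mozilla/4.08 [en] (Win98; I ;Nav)"',
]


def request_target(line):
    """Return the request target (path plus optional query) of a log line."""
    match = COMBINED.match(line.strip())
    if match is None:
        raise ValueError("not a combined-format log line")
    return match.group(6)


def matches_full_target(target):
    """Extension check on the raw target, query string included."""
    return target.endswith(AUTHORIZED_EXTENSIONS)


def matches_path_only(target):
    """Extension check after the query string has been split off."""
    return urlsplit(target).path.endswith(AUTHORIZED_EXTENSIONS)


for line in LINES:
    target = request_target(line)
    print(target, matches_full_target(target), matches_path_only(target))
```

If the filter behaves like matches_full_target, the first line would be skipped because the target ends in "test" rather than ".jsp". That would be consistent with what I see, but it is only a guess.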


Any suggestions? Thanks

My config file is attached.

The program output in debug mode follows:

logprocess.pl -f /root/logprocess.xxxxxxxxxxxxxx.conf -d
showing general config :
Module  W3C::LogValidator::LinkChecker
Module  LogProcessor
 tmpfile_referers /root/tmp/LogValidator-iJ9InCpL
 tmpfile /root/tmp/LogValidator-EgijP2mA
 verbose 3
 LogFiles ARRAY(0x80620e4)
 MaxDocuments 0
 UseValidationModule W3C::LogValidator::HTMLValidator
 RefererMatch .*
 DirectoryIndex default.jsp index.html index.htm index Overview Overview.html Overview.xhtml
 tmpfile_mime_types /root/tmp/LogValidator-vaZmoZw3
 ServerAdmin anthesisvil@anthesi.it
 MaxInvalid 20
 tmpfile_HTTP_codes /root/tmp/LogValidator-v3E9j34Y
 ServerName www.provincia.roma.it
 QuietIfNoReport 0
 LogType HASH(0x82dc360)
 EntriesPerLogfile 500000
 DocumentRoot /tmp/www.provincia.roma.it/ROOT/
 UseOutputModule W3C::LogValidator::Output::Raw
Module  W3C::LogValidator::CSSValidator
Module  W3C::LogValidator::SurveyEngine
Module  W3C::LogValidator::Basic
Module  W3C::LogValidator::HTMLValidator
 ValidatorHost validator.anthesi.com
 AuthorizedExtensions .html .xhtml .phtml .htm .shtml .php .svg .xml ..jsp /
 ValidatorPostString \;output=xml
 MaxDocuments 0
 MaxInvalid 20
 ValidatorString /check?uri=
 ValidatorMethod GET
 ValidatorPort 80
End of config

Reading logfiles:
        /root/xxxxxxxxxxxx.log.combined...
Done!
Now using the HTML Validator module...

nothing to exclude
        processing #1 http://www.provincia.roma.it/... Valid!
http://xxxxxxxxxxxxxxxx:80/check?uri=http%3A%2F%2Fwww.provincia.roma.it%2F\;output=xml :
HTTP/1.1 200 OK
Connection: close
Date: Mon, 23 Jul 2007 10:11:18 GMT
Server: Apache/2.0.54 (Mandriva Linux/PREFORK-13mdk)
Content-Type: application/xml; charset=UTF-8
Client-Date: Mon, 23 Jul 2007 10:11:19 GMT
Client-Peer: 192.168.251.101:80
Client-Response-Num: 1
X-W3C-Validator-Errors: 0
X-W3C-Validator-Recursion: 1
X-W3C-Validator-Status: Valid

Done!
invalid_census 0

************************************************************************
Results for module HTMLValidator
************************************************************************
Here are the most popular invalid document(s) that I could find in the
logs for xxxxxxxxxxxxxx.

 Rank   Hits   #Error(s)   Address
------ ------ ----------- ---------

I couldn't find any invalid document in this log. Congratulations!
************************************************************************
But that's not true! Only the second log line has been tested.
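
To check by hand whether the validator itself accepts a page with a query string, the request URL can be rebuilt the way the debug output above shows it. This is a minimal Python sketch, assuming the ValidatorHost, ValidatorPort, ValidatorString and ValidatorPostString values from the config dump are simply concatenated around the percent-encoded page URL. build_check_url is a hypothetical helper, the backslash in "\;output=xml" is assumed to be a config escape and is dropped, and the default1.jsp URL on this server name is only an illustration.

```python
from urllib.parse import quote

# Values copied from the config dump in the debug output.
VALIDATOR_HOST = "validator.anthesi.com"
VALIDATOR_PORT = 80
VALIDATOR_STRING = "/check?uri="
# The config shows "\;output=xml"; the backslash is assumed to be an escape.
VALIDATOR_POST_STRING = ";output=xml"


def build_check_url(page_url):
    """Build the validator request URL for one page (hypothetical helper)."""
    # safe="" percent-encodes every reserved character, so the "?" and "="
    # of the page's own query string cannot be confused with the validator's.
    encoded = quote(page_url, safe="")
    return (
        f"http://{VALIDATOR_HOST}:{VALIDATOR_PORT}"
        f"{VALIDATOR_STRING}{encoded}{VALIDATOR_POST_STRING}"
    )


# Illustrative page URL: the query-string page from the first sample log line.
print(build_check_url("http://www.provincia.roma.it/default1.jsp?test=test"))
```

Requesting the printed URL and looking at the X-W3C-Validator-Status header would show whether the validator handles the page, independently of the log processing.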




-- 
__________________

Alessandro Pedrotti


Anthesi s.r.l.
via M.Misone, 14 - Riva del Garda (ITALY)
Tel. +39 0464 553300 Fax. +39 0464 559010

Http://www.anthesi.it - www.isiportal.com

