validator/httpd/cgi-bin check,1.548,1.549

From: Olivier Thereaux via cvs-syncmail <cvsmail@w3.org>
Date: Thu, 26 Jul 2007 02:10:51 +0000
To: www-validator-cvs@w3.org
Message-Id: <E1IDsoV-0002oJ-V0@lionel-hutz.w3.org>

Update of /sources/public/validator/httpd/cgi-bin
In directory hutz:/tmp/cvs-serv10025

Modified Files:
	check 
Log Message:
Change of strategy for the use of XML::LibXML.
Apparently, even loading a local catalog does not stop it from
fetching large numbers of schema and entity files (even in non-validating mode).

Using load_ext_dtd(0) solves the issue, but the parser then complains about undefined entities
=> filtering those errors out at the post-parsing level.
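
As an illustration of the post-parsing filter described above, here is a minimal standalone sketch. It uses the same pattern as the patch, but the sample error records and the helper name filter_entity_errors are hypothetical and are not taken from check itself.

```perl
use strict;
use warnings;

# Hypothetical error records, shaped like the { msg => ... } hashes
# that the well-formedness pass collects. The messages are made up.
my @raw_errors = (
    { line => 12, msg => "Entity 'nbsp' not defined" },
    { line => 15, msg => "Opening and ending tag mismatch: p line 14 and div" },
    { line => 20, msg => "Entity 'eacute' not defined" },
);

# Drop every "Entity '...' not defined" complaint and keep the rest.
# The regex is the one used in the patch.
sub filter_entity_errors {
    my @errors = @_;
    return grep { $_->{msg} !~ /Entity '\w+' not defined/ } @errors;
}

my @xmlwf_error_list = filter_entity_errors(@raw_errors);
printf "line %d: %s\n", $_->{line}, $_->{msg} for @xmlwf_error_list;
```

Because the filter matches on the message text alone, it also hides references to entities that are undefined in every DTD, not only those left unresolved by load_ext_dtd(0).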



Index: check
===================================================================
RCS file: /sources/public/validator/httpd/cgi-bin/check,v
retrieving revision 1.548
retrieving revision 1.549
diff -u -d -r1.548 -r1.549
--- check	25 Jul 2007 17:41:54 -0000	1.548
+++ check	26 Jul 2007 02:10:48 -0000	1.549
@@ -632,8 +632,10 @@
 
   my $xmlparser = XML::LibXML->new();
   $xmlparser->line_numbers(1);
-  # loading the XML catalog for entities resolution
-  $xmlparser->load_catalog( File::Spec->catfile($CFG->{Paths}->{SGML}->{Library}, 'xml.soc') );
+  $xmlparser->validation(0);
+  $xmlparser->load_ext_dtd(0);
+  # [NOT] loading the XML catalog for entities resolution as it seems to cause a lot of unnecessary DTD/entities fetching
+  #$xmlparser->load_catalog( File::Spec->catfile($CFG->{Paths}->{SGML}->{Library}, 'xml.soc') );
   my $xml_string = join"\n",@{$File->{Content}};
   # the XML parser will check the value of encoding attribute in XML declaration
   # so we have to amend it to reflect transcoding. see Bug 4867
@@ -691,7 +693,9 @@
         $err->{type} = "E";
         $err->{msg}  = $xmlwf_error_msg;
 
-        # ...
+        # The validator will sometimes fail to dereference entities files
+        # we're filtering the bogus resulting error
+        next if ($err->{msg} =~ /Entity '\w+' not defined/);
         push (@xmlwf_error_list, $err);
         $xmlwf_error_line = undef;
         $xmlwf_error_col = undef;
@@ -768,7 +772,6 @@
              { name => 'Parse Mode Factor', value => $File->{ModeChoice} },
              { name => 'Parser', value => $parser_name },
              { name => 'Parser Options', value => join " ", @spopt },
-
             ],
            );
    $File->{Templates}->{SOAP}->param(opt_debug => $DEBUG);
Received on Thursday, 26 July 2007 02:10:56 GMT
