
validator/httpd/cgi-bin check,1.751,1.752

From: Ville Skytta via cvs-syncmail <cvsmail@w3.org>
Date: Mon, 14 Dec 2009 20:44:51 +0000
To: www-validator-cvs@w3.org
Message-Id: <E1NKHml-0007Pn-FX@lionel-hutz.w3.org>
Update of /sources/public/validator/httpd/cgi-bin
In directory hutz:/tmp/cvs-serv28065/httpd/cgi-bin

Modified Files:
	check 
Log Message:
Make LibXML transcoding-passing regex stricter and more readable.


Index: check
===================================================================
RCS file: /sources/public/validator/httpd/cgi-bin/check,v
retrieving revision 1.751
retrieving revision 1.752
diff -u -d -r1.751 -r1.752
--- check	12 Dec 2009 20:06:36 -0000	1.751
+++ check	14 Dec 2009 20:44:49 -0000	1.752
@@ -619,9 +619,14 @@
         # the XML parser will check the value of encoding attribute in XML
         # declaration so we have to amend it to reflect transcoding.
         # see Bug 4867
-        $xml_string =~ s/(<\?xml.*)
-  (encoding[\x20|\x09|\x0D|\x0A]*=[\x20|\x09|\x0D|\x0A]*(?:"[A-Za-z][a-zA-Z0-9_-]+"|'[A-Za-z][a-zA-Z0-9_-]+'))
-  (.*\?>)/$1encoding="utf-8"$3/sx;
+        $xml_string =~ s/
+               (^<\?xml\b[^>]*[\x20\x09\x0D\x0A])
+               (encoding[\x20\x09\x0D\x0A]*=[\x20\x09\x0D\x0A]*
+                   (?:(["'])[A-Za-z][a-zA-Z0-9_-]+\3)
+               )
+               ([^>].*\?>)
+           /$1encoding="UTF-8"$4/sx;
+
         eval { $xmlparser->parse_string($xml_string); };
         $xml_string = undef;
         my $xml_parse_errors_line = undef;
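
To illustrate what the stricter expression accepts, here is a minimal standalone sketch. It copies the new substitution from revision 1.752 into a helper named amend_xml_encoding; that name and the sample strings are assumptions made for illustration and do not appear in the validator source. Compared with the old pattern, the declaration must now start at the beginning of the string, the bracketed whitespace classes no longer contain a literal "|", and a single backreference enforces matching quotes around the encoding name.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper, not part of the validator: applies the substitution
# from revision 1.752 to a copy of its argument and returns the result.
# Group 1 is the start of the XML declaration up to the whitespace before
# "encoding"; group 2 is the encoding pseudo-attribute; group 3 is the quote
# character, which must be the same on both sides; group 4 is the remainder
# of the declaration through the closing "?>".
sub amend_xml_encoding {
    my ($xml_string) = @_;
    $xml_string =~ s/
           (^<\?xml\b[^>]*[\x20\x09\x0D\x0A])
           (encoding[\x20\x09\x0D\x0A]*=[\x20\x09\x0D\x0A]*
               (?:(["'])[A-Za-z][a-zA-Z0-9_-]+\3)
           )
           ([^>].*\?>)
       /$1encoding="UTF-8"$4/sx;
    return $xml_string;
}

# Illustrative inputs (assumptions, not taken from the validator test suite).
my @samples = (
    q{<?xml version="1.0" encoding="iso-8859-1" standalone="yes"?><root/>},
    q{<?xml version='1.0' encoding='ISO-8859-1' ?><doc/>},
    q{<?xml version="1.0" encoding="utf-16' ?><doc/>},    # mismatched quotes
    q{ <?xml version="1.0" encoding="x1" ?><doc/>},       # leading space
);

for my $sample (@samples) {
    print amend_xml_encoding($sample), "\n";
}
```

The first two samples should come back with encoding="UTF-8"; the last two should be returned unchanged, because of the mismatched quotes and the leading space respectively. One thing worth checking when reusing the pattern: the final group requires at least one character other than ">" between the closing quote and "?>", so a declaration that ends with encoding="..."?> (no whitespace before "?>") appears to be left untouched unless a later "?>" occurs in the document.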
Received on Monday, 14 December 2009 20:44:53 GMT
