RE: W3 validator gives Different validation results when validated through web service soap12 (in javascript using XMLHttpRequest) against online over the validator.w3.org site

From: <mahavir.patil@accenture.com>
Date: Fri, 5 Aug 2011 12:21:09 +0530
To: <iecustomizer@hotmail.com>, <www-validator@w3.org>
Message-ID: <6B68EAE6E593704C828DD7E2A96F8A5DBBEEFEE0CB@INDXM3111.dir.svc.accenture.com>
Hi,

Thanks for the reply.

Maybe I was not clear enough when explaining my problem. Let me try to explain it in more detail.

I am not trying to send the entire page's HTML to the W3C validator service. Instead, I have a segment of HTML written in a control on the page, and that control has a button. The button calls a JavaScript function that sends the HTML content written in the control to the W3C validator service for validation. Here is the JavaScript code that does this:

var encodedContent = getRadWindow().ClientParameters;   // this fetches the HTML content written in the control
xmlHttp = CreateXMLHttpRequest();
var params = "fragment=" + encodedContent + "&output=soap12";   // I am specifying the output to be soap12
xmlHttp.open("POST", "http://validator.w3.org/check", false);
xmlHttp.setRequestHeader("Content-length", params.length);
xmlHttp.send(params);


As is clear from the above code, I am requesting a SOAP 1.2 response from the W3C service.

Now the problem is this: if I take the sample HTML content that I pass through the JavaScript method above and try it on validator.w3.org as direct input, the number of errors and warnings I get is different from what I get in the SOAP response from the JavaScript method.

So there is no confusion about which HTML content is being sent in either case: I am not sending the page's HTML but a section of HTML explicitly written in a control on the page. The content is the same; only the method of validation differs. One is a programmatic (JavaScript) web service call with a SOAP 1.2 response, and the other is a manual operation where I submit the content as direct input on the validator.w3.org site.

So I want to understand whether I am making a mistake in constructing the web service call in the above JavaScript method (do I need to change it, for example by adding another parameter to the request?) or whether there is a problem with the W3C validator service itself, which sends different results for the online service and the programmatic SOAP 1.2 request.
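
One thing that may be worth checking in the request above, offered as a sketch and not as a diagnosis: the snippet concatenates the content straight into the POST body and sets no form Content-Type header. If the content is not already percent-encoded, any "&", "=", "+" or "%" inside the HTML would be read as form delimiters or escapes, so the service could see a different or truncated fragment. The helper names buildValidatorParams and sendToValidator below are hypothetical, and the extra doctype parameter in the usage note is an assumption to be confirmed against the validator's API documentation.

```javascript
// Hypothetical helper: builds an application/x-www-form-urlencoded body.
// Assumes `fragment` is the raw (not yet percent-encoded) HTML string.
function buildValidatorParams(fragment, extra) {
    var pairs = [
        'fragment=' + encodeURIComponent(fragment),
        'output=soap12'
    ];
    // Optional extra parameters, each encoded the same way.
    for (var key in extra) {
        if (Object.prototype.hasOwnProperty.call(extra, key)) {
            pairs.push(encodeURIComponent(key) + '=' + encodeURIComponent(extra[key]));
        }
    }
    return pairs.join('&');
}

// Hypothetical browser-side caller (defined here, not invoked).
function sendToValidator(fragment) {
    var xhr = new XMLHttpRequest();
    xhr.open('POST', 'http://validator.w3.org/check', false);
    // Tell the server the body is form-encoded. Content-Length is left
    // to the browser, which computes it from the body.
    xhr.setRequestHeader('Content-Type', 'application/x-www-form-urlencoded');
    xhr.send(buildValidatorParams(fragment));
    return xhr.responseText;
}
```

For example, buildValidatorParams('<p>a&b</p>', { doctype: 'XHTML 1.0 Strict' }) keeps the "&" inside the fragment value instead of letting it split the body into an extra field.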


For example, I have tried a sample HTML fragment, which you can find in the attached text file.

When I try it with the JavaScript method above, I get a SOAP response with the data below (I have scanned the response and structured the data in a readable format):

7 errors and 3 warnings!
Errors:-
1) at line No 1 : character "R" not allowed in prolog
RadEditor for ASP.NET AJAX
2) at line No 2 : character "</" not allowed in prolog
...r<FONT color=#4f6128> </FONT>is not simply an HTML<A href="#HT...
3) at line No 2 : character "E" not allowed in prolog
...f="#HTMLDescription"><SUP>1</SUP></A> Editor. It is what Microsoft chose to us...
4) at line No 2 : character "a" not allowed in prolog
...chNet, <STRONG>MCMS</STRONG> and even as an alternative to the defaul...
5) at line No 2 : character "o" not allowed in prolog
...he same: clean <STRONG>XHTML</STRONG> output, fast rendering, <STRONG>widest c...
6) at line No 3 : no document type declaration; will parse without validation
<UL style="WIDTH: 350px
from our FAQ, or use the validator's Document Type option to validate your document against a specific Document Type.
7) at line No 3 : an entity end in a literal must terminate an entity referenced in the same literal
<UL style="WIDTH: 350px

Warnings:-
1) Unable to Determine Parse Mode!
2) No Character encoding declared at document level
3) Using Direct Input mode: UTF-8 character encoding assumed

And when I try the same content on the validator.w3.org site as direct input, I get 21 errors and 7 warnings. You can try the content on the site.

Please help me understand why the results differ.

Thanks in Advance.
  - Mahavir
________________________________
From: Rob^_^ [mailto:iecustomizer@hotmail.com]
Sent: Friday, August 05, 2011 7:43 AM
To: Patil, Mahavir
Subject: Re: W3 validator gives Different validation results when validated through web service soap12 (in javascript using XMLHttpRequest) against online over the validator.w3.org site

Hi Mahavir,

The 'source' is read from the server. The actual page source may contain document.write or DOM/js statements which alter the 'computed' markup.

You would extract the computed markup with
document.documentElement.innerHTML
(However, different browsers sanitize the computed source differently. You should normalize the case of your tags and, where XHTML syntax is used, you should ensure that self-closing tags returned in .innerHTML are closed.)

Only IE9 returns document.doctype. I use this for IE browsers to extract the doctype:


// The wrapper name getDoctype is illustrative; the original snippet omitted its function header.
function getDoctype(oDoc) {
    var res;
    if (oDoc.documentMode && oDoc.documentMode >= 9) {
        // IE9 and later expose document.doctype.
        if (oDoc.doctype === null) { return ''; }
        res = '<!DOCTYPE ' + oDoc.doctype.name;
        if (oDoc.doctype.publicId != null) { res += ' PUBLIC "' + oDoc.doctype.publicId + '"'; }
        if (oDoc.doctype.systemId != null) { res += ' "' + oDoc.doctype.systemId + '"'; }
        res += '>';
        return res;
    }
    // Older IE modes, and browsers without documentMode: the doctype shows up
    // as a comment node (nodeType 8) at the start of document.all.
    if (oDoc.all[0].nodeType == 8) { return '<!' + oDoc.all[0].nodeValue + '>'; }
    return '';
}
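
The snippet above tests publicId and systemId against null. In DOM implementations that follow the current DOM standard, a doctype without identifiers reports them as empty strings instead, so those checks would emit empty quoted identifiers. A minimal sketch of a serializer that treats both null and the empty string as "absent" might look like this; the name serializeDoctype is hypothetical, and it takes any object with name, publicId and systemId properties so it can be tried outside a browser.

```javascript
// Hypothetical helper: turns a doctype-like object into a DOCTYPE string.
// Accepts null (no doctype) and treats null or '' identifiers as absent.
function serializeDoctype(doctype) {
    if (!doctype) { return ''; }
    var res = '<!DOCTYPE ' + doctype.name;
    if (doctype.publicId) {
        res += ' PUBLIC "' + doctype.publicId + '"';
        if (doctype.systemId) { res += ' "' + doctype.systemId + '"'; }
    } else if (doctype.systemId) {
        // A system identifier without a public one uses the SYSTEM keyword.
        res += ' SYSTEM "' + doctype.systemId + '"';
    }
    return res + '>';
}
```

In a browser this would be called as serializeDoctype(document.doctype) and the result prepended to the computed markup before sending it to the validator.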



To see computed source test with something like

<html>

<head><title>test</title></head><body><script type="text/javascript">document.write('<span>hello world<\/span>');</script></body>

</html>

you should see that the computed source is

<html>

<head><title>test</title></head><body><script type="text/javascript">document.write('<span>hello world<\/span>');</script><span>hello world</span></body>

</html>

From: mahavir.patil@accenture.com
Sent: Thursday, August 04, 2011 8:47 PM
To: www-validator@w3.org
Subject: W3 validator gives Different validation results when validated through web service soap12 (in javascript using XMLHttpRequest) against online over the validator.w3.org site

Hi,

I am using the W3C HTML validator service to validate my HTML content. I am doing it programmatically by calling the service (soap12 response) through a JavaScript XMLHttpRequest, using the code below:

var encodedContent = '<some html content>';
xmlHttp = CreateXMLHttpRequest();
var params = "fragment=" + encodedContent;
xmlHttp.open("POST", "http://validator.w3.org/check", false);
xmlHttp.setRequestHeader("Content-length", params.length);
xmlHttp.send(params);

What I am finding is that, for the same HTML content, the validation results I get in the SOAP response from the service are different from what I get on validator.w3.org when I pass the same HTML content as direct input on the site. The two report different numbers of errors and warnings. Some are common, but the difference is sometimes as large as the SOAP call giving 2 errors and the site giving 20.

Any hint?

Regards,
Mahavir


Received on Friday, 5 August 2011 06:52:17 GMT
