- From: Rajeev J <jrajeev@firstam.com>
- Date: Fri, 18 Jul 2003 17:45:38 +0530
- To: html-tidy@w3c.org
- Message-ID: <007201c34d26$4e6d7c80$4297a8c0@blr.res.firstam.com>
HI
There have been a lot of discussion on this in the mailing list, but i
could find the solution for my problem, where as i got the same issue
been raised by other user, but since i could not locate the answer , i
am posting it again,Sorry for the trouble.
Basically i am trying to convert HTML to XML and then read the XML
document through XPath
for which i am using Tidy and below is the code (Java script)which i use
for that
========================================================================
============
WshShell = new ActiveXObject("WScript.Shell");
var xmlhttp = new ActiveXObject("MSXML2.ServerXMLHTTP");
xmlhttp.open("GET", "http://www.rediff.com", false ); // , strUserID,
strPassword );
xmlhttp.send();
var tidyObj = new ActiveXObject( "TidyCOM.TidyObject" );
tidyObj.Options.OutputXML = true;
tidyObj.Options.QuoteNbsp = false;
tidyObj.Options.QuoteAmpersand = true;
tidyObj.Options.Clean = true;
tidyObj.Options.DropFontTags = true;
tidyObj.Options.CharEncoding = 2; // Latin 1 = ISO-8859-1
// Causes parsing errors to be logged, (very useful during development)
tidyObj.Options.ErrorFile = 'tidy.err';
strReturn = tidyObj.TidyMemToMem( xmlhttp.responseText) ;
WshShell.Popup( strReturn);
var ndFormInputs = strReturn.selectNodes(
'//form//input|//form//select' );
for( i = 0; i < ndFormInputs.length; i++ ) {
Wshshell.Popup( ndFormInputs(i).nodeName )
}
========================================================================
====================
but when i get the response in XMl i only see a meta tag added to normal
HTML code and i would be able to read
data from that
Here is want I want to do:
html (url)-> xml -> xsl or xpath -> xml (DOM or file)
Please provide me links or support on this
your help would highly appreciated.
Regards,
Rajeev
Attachments
- image/bmp attachment: r.bmp
Received on Friday, 18 July 2003 08:18:42 UTC