double quoted attribute value deleted

Hi, 

I'm using Tidy to clean up an HTML file before I parse it. I thought Tidy would
make the process smoother: I was looking for an automated way to report and
fix well-formedness problems. I am not an HTML standards expert, and I'm sure
Tidy has a good reason for doing the following by default.

Given a file containing only:  <table class=""datatable""></table>
I run:  $ tidy /home/g/Desktop/scrapes/xmlwf2.xml
I get back:
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN">
<html>
<head>
<meta name="generator" content=
"HTML Tidy for Linux/x86 (vers 7 December 2008), see www.w3.org">
<title></title>
</head>
<body>
<table class=""></table>
</body>
</html>

So "datatable" is removed. Why? I ask because Tidy removed content here, and
that worries me. Is there a way to make Tidy not do that?
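
In case it helps frame the question, here is a rough sketch of a workaround I
could apply before running Tidy: collapse the doubled quotes myself so the
value is no longer at risk. This is only an assumption about how to sidestep
the problem, not something taken from Tidy. The helper name and the regex are
hypothetical, and the pattern only handles values with no whitespace, quotes,
"=" or angle brackets in them.

```python
import re

# Hypothetical pre-processing step (not part of Tidy): rewrite
# attr=""value"" as attr="value" before the file is handed to Tidy.
# Assumption: the value contains no whitespace, quotes, '=' or angle
# brackets, so genuinely empty attributes such as class="" are untouched.
DOUBLED_QUOTES = re.compile(r'=""([^"<>\s=]+)""')


def collapse_doubled_quotes(html: str) -> str:
    """Return html with doubled-quote attribute values collapsed."""
    return DOUBLED_QUOTES.sub(r'="\1"', html)


if __name__ == "__main__":
    print(collapse_doubled_quotes('<table class=""datatable""></table>'))
    # prints: <table class="datatable"></table>
```

I would still rather have Tidy report the problem than silently patch the
input like this, since a regex cannot tell a typo from intended markup.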

Also, when I parse, I need all the anchors and "signs along the road" I can
get as flags, if you know what I mean...

Thanks,

Lee G.



Received on Monday, 22 November 2010 07:37:13 UTC