
Re: double quoted attribute value deleted

From: Fred Bone <Fred.Bone@dial.pipex.com>
Date: Mon, 22 Nov 2010 09:39:39 -0000
To: leegold <leegold@speedymail.org>
CC: html-tidy@w3.org
Message-ID: <4CEA3A5B.19032.16CE6B9E@Fred.Bone.dial.pipex.com>

On 20 November 2010 at 13:22, leegold said:

> Hi, 
> 
> Using tidy to clean up an html file before I parse it. I thought Tidy
> would make the process smoother. I was looking for an automated way to
> report and fix for well-formed-ness... I am not an html standards expert
> I'm sure Tidy has good reason for doing the following by default.
> 
> Given a file with content only:  <table class=""datatable""></table>
> I do:  $ tidy  /home/g/Desktop/scrapes/xmlwf2.xml
> I get back:
> <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN">
> <html>
> <head>
> <meta name="generator" content=
> "HTML Tidy for Linux/x86 (vers 7 December 2008), see www.w3.org">
> <title></title>
> </head>
> <body>
> <table class=""></table>
> </body>
> </html>
> 
> So, "datatable" is removed. Why? I ask because tidy removed content here
> and I'm worried about that. Is there a way to make tidy not do that? 

Tidy reads your tag as two separate items attached to <table>:
 class=""    - a valid attribute with an empty value, which Tidy has retained;
 datatable"" - a stray token that is not a valid attribute, which Tidy has removed.
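
To illustrate that split, here is a minimal sketch in Python of how a tolerant attribute scanner might read the inside of that tag. The pattern and the name scan_attributes are my own illustration and are not Tidy's actual parser; the point is only that the first "" closes the class value, so datatable ends up as a separate item with no value.

```python
import re

# Simplified model of attribute scanning (an assumption for illustration,
# not Tidy's real grammar): a name, optionally followed by "=" and a
# double-quoted, single-quoted, or unquoted value.
ATTR = re.compile(r'''([^\s"'=<>/]+)(?:\s*=\s*("[^"]*"|'[^']*'|[^\s"'>]+))?''')

def scan_attributes(tag_body):
    """Return (name, value) pairs; value is None when there is no '='."""
    attrs = []
    for m in ATTR.finditer(tag_body):
        name, raw = m.group(1), m.group(2)
        if raw is not None and raw[:1] in "\"'":
            raw = raw[1:-1]  # strip the surrounding quotes
        attrs.append((name, raw))
    return attrs

print(scan_attributes('class=""datatable""'))
# [('class', ''), ('datatable', None)]
```

In this model the trailing "" after datatable matches nothing and is simply skipped, which is roughly the position Tidy is in when it discards the stray text.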

If datatable is supposed to be a (non-standard) attribute then it needs 
an equals sign separating it from the empty value string (the ""). That 
is, you should put
 <table class="" datatable="">

If it is supposed to be the value of the class attribute then it needs to 
go inside the quotes, like this:
 <table class="datatable">

You can't expect Tidy to guess which of these two you meant.
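
If the doubled quotes turn out to be an escaping artifact (for example from a CSV export) and the text was always meant to be the attribute value, one option is to make that choice yourself before running Tidy. The following is only a sketch under that assumption; undouble_quotes is a hypothetical helper, and it deliberately handles only values without whitespace so that a genuinely empty attribute followed by another attribute is left alone.

```python
import re

# Hypothetical pre-processing step: rewrite  name=""value""  as  name="value".
# The value may not contain whitespace, quotes, "=", "<" or ">", so input
# such as  class="" id=""  is not touched.
DOUBLED = re.compile(r'=""([^\s"=<>]+)""')

def undouble_quotes(html):
    """Collapse doubled quotes around a single-token attribute value."""
    return DOUBLED.sub(r'="\1"', html)

print(undouble_quotes('<table class=""datatable""></table>'))
# <table class="datatable"></table>
```

You would write the result to a file and run tidy on that file instead of the original. If the other reading is the right one, the source needs the explicit datatable="" form instead.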
Received on Monday, 22 November 2010 09:40:37 GMT
