
Questions

From: <Charles.VILLEPREUX@oecd.org>
Date: Fri, 3 Aug 2007 10:07:32 +0200
Message-ID: <AB93230B9C912540A4CCE30839864B2607C4399C@EXCHANGEB.main.oecd.org>
To: <html-tidy@w3.org>
Cc: <Pascale.CISSOKHO-MUTTER@oecd.org>, <Marion.DESMARTIN@oecd.org>
Hi,

I am currently working at the OECD in Paris, and for my current project I
need to convert HTML files to XHTML.
I have read on the Internet that "HTML Tidy" is a good tool for this. It was
originally written by Dave Raggett of the World Wide Web Consortium (W3C).
The software is now maintained by a group of volunteers working as an open
source community at SourceForge.

So I downloaded the Windows EXE build dated 24 July 2007:
http://www.paehl.com/open_source/?HTML_Tidy_for_Windows
I created a configuration file and ran Tidy with it.
Command line: tidy -config config.txt index_HTML

The log file "err.txt" reports only warnings.
However, the validator at http://validator.w3.org/ reports errors on the output, such as:

1) required attribute "alt" not specified for the "img" element
	Q1: Why does "HTML Tidy" not create required attributes automatically?

2) required attribute "action" not specified for the "form" element
	
			HTML:
				<form name=switchit><input type=hidden value=0 name=switchedselector></form>
			
			XHTML:
				<form name="switchit" id="switchit"><input type="hidden" value="0" name="switchedselector" /></form>
	
	Q2: Why does HTML Tidy create the attribute id="switchit"?

3) ID X already defined 

				HTML:
					<br />
					<br><br></p><p></p>

					</span></td>
					</tr>

				XHTML:
					<br />
					<br />
					<br />
					<br />
					<br /></span>
					<p><span id="06" style="display: none;"></span></p>
					</td>
					</tr>


	Q3: Why does HTML Tidy create a span element with an id ("06") that
already exists?

4) value of attribute "id" invalid: "0" cannot start a name (For example, id
and name attributes must begin with a letter, not a digit)

				HTML:
					<img src="plusminusimages/01plus.gif" border="0" id="01_image" onclick="javascript:changePlusMinus('01');" style="cursor:pointer;cursor:hand" name="01_image" />
					...
					<span id="01" style="display: none
....

	Q4: Should this be treated as a warning rather than an error? (It does
not seem to cause any problem when browsing the XHTML file.)


5) there is no attribute "width" for the "div" element


				HTML:
					<div id="showhideText" width="100" class="normal">Show all indicators</div>
				XHTML:
					<div id="showhideText" width="100" class="normal">Show all indicators</div>


	Q5: Why does "HTML Tidy" keep the "width" attribute? Why does it not
delete it?
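
To reproduce checks 1) to 4) locally without the online validator, a rough sketch in Python could look like the following. It is only an illustration: the names AttributeAudit and audit are made up for this example, the id pattern is a simplified ASCII-only approximation of the naming rule quoted in 4), and the standard-library parser is lenient, so this does not replace a real validation against the DTD.

```python
from html.parser import HTMLParser
import re

# Simplified, ASCII-only approximation of a valid id: it must not start
# with a digit. This pattern is an assumption made for illustration.
VALID_ID = re.compile(r"[A-Za-z_:][A-Za-z0-9_:.\-]*")


class AttributeAudit(HTMLParser):
    """Collects the four kinds of problems listed in 1) to 4)."""

    def __init__(self):
        super().__init__()
        self.seen_ids = set()
        self.problems = []

    def handle_starttag(self, tag, attrs):
        # Tag and attribute names arrive lowercased from HTMLParser.
        attributes = dict(attrs)
        if tag == "img" and "alt" not in attributes:
            self.problems.append("img without alt")
        if tag == "form" and "action" not in attributes:
            self.problems.append("form without action")
        element_id = attributes.get("id")
        if element_id is not None:
            if element_id in self.seen_ids:
                self.problems.append(f"duplicate id {element_id!r}")
            self.seen_ids.add(element_id)
            if not VALID_ID.fullmatch(element_id):
                self.problems.append(f"invalid id {element_id!r}")


def audit(markup):
    """Return the list of problems found in the given markup string."""
    parser = AttributeAudit()
    parser.feed(markup)
    parser.close()
    return parser.problems


if __name__ == "__main__":
    sample = ('<form name="switchit" id="switchit"><input type="hidden" '
              'value="0" name="switchedselector" /></form>')
    print(audit(sample))  # ['form without action']
```

Run on the Tidy output, a script like this would list the same elements the validator complains about, which makes it easier to see which of them come from the original HTML and which are introduced by the conversion.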


Afterwards I compared the HTML and the XHTML files in Internet Explorer.
I do not understand the following transformations made by "HTML Tidy":

I) - A link does not work anymore.
	
			HTML:
				<a href="javascript:openAll('01','02','03','04','05','06','07','08','09','10','11','12')" class="Text"><div id="showhideText" width="100" class="normal">Show all indicators</div></a>
			
			XHTML:
				<a href="javascript:openAll('01','02','03','04','05','06','07','08','09','10','11','12')" class="Text"></a>
				<div id="showhideText" width="100" class="normal">Show all indicators</div>

	Q6: Why does "HTML Tidy" change the nesting of the tags?


II) - More line breaks.

				HTML:
					&bull; <a class='Text' href='01-02-02.htm'>Elderly population by region</a><br />
					<br><br></p>

					</span></td>
					</tr>

				XHTML:
					&#8226; <a class='Text'
href='01-02-02.htm'>Elderly population by
					region</a><br />
					<br />
					<br />
					<br />
					<br /></span></td>
					</tr>

	Q7: Why does "HTML Tidy" generate additional <br /> elements?
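
To locate every place in the original HTML where the pattern from I) occurs (a block-level element such as <div> sitting inside a link), a small sketch along these lines could be used. The names BlockInLinkFinder and find_blocks_in_links are made up for this example, and the set of block-level tags is an assumed, incomplete subset chosen for illustration.

```python
from html.parser import HTMLParser

# Assumed, simplified subset of block-level element names.
BLOCK_TAGS = {"div", "p", "table", "ul", "ol", "form",
              "h1", "h2", "h3", "h4", "h5", "h6"}


class BlockInLinkFinder(HTMLParser):
    """Records block-level start tags that appear inside an open <a>."""

    def __init__(self):
        super().__init__()
        self.link_depth = 0
        self.hits = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.link_depth += 1
        elif tag in BLOCK_TAGS and self.link_depth > 0:
            line, _column = self.getpos()
            self.hits.append((tag, line))

    def handle_endtag(self, tag):
        if tag == "a" and self.link_depth > 0:
            self.link_depth -= 1


def find_blocks_in_links(markup):
    """Return (tag, line number) pairs for block elements inside links."""
    finder = BlockInLinkFinder()
    finder.feed(markup)
    finder.close()
    return finder.hits
```

For the HTML fragment in I) this reports the <div>, and for the XHTML fragment it reports nothing, since there the <div> follows the closed link.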



See attached HTML, XHTML, log and configuration files:


 <<index_HTML.htm>>  <<index_XHTML.htm>> 

 <<err.txt>>  <<config.txt>> 
Thank you very much for your help. Maybe some parameters need to be added to
the configuration file.

Best regards,


Charles VILLEPREUX
Technical Assistant
R&D 
OECD, PAC
Tel: +33 (0)1 49 10 43 66
charles.villepreux@oecd.org

HTML Tidy is a tool that was originally written by Dave Raggett of the World
Wide Web Consortium (W3C). It is designed to fix mistakes in HTML, tidy up
the layout (hence the name), assist with web accessibility, convert HTML to
XHTML and many other things.
The software is now maintained by a group of volunteers working as an open
source community at SourceForge, and this is the place to go for more
information.
HTML Tidy Documentation:
http://tidy.sourceforge.net/





Received on Friday, 3 August 2007 11:45:35 GMT
