RE: More Questions

Thank you for your quick answer.
I open the document with Internet Explorer.
Please open the JPG to see what is displayed on the screen.
Regards

-----Original Message-----
From: Bjoern Hoehrmann [mailto:derhoermi@gmx.net] 
Sent: 03 September, 2007 7:17 PM
To: VILLEPREUX Charles, PAC/PS
Cc: html-tidy@w3.org; CISSOKHO MUTTER Pascale, PAC/PM; DESMARTIN Marion,
PAC/PROD; CROWTHER Joseph, PAC/PS
Subject: Re: More Questions


* <Charles.VILLEPREUX@oecd.org> wrote:
>Using "HTML Tidy", I would like to have explanation concerning:
>
>1°) Why "HTML Tidy" does not manage properly the <sub> element within 
>the <title> element ?
>
>HTML:
><title>SourceOECD: Factbook 2007 - Emissions of carbon dioxide
>(CO<sub>2</sub>)</title>
>
>XHTML:
>
><title>SourceOECD: Factbook 2007 - Emissions of carbon dioxide 
>(CO</title> </head>
><body>
><sub>2</sub>)  

You cannot use elements in the <title> element; you have to use plain text.
For cases like <sub>2</sub> Unicode has characters that can be used (subject
to limitations of the browser, window manager, fonts); in this case, you
might be able to use

  <title>SourceOECD: Factbook 2007 - Emissions
         of carbon dioxide (CO&#x2082;)</title>

Where &#x2082; refers to the character U+2082 SUBSCRIPT TWO.
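
As a quick check of what that reference resolves to, here is a minimal
Python sketch using only the standard library. The variable names are
illustrative and not part of Tidy or the original page.

```python
import html
import unicodedata

# Resolve the numeric character reference to the character it denotes.
char = html.unescape("&#x2082;")
print(f"U+{ord(char):04X}", unicodedata.name(char))

# The title stays plain text: no markup, just one extra Unicode character.
title = html.unescape("Emissions of carbon dioxide (CO&#x2082;)")
print(title)
```

Whether the subscript actually renders in the window title still depends on
the browser, window manager and fonts.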

>2°)  Why "HTML Tidy" does not convert &bull; or &#149 character ?
>
>My configuration for the encoding process: Input-encoding:ascii ; 
>Output-encoding:utf8 When I open the XHTML page, the bullet does not 
>appear.

The character reference &#149; should not be used in HTML documents; it
relies on a Microsoft Windows-specific mapping. &bull; would be correct. If
it does not show up properly, you might have a problem with the character
encoding declaration or fonts. Where do you open the document, and what is
shown in its place?
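
The difference between the two references, and one way a wrong encoding
declaration can hide the bullet, can be sketched in Python with the standard
library. This is an illustration only; the variable names are made up, and
it does not reproduce what Tidy or Internet Explorer do internally.

```python
import html
import unicodedata

# Code point 149 (0x95) is a C1 control character in Unicode, not a bullet.
raw = chr(149)
print(unicodedata.category(raw))  # "Cc" means control character

# In Windows-1252, byte 0x95 is the bullet; this is the mapping &#149; relies on.
from_cp1252 = bytes([0x95]).decode("cp1252")

# The named reference resolves to the real bullet character, U+2022.
bullet = html.unescape("&bull;")
print(f"U+{ord(bullet):04X}", unicodedata.name(bullet))
print(from_cp1252 == bullet)

# With UTF-8 output the bullet becomes three bytes. If the page is then read
# as Windows-1252 (wrong encoding declaration), those bytes show as mojibake.
utf8_bytes = bullet.encode("utf-8")
misread = utf8_bytes.decode("cp1252")
print(utf8_bytes, repr(misread))
```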

>> 1°) required attribute "alt" not specified for the "img" element
>> 	Q1: Why "HTML Tidy" do not create automatically required attributes 
>> ?

The contents of the alt attribute should replace the image if the user did
not download the image, or cannot access it for other reasons (e.g., because
they are blind). Tidy cannot know what an appropriate alternate text would
be, and so does not add it. There is an alt-text option to fill in a default,
but its use is not recommended.

>> 2°) required attribute "action" not specified for the "form" element
>> 	
>> 			HTML:
>> 				<form name=switchit><input type=hidden
>> value=0 name=switchedselector></form>
>> 			
>> 			XHTML:
>> 				<form name="switchit" id="switchit"><input
>> type="hidden" value="0" name="switchedselector" /></form>
>> 	
>> 	Q2: Why does HTML Tidy create the attribute id="switchit" ?

Because in XHTML the name attribute on the form element might not be
recognized as identifying the form. The XHTML specification recommends
using both the name and id attributes for this reason.

>> 3°) ID X already defined
>> 
>> 			HTML:
>> 				<br />
>> 				<br><br></p><p></p>
>> 
>> 				</span></td>
>> 				</tr>
>> 
>> 			XHTML:
>> 				<br />
>> 				<br />
>> 				<br />
>> 				<br />
>> 				<br /></span>
>> 				<p><span id="06" style="display:
>> none;"></span></p>
>> 				</td>
>> 				</tr>
>> 
>> 
>> 	Q3: Why does HTML Tidy create a span element with an id (06) which 
>> already exists ?

This would depend on the outer structure of the document; generally it would
do so in order to fix some other error in the document.

>> 4) value of attribute "id" invalid: "0" cannot start a name (For 
>> example, id and name attributes must begin with a letter, not a 
>> digit)
>> 
>> 			HTML
>> 				<img src="plusminusimages/01plus.gif"
>> border="0" id="01_image" onclick="javascript:changePlusMinus('01');"
>> style="cursor:pointer;cursor:hand" name="01_image" />
>> 				...
>> 				<span id="01" style="display: none ...
>> 
>> 	Q4: Do you think it is more a warning than an error ? (It does not 
>> seem to provoke any problem when browsing the XHTML file ...)

Browsers might tolerate the error, but the HTML and XHTML specifications
prohibit the use of IDs that start with a digit. Generally, the workaround
is to use a short prefix like id="x01_image".
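
If many pages are affected, the prefixing could be scripted. The following
Python sketch is only an illustration: prefix_numeric_ids and ATTR_RE are
hypothetical names, the regular expression handles double-quoted values
only, and it is not a feature of Tidy.

```python
import re

# Hypothetical pattern: an id or name attribute (not data-id etc.) whose
# double-quoted value starts with a digit.
ATTR_RE = re.compile(r'(?<![\w-])(id|name)="(\d[^"]*)"')


def prefix_numeric_ids(markup: str, prefix: str = "x") -> str:
    """Prepend `prefix` to id/name values that begin with a digit."""
    return ATTR_RE.sub(
        lambda m: f'{m.group(1)}="{prefix}{m.group(2)}"', markup
    )


sample = '<img id="01_image" name="01_image" /><span id="01"></span>'
print(prefix_numeric_ids(sample))
```

Any script that refers to the old values, such as the changePlusMinus('01')
call in the onclick handler, would have to be updated to match.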

>> 5°) there is no attribute "width" for the "div" element
>> 
>> 
>> 			HTML:
>> 				<div id="showhideText" width="100"
>> class="normal">Show all indicators</div>
>> 			XHTML:
>> 				<div id="showhideText" width="100"
>> class="normal">Show all indicators</div>
>> 
>> 
>> 	Q5: Why does "HTML Tidy" keep the "width" attribute ? Why not delete 
>> it ?

Tidy does not know how to handle it; in some cases it might make sense to
remove it, in other cases it might be best to keep it. You can remove such
attributes using --drop-proprietary-attributes, if I remember the name
correctly; see the manual on http://tidy.sf.net/ for details.

>> Afterwards I have checked with Internet Explorer browser the 
>> difference between the HTML and the XHTML
>> I do not understand these following transformations done by "HTML Tidy":
>> 
>> I) - A link does not work anymore.
>> 	
>> 			HTML:
>> 				<a 
>> href="javascript:openAll('01','02','03','04','05','06','07','08','09'
>> ,'10',
>> '11','12')" class="Text"><div id="showhideText" width="100"
>> class="normal">Show all indicators</div></a>
>> 			
>> 			XHTML:
>> 				<a 
>> href="javascript:openAll('01','02','03','04','05','06','07','08','09'
>> ,'10',
>> '11','12')" class="Text"></a>
>> 				<div id="showhideText" width="100"
>> class="normal">Show all indicators</div>
>> 
>> 	Q6: Why does "HTML Tidy" change the imbrication of tags ?

The HTML and XHTML specifications do not allow nesting <div> inside the <a>
element. You have to use some other element (e.g. <span>) and use CSS if
block formatting is required (e.g., <span style='display:block'> in place of
the <div ...>).

>> II) - More line breaks.
>> 
>> 			HTML:
>> 				&bull; <a class='Text'
>> href='01-02-02.htm'>Elderly population by region</a><br />
>> 				<br><br></p>
>> 
>> 				</span></td>
>> 				</tr>
>> 
>> 			XHTML:
>> 				&#8226; <a class='Text'
>> href='01-02-02.htm'>Elderly population by
>> 				region</a><br />
>> 				<br />
>> 				<br />
>> 				<br />
>> 				<br /></span></td>
>> 				</tr>
>> 
>> 	Q7: Why does "HTML Tidy" generate additional <br /> elements ?

My guess here is that there is no opening <p> that matches the closing </p>,
so Tidy turns the latter into <p></p> and then replaces it with two <br>
elements.

I hope this helps,
-- 
Björn Höhrmann · mailto:bjoern@hoehrmann.de · http://bjoern.hoehrmann.de
Weinh. Str. 22 · Telefon: +49(0)621/4309674 · http://www.bjoernsworld.de
68309 Mannheim · PGP Pub. KeyID: 0xA4357E78 · http://www.websitedev.de/ 

Received on Tuesday, 4 September 2007 07:27:29 UTC