Invalid HTML inputs

Hi all,
    I've noticed that a couple of the HTML inputs from the test suite are 
invalid. Their title elements are either in the wrong place 
(withintext3html.html) or missing completely (domain3html.html). This 
causes my parser to tidy the title elements and insert them where 
missing (which causes my outputs to be non-conformant):
*withintext3html.html*:
/html    withinText="no"
/html/head[1]    withinText="no"
/html/head[1]/meta[1]    withinText="no"
/html/head[1]/meta[1]/@charset
/html/head[1]/link[1]    withinText="no"
/html/head[1]/link[1]/@href
/html/head[1]/link[1]/@rel
*/html/head[1]/title[1]    withinText="no"*
/html/body[1]    withinText="no"
/html/body[1]/section[1]    withinText="no"
/html/body[1]/section[1]/span[1]    withinText="no"
/html/body[1]/section[1]/span[1]/@itemref
/html/body[1]/section[1]/span[2]    withinText="no"
/html/body[1]/section[1]/span[2]/@itemref
/html/body[1]/p[1]    withinText="no"
/html/body[1]/p[1]/cite[1]    withinText="nested"
/html/body[1]/p[1]/span[1]    withinText="yes"
/html/body[1]/p[1]/span[1]/@itemref
/html/body[1]/p[2]    withinText="no"
/html/body[1]/p[2]/img[1]    withinText="yes"
/html/body[1]/p[2]/img[1]/@alt
/html/body[1]/p[2]/img[1]/@src
/html/body[1]/p[3]    withinText="no"
/html/body[1]/p[3]/em[1]    withinText="yes"

*domain3html.html*:
/html
/html/head[1]
/html/head[1]/meta[1]
/html/head[1]/meta[1]/@charset
/html/head[1]/link[1]
/html/head[1]/link[1]/@href
/html/head[1]/link[1]/@rel
/html/head[1]/meta[2]
/html/head[1]/meta[2]/@content
/html/head[1]/meta[2]/@name
*/html/head[1]/title[1]*
/html/body[1]    domains="law"
/html/body[1]/p[1]    domains="law"
/html/body[1]/p[2]    domains="law"

The spec for html:
http://www.w3.org/html/wg/drafts/html/master/single-page.html#the-title-element

http://www.w3.org/html/wg/drafts/html/master/single-page.html#the-head-element

from spec:

Note: The title element is a required child in most situations, but when 
a higher-level protocol provides title information, e.g. in the Subject 
line of an e-mail when HTML is used as an e-mail authoring format, the 
title element can be omitted.


example

<!DOCTYPE HTML>
<HTML>
  <HEAD>
   <META CHARSET="UTF-8">
   <BASE HREF="http://www.example.com/">
   <TITLE>An application with a long head</TITLE>
   <LINK REL="STYLESHEET" HREF="default.css">
   <LINK REL="STYLESHEET ALTERNATE" HREF="big.css" TITLE="Big Text">
   <SCRIPT SRC="support.js"></SCRIPT>
   <META NAME="APPLICATION-NAME" CONTENT="Long headed application">
  </HEAD>
  <BODY>
...

withintext3html.html and domain3html.html are the only invalid files I've come across so far. I'm only up to the domain tests at the moment, so there could be more of them.
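
A rough way to look for other affected inputs is to scan the markup as written, without any tidying, and report files whose title is missing or sits outside head. The following is only a minimal sketch using Python's standard html.parser module. The names TitleCheck and title_problem are made up for illustration, and it assumes each input has explicit <head> and </head> tags. It would also flag an SVG title element in the body, so treat its output as a hint rather than a validator result.

```python
from html.parser import HTMLParser


class TitleCheck(HTMLParser):
    """Count <title> start tags inside and outside an explicit <head>."""

    def __init__(self):
        super().__init__()
        self.in_head = False
        self.titles_in_head = 0
        self.titles_elsewhere = 0

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names, so <TITLE> and <title> both match.
        if tag == "head":
            self.in_head = True
        elif tag == "title":
            if self.in_head:
                self.titles_in_head += 1
            else:
                self.titles_elsewhere += 1

    def handle_endtag(self, tag):
        if tag == "head":
            self.in_head = False


def title_problem(markup):
    """Return a short description of the title problem, or None if none is found."""
    checker = TitleCheck()
    checker.feed(markup)
    checker.close()
    if checker.titles_elsewhere:
        return "title outside head"
    if checker.titles_in_head == 0:
        return "title missing"
    if checker.titles_in_head > 1:
        return "more than one title"
    return None
```

To use it, read each test input as text and pass the contents to title_problem; any file for which it returns something other than None is worth checking by hand or with a real validator.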
Thanks,
Philip

Received on Friday, 8 March 2013 11:42:52 UTC