- From: Martian <abigail@mars.ic.iaf.nl>
- Date: Sun, 9 Apr 1995 02:49:42 +0200 (MET DST)
- To: hurleyj@arachnaut.org (Jim Hurley)
- Cc: www-html@www10.w3.org
Once upon a time you, Jim Hurley, wrote:
--> I wrote:
--> >And according to the DTD:
--> >
--> ><!ELEMENT HEAD O O (%head.content)>
--> ><!ELEMENT BODY O O %body.content>
--> >
--> ><!ENTITY % html.content "HEAD, BODY+">
--> >
--> >The first O indicates the opening tag is optional, the second one
--> >indicates the closing tag is optional.
-->
--> Sorry.
-->
--> >Every HTML document must have a head, and I did not say it should not.
--> >All I said is that the <head>, </head> *tags* do not have to be
--> >present, as confirmed by the DTD. (Similar for the <body>, </body>
--> >tags.) Apparently, HTML parsers are smart enough to decide for
--> >themselves what is the head and what is the body.
--> >
--> >--> >Returning an error if it encounters EOF before </head> would be a
--> >--> >major design bug.
--> >-->
--> >--> A major design bug of the HTML document, yes - but these are so
--> >--> commonly encountered.
--> >
--> >Nope, just like </p>, </li>, etc some tags are not required.
--> >
--> >
--> >Abigail
-->
--> But this last part was about encountering a <head> but not getting
--> a matching </head>. Are you saying the <head> is terminated by
--> <body> or some body part?

All I said is that all four tags, <head>, </head>, <body> and </body>,
are optional. One could have a document with just the </head> tag, or
with only <head> and </body>. It is all legal according to the DTD.

So, if you want to grab the head *section* (not the <head> *tag*), you
have to be a little smarter. However, since only a few tags are part of
the head section, it is not difficult. Whenever you encounter anything
that is not enclosed by one of the valid head section tags (like
<title>, </title>), you have reached the body part.

However, the original question asked for a way to get only the head of
a document. That would mean the server has to parse the document
itself, which makes servers more complex and, more importantly, slower.

Abigail
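
The rule described above (the head section ends at the first tag or
piece of text that does not belong to a valid head element) can be
sketched in Python roughly as follows. This is only an illustration,
not a conforming SGML parser: the function names head_end and
split_head are made up for this example, the tokenizer is a simple
regular expression, and the set of head elements is assumed from the
HTML 2.0 DTD (TITLE, ISINDEX, BASE, META, LINK, NEXTID).

```python
import re

# Elements assumed to be allowed in the head (per the HTML 2.0 DTD),
# plus the optional <html> and <head> wrapper tags themselves.
HEAD_TAGS = {"html", "head", "title", "isindex", "base", "meta", "link", "nextid"}

# Very rough tokenizer: a comment, a tag-like "<...>" run, or plain text.
TOKEN = re.compile(r"<!--.*?-->|<[^>]*>|[^<]+", re.DOTALL)
TAG_NAME = re.compile(r"</?\s*([A-Za-z][A-Za-z0-9]*)")


def head_end(doc):
    """Return the offset in doc at which the body section starts."""
    in_title = False
    for m in TOKEN.finditer(doc):
        tok = m.group()
        if tok.startswith("<"):
            if tok.startswith("<!"):
                continue  # comments and the DOCTYPE decide nothing
            name = TAG_NAME.match(tok)
            tag = name.group(1).lower() if name else ""
            if tag not in HEAD_TAGS:
                return m.start()  # first non-head tag: the body begins here
            if tag == "title":
                in_title = not tok.startswith("</")
        elif not in_title and tok.strip():
            # Text outside <title> is body content; skip leading whitespace.
            return m.start() + len(tok) - len(tok.lstrip())
    return len(doc)  # the whole document is head


def split_head(doc):
    """Split doc into (head section, body section)."""
    end = head_end(doc)
    return doc[:end], doc[end:]
```

For instance, split_head("<head><title>T</title>Just text</body>")
puts the cut right before "Just text", even though there is neither a
</head> nor a <body> tag in the document.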
Received on Saturday, 8 April 1995 21:51:27 UTC