- From: David Bryant <davidbryant@att.net>
- Date: Mon, 03 Mar 2003 11:25:18 -0700
- To: www-validator@w3.org
Summary: The string </ is always interpreted as the start of an HTML end tag, even when it's inside a scripted string constant.

-----------------------------------------------------------

Hi! I'm new to this list. I live in Denver, Colorado. I've been using the markup validation service extensively to check my HTML coding. You can see my pages at http://davidbryant.home.att.net if you want to.

Every page on my site starts with

    <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

Anyway, I was recently checking a page entitled "prog105.htm" which you can locate on my site, and I got a number of errors back that really should not have been errors.

This page displays inside a frameset. I'm using the bottom 16% of the screen as a display area for definitions, and I use JavaScript with the document.write method to display the requested definition for certain words in that bottom frame. The information to be displayed is (was) encoded like this:

    <script ...>
    ... some code omitted ...
    var dfnhead = '<font size="2" face="Arial,Helvetica,sans-serif"><b>';
    var dfntail = '<br></font>'
    var dfn = new Array() // definitions;
    dfn[0] = "Assembly Language:</b> A computer language consisting of <i>mnemonics</i> that translate directly into individual machine instructions, <i>macro instructions</i> that generate one or many individual assembly language statements, and <i>assembler directives</i> that control the assembly process.";
    ... (more array entries) ...

and the logic to display the requested definition looks like this:

    top.frames[1].write(dfnhead + dfn[x] + dfntail);
    ... more code omitted ...
    </script>

where x is a variable passed at the time this particular function is invoked (when the user clicks on a link).

So anyway the page was displaying beautifully, but it would not validate on your service because the parser identified the </b> and </i> strings inside the JavaScript code quoted above as being unmatched HTML end tags.
But it didn't complain about the <font ...> or the </font> tags embedded inside the variables dfnhead and dfntail. And it apparently didn't even see any of the <i> tags, because all of those are paired up perfectly with the </i> tags in the definitions the way I coded them, and it still said all the </i> tags were unmatched.

I don't need advice on how to fix this. I already fooled the parser by breaking up my simple long strings into several shorter strings and then concatenating them later with my JavaScript code. Now the page validates OK.

But I still think the parser contains a bug. Why does the parser look at stuff in between <script> and </script> tags at all? There isn't going to be any real HTML in between those two tags, anyway. And if there is some HTML embedded inside scripted code, your parser would have to emulate the scripting interpreter and actually generate all the possible strings of output that eventually become HTML when a user interacts with the script before it could perform a real validation check. That sounds like a very tall order, especially considering that there are several different scripting languages in use today.

Just curious. Thanks for your time!

dcb
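
For anyone reading this in the archive, here is a rough sketch of what the string-splitting workaround mentioned above might look like. It is only an illustration, not the actual code from prog105.htm: the definition text is shortened, the function name buildDefinition and the variable escaped are made up for this example, and the call that writes into the bottom frame is left out so the snippet can run on its own.

```javascript
// Illustrative sketch only; buildDefinition and escaped are hypothetical
// names, and the definition text is shortened from the original page.
var dfnhead = '<font size="2" face="Arial,Helvetica,sans-serif"><b>';
var dfntail = '<br><' + '/font>';

// Each end tag is split so that "<" and "/" are never adjacent in the
// page source; the concatenation puts them back together at run time.
var dfn = new Array();
dfn[0] = "Assembly Language:<" + "/b> A computer language consisting of " +
         "<i>mnemonics<" + "/i> that translate directly into individual " +
         "machine instructions.";

// Alternative spelling: inside a JavaScript string literal a backslash
// before "/" is dropped, so this produces the same text at run time.
var escaped = "Assembly Language:<\/b>";

// Assemble the markup for definition number x.
function buildDefinition(x) {
  return dfnhead + dfn[x] + dfntail;
}
```

Calling buildDefinition(0) returns the same markup the unsplit strings would have produced, so what the user sees in the definition frame should not change.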
Received on Monday, 3 March 2003 13:50:00 UTC