- From: David Woolley <david@djwhome.demon.co.uk>
- Date: Sat, 8 Nov 2003 12:35:34 +0000 (GMT)
- To: www-html@w3.org
This isn't really the place to give tutorials on structural markup languages, but...

> Your code validates. Good. Then I tried a variation of your code: I placed
> everything from <form> to </form> in another table around the entire

What your words say would validate, but I don't think that you are being precise in your use of words.

As we are talking XHTML, XML rules apply. In XML, open and close tags are like left and right parentheses and, moreover, the tag name in a close tag is completely redundant: a close tag always closes the most recently opened element, so you cannot use the name to choose which open tag it matches. If you, for instance, try

    <td><form></td>..<td></form></td>

and destroy the information about the tag name in the close tags, you get something like:

    <td><form></??>..<td></??></??>

Applying the rule that, like parentheses, tags must balance, this would have to be parsed as:

    <td><form></form>..<td> ABORT, td is not a valid direct descendant of td.

Giving the tag name in the closing tag allows an earlier abort, as one knows that the first </??> must be </form> but you've actually found a </td>.

The same rules apply in HTML (as against XHTML), but you are allowed to miss out some of the tags (you are actually allowed to miss some opening ones, but I'll only consider closing ones). The ones you are allowed to miss are determined by element type and not by context, but they are chosen so that it is always possible to know where the tag would have been. That means that some elements must always have a closing tag. When that tag is found, you know it matches the corresponding opening one, and you know that tags must be balanced like parentheses, so you know that any open elements with optional close tags must have those close tags immediately before the explicit closing tag.
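The parenthesis rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any real parser or validator is written: the names `check_nesting` and `OPTIONAL_END` are made up for the example, the set of elements with optional close tags is deliberately incomplete, and it checks only that tags balance, not which elements may contain which. As in the explanation above, it ignores optional opening tags and only infers a missing close tag when an explicit close tag is found.

```python
import re

# Matches an open or close tag and captures the slash and the element name.
TAG = re.compile(r"<(/?)([A-Za-z][A-Za-z0-9]*)[^>]*>")

# Hypothetical, deliberately incomplete set of HTML 4 elements whose
# close tag may be omitted. For XHTML this set is empty.
OPTIONAL_END = frozenset({"p", "li", "td", "th", "tr"})


def check_nesting(markup, optional_end=frozenset()):
    """Return None if the tags balance, else a message for the first mismatch."""
    open_elements = []  # stack of currently open element names
    for match in TAG.finditer(markup):
        if match.group(0).endswith("/>"):
            continue  # XML empty-element tag such as <br/>: opens and closes at once
        is_close = match.group(1) == "/"
        name = match.group(2).lower()
        if not is_close:
            open_elements.append(name)
            continue
        # An explicit close tag implies the close tags of any open elements
        # above it on the stack whose close tags are optional.
        while (open_elements and open_elements[-1] != name
               and open_elements[-1] in optional_end):
            open_elements.pop()
        if not open_elements:
            return f"</{name}> has no open element to close"
        if open_elements[-1] != name:
            # This is the "earlier abort" that the tag name makes possible.
            return f"expected </{open_elements[-1]}> but found </{name}>"
        open_elements.pop()
    # At end of input, only elements with optional close tags may remain open.
    unclosed = [n for n in open_elements if n not in optional_end]
    if unclosed:
        return f"unclosed element <{unclosed[-1]}>"
    return None


# The example from the message: rejected at the first close tag.
print(check_nesting("<td><form></td>..<td></form></td>"))
# With HTML-style optional close tags, </ul> implies the missing </li>.
print(check_nesting("<ul><li>one</ul>", OPTIONAL_END))
```

Under XML rules (the default, with no optional close tags) the first call reports that </form> was expected where </td> was found. The second call returns None because the explicit </ul> tells the checker where the omitted </li> must have been.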
Unfortunately, some early browsers, particularly Netscape up to and including version 4, seem not to have interpreted HTML this way, but to have turned on bold on <b> and turned it off on </b> regardless of whether or not they were properly nested. This sort of behaviour is often described as "tag soup" because there is no logic in the placement of the tags. However, CSS and the document object models that underlie powerful use of scripting rely on tags being properly nested, and modern browsers only work well when that is true.

Your basic problem is that you are trying to create two incompatible structures: the logical structure that HTML was designed to describe and an artificial one to produce a gridded layout. The only way of resolving such a conflict is by breaking things up into the smallest unit that is common to both structures. That would mean that a form occupying only some rows of a table would have to become one form per table cell, but you want the whole form to submit at once.

Tagged PDF handles this by making the layout the dominant form, then providing a parallel description (partly embedded, but partly out of line) that allocates parts of the physical layout to places in the logical structure. HTML, on the other hand, takes the position that the meaning of the document is what matters, not how it appears, and uses style languages to overlay an appearance on the logical structure.

Vendors realised that the likely buyers of web page design products weren't actually interested in logical structures, so HTML became polluted with things to directly control appearance. About five years ago, with HTML 4, an attempt was made to remove these, but it has been largely unsuccessful in commercial use.

> OK, I guess some of us are a lot less familiar with what the specs specify
> versus what the browsers do on the user's end. CSS fixes this, but it seems
> strange to have to "fix" something that was never specified.
There is only one rule here: HTML does not specify how a document should be presented to the user, only how its content is structured and the general nature of the meaning.

> The designer's only medium of communication is the browser window, which

The person responsible for communication aspects should not be concerned about how browsers will display things. It is confusing form with content that causes this sort of problem in the first place.

I accept that commercial decisions by browser writers mean that the ideal is difficult to achieve without compromising structure or layout, but in an ideal world the information content should be written, in HTML, by someone who is only concerned about the information, and the graphic designer should not touch the HTML, but work purely in a style language.

You work on governmental sites. In the past these have actually maintained good separation: you could usually use them in a text-only browser without being aware of limitations. This is breaking down in the UK.
Received on Saturday, 8 November 2003 07:35:38 UTC