- From: Didier PH Martin <martind@netfolder.com>
- Date: Sat, 17 Aug 2002 09:02:30 -0400
- To: "'Elliotte Rusty Harold'" <elharo@metalab.unc.edu>, <www-tag@w3.org>
- Cc: <www-style@w3.org>, <w3c-css-wg@w3.org>
Hi Elliotte,

Elliotte said:
As it happens, when I read this message I had the Mercury News open in a browser window, so I looked at how they expressed headlines in HTML:

<a href="http://ae.bayarea.com/entertainment/ui/bayarea/restaurant.html?id=60966&reviewId=11135" class="headline">Pizza heaven</a>

..... other examples.....

This is even easier to reproduce in XSL-FO:

<fo:inline font-face="Times New Roman,Times,Serif" font-size="120%" font-weight="bold">Baseball Players' Union Sets Strike Date for Aug. 30</fo:inline>

Didier replies:
What this demonstrates is that HTML documents out there, in the real world, are simply rendering documents that provide very little semantic information. I guess this is on purpose, since these content providers want to protect their copyright. The harder they make their content to process, the more they feel protected from free riders. Simple business common sense and, as you know, business as practiced today is not altruistic ;-)

Back in 1995, people started to use tables and other HTML features as layout instructions. This behavior was probably induced by the implicit visual rendering model that browsers possess. It is nonetheless possible to mark a header with an <H1> element and use CSS to attach a visual rendition (a set of properties) to it. This practice would preserve some semantics and would separate content from presentation (at least in part). But, as you demonstrated in your examples, web designers show incredible creativity in using HTML elements mainly as visual rendering objects.

Nonetheless, HTML per se is not explicitly specified as a rendering language. SVG is, VoiceXML is, etc. Just as we say that the web is based on an underlying architecture (i.e. REST) even though a lot of people do not see the same reality or design their sites on these principles, we can also say that HTML is not a rendering language and that, if it is used as such, it is because of the behavior of certain HTML interpreters named browsers. Other agents, such as classification engines, would prefer documents to contain more semantic information. The visual rendition characteristics are not specified in the HTML specs; they are part of certain concrete HTML interpreters (user agents, browsers). It's an interpretation of HTML, not a usage based on the specs.

Elliotte said:
XSL-FO contains all the aural properties of CSS. It is no more limited to visual presentation than HTML is (which is to say, in practice, it's quite tied to visual layout). I understand the theoretical point that HTML does not have any official layout model, unlike XSL-FO and SVG. However, the implicit layout model enforced by Web browsers is so strong that it renders the point moot. HTML is a layout language, a less powerful one than XSL-FO to be sure, but still a layout language. DocBook it is not.

Didier replies:
Per usage yes, per design no. So, from an anthropological or social point of view, you are right: a vast majority of people are using HTML as a rendering language. Was it intended to be one? That is not explicitly stated in the specs, so we can reasonably infer that it wasn't. You also have to take into consideration that a tiny minority is using HTML for document semantics. It may be limited according to some judgments, but it is still a valid document model with paragraphs, headers, etc. Obviously a lot of entities not related to content have been added, but you can stick to the basic constructs, and DTDs exist to help you do so. Maybe you should speak of HTML not as a single object but as a language with several different dialects. It all depends on the dialect you are referring to.
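
To make the point about agents other than browsers concrete, here is a minimal Python sketch of a hypothetical heading extractor, loosely standing in for a classification engine. It uses only the standard library's html.parser module. The names HeadingCollector and extract_headings and the two sample strings are made up for illustration, and the href is shortened to a placeholder. The extractor looks only at H1..H6 elements and ignores class attributes and styling entirely.

```python
from html.parser import HTMLParser


class HeadingCollector(HTMLParser):
    """Collect the text of h1..h6 elements, ignoring all styling hints."""

    HEADING_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}

    def __init__(self):
        super().__init__()
        self.headings = []
        self._current = None  # text pieces of the heading being read, if any

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names, so <H1> and <h1> are both matched.
        if tag in self.HEADING_TAGS:
            self._current = []

    def handle_data(self, data):
        if self._current is not None:
            self._current.append(data)

    def handle_endtag(self, tag):
        if tag in self.HEADING_TAGS and self._current is not None:
            self.headings.append("".join(self._current).strip())
            self._current = None


def extract_headings(html):
    """Return the heading texts found in an HTML string (hypothetical helper)."""
    parser = HeadingCollector()
    parser.feed(html)
    parser.close()
    return parser.headings


# Semantic dialect: the headline is an H1, and CSS can still style it via the class.
semantic = '<h1 class="headline">Pizza heaven</h1><p>A review.</p>'
# Presentational dialect: the headline is only a styled link (placeholder href).
presentational = '<a href="review.html" class="headline">Pizza heaven</a>'

print(extract_headings(semantic))        # ['Pizza heaven']
print(extract_headings(presentational))  # []
```

The presentational dialect yields nothing unless the agent is taught each site's private class="headline" convention, whereas the H1 version exposes the headline to any agent and can still carry the same class for a CSS rule.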
Elliotte said:
In 2002 anyone who thinks an H1 element really means anything other than "Make this a big, bold, block level element" is kidding themselves. The L in HTML stands for "Language". HTML evolves as all languages do. The meaning of its words is defined by its speakers. HTML has escaped the ivory tower of semantics, and been vulgarized as successful languages always are. The prescriptions of the W3C have about as much effect on HTML as the prescriptions of the Académie Française have on French (that is, little to none).

Didier replies:
From the social and anthropological points of view you are totally right, especially if you are referring to mainstream web content designers. For them an HTML document is simply a document's visual (and sometimes aural) layout. This also gives us a good clue as to why we do not yet see a semantic web ;-)

Cheers
Didier PH Martin
Received on Saturday, 17 August 2002 09:02:56 UTC