- From: David Woolley <david@djwhome.demon.co.uk>
- Date: Sat, 1 Nov 2003 09:34:41 +0000 (GMT)
- To: www-html@w3.org
> What are the central goals behind the development of XHTML 2? How is

I believe the key goal is the creation of a tool for the "semantic web". It has to eliminate presentational features and provide a simple set of core functions (not usurp more sophisticated document formats). It has to describe the nature of its contents, not the presentation.

> it intended to differ from conforming XML + CSS? For example, is it
> the intention that XHTML 2 user agents will have default knowledge
> of its vocabulary so that they can provide default rendering in the
> absence of CSS?

You are making the fundamental mistake of assuming that the purpose of user agents is to blindly render a document for a human to *view*. Scooter and the Google equivalent are user agents, but they do not render. With high quality documents, other user agents could analyse the full text of web pages to extract information useful to their particular users. A lot of work has been done on intelligent agents by organisations like BT's Martlesham Labs.

Also, it just doesn't happen that authors provide CSS for all media.

Also, semantics-free XML plus CSS only makes sense if the author has total control of presentation, whereas semantics-rich XHTML allows the reader to control the presentation to make all sources of data consistent, easing their task. It also allows a re-publisher of syndicated content to impose their own house style on the material whilst leaving editorial control with the originator.

If you take a longer term view (certainly science fiction, at least at the moment), documents may not be rendered in any tangible way, but injected straight into the recipient's brain.

There seem to be three main ways of using HTML:

1) As an advertising copy language (for which I think a page description language would better fit the designers' wants, as they typically want total control) - such pages often have little information content;

2) As a language for writing thin client data entry and database applications (when sold as third party products, or used on the public internet, these often have a significant element of item (1));

3) As a language for describing knowledge that is sufficiently weakly structured that it is dominated by plain (or at least technical jargon) language, but, nonetheless, contains much real information.

I don't believe that XHTML 2.0 has any pretensions about fulfilling niche number (1). I haven't looked far enough to see if it is attempting to address item (2).

Item (3) tends not to be strongly obvious on the public internet, because real knowledge is valuable intellectual property[A], but there are some organisations that are heavy users of such documentation. Large engineering based companies (but not software systems houses, who tend to be more in the business of selling people and fuzzy feelings than knowledge) and pharmaceutical companies are examples. ICI were a very early example; they used to have a leading free text search engine decades pre-web, as the result of developing it for an in-house need. Academics also need it.

Most of the companies that the public deal with are not knowledge based, but, in computing terms, data based; their data is highly structured, so they are more likely to be interested in thin client type uses. Their applications don't fit well with a hypertext model.

It was, however, an early, if largely unfulfilled, promise that the web would give the general public access to the world's knowledge wealth and the ability to contribute to it. Historically, the web was created for use (3).
However, within most companies, it was not the research and engineering departments but the marketing departments that became the big users. Subsequently there has been a move into area (2), but to reduce wealth loss rather than to create wealth.

Netscape particularly addressed area (1) and Microsoft have always addressed area (2). Whilst most businesses that make their money supporting the web are in these areas, I think that content provision will become more important as these areas saturate. If there is a support industry for area (3), its skills are going to be in librarianship, not graphics design or programming.

From a visual rendering point of view, I imagine that supporting HTML use (3) will more or less come free for a browser, especially if you have a CSS engine.

> Have such goals been publicly stated in a W3C document?

It's interesting to note who asks this sort of question.

[A] and where it is available, the ability to machine process it is often considered an undesirable property and outlawed by the site's terms of use, because the business model is based on the site being read by humans, who also see the advertising.
Received on Saturday, 1 November 2003 05:19:05 UTC