Re: Marking elements as 'volatile'

From: David Woolley <david@djwhome.demon.co.uk>
Date: Sat, 15 Jan 2005 10:46:55 +0000 (GMT)
Some more thoughts.

> Having a separate element, such as `<content>`, would make it quite a 
> bit easier for parsers to get the actual content of the page. (The name 

Such processing of pages is often a breach of the terms of use of
major commercial portal sites (e.g. IMDB).  At the moment, one has to
write customised pre-processors to do this stripping, so relatively
few people do it and the breach is clear.  If there were a way for
mainstream browsers to strip out all the branding and advertising,
I really can't see the site operators cooperating with it: these
aspects of the site are often more important to them than the content,
and it would become unenforceable to stop people turning on this
feature.

In my experience, sites whose prime aim is the provision of information
(and which have not been designed by people more used to designing for
selling) don't have extensive noise in their markup, so they wouldn't
benefit either.

For many commercial sites, the branding noise is often more important
than the real content.

For search engines, one can already use the User-Agent string to serve
them content only.  However, Google at least considers this an abuse,
as they want to index exactly what the normal user would see.
Logically, marking content as interesting to the search engine is
another form of distortion of what the user sees, even if the content
is still available to ordinary users.
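
To make the User-Agent approach concrete, here is a minimal sketch of
the kind of server-side switch I mean.  The marker strings and the
function names are purely illustrative assumptions, not taken from any
real site, and real crawler User-Agent strings vary.

```python
# Illustrative substrings only; a real list would differ and would
# need maintaining as crawlers change their User-Agent strings.
CRAWLER_MARKERS = ("googlebot", "slurp")


def is_crawler(user_agent):
    """Guess whether a request comes from a search engine crawler."""
    ua = (user_agent or "").lower()
    return any(marker in ua for marker in CRAWLER_MARKERS)


def select_body(user_agent, full_page, content_only):
    """Return the stripped page for crawlers, the full page otherwise."""
    return content_only if is_crawler(user_agent) else full_page
```

A request without a recognised marker gets the full, branded page, so
the crawler and the ordinary user see different things, which is
exactly the behaviour that is objected to.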
