Re: XHTML 2.0 - dfn : Content model and usability (PR#7832) from David Woolley on 2006-03-28 (www-html@w3.org from March 2006)

From: David Woolley <david@djwhome.demon.co.uk>
Date: Tue, 28 Mar 2006 08:24:10 +0100 (BST)
To: www-html@w3.org
Message-Id: <200603280724.k2S7OAY00951@djwhome.demon.co.uk>

> > You can't use _any_ of HTML's semantics
> > to unambiguously get data out of the Web in the manner you describe.

That depends.  The sources that are most presentational often have the
least real content.  There is an approximate ordering of site types, from
least to most correct usage of markup:

Vanity sites
Commercial
Governmental (because they outsource to commercial web designers)
Academic PR departments (and alumni offices, etc.)
Charities
Personal sites with non-vanity content
Academics writing for themselves

The last category tends to have the most real content and is also most
likely to use structural markup properly.

There is also an invisible category of documents on the intranets of
research-based companies.

> The potential has not been used, and there is little reason to think that 
> XHTML 2.0 would change this. And even millions of pages would not help 
> much if that means that only, say, one out of a hundred of defining 
> occurrences of terms has been marked up with <dfn>.

That really depends on whether XHTML 2.0 becomes a must-have on people's
CVs.  If it does, it will be grossly abused.  If it doesn't, it is likely
to have quite a high level of correct usage, although a low level of
total usage.

If, as I suspect, you don't really believe in structural HTML at all, could
I suggest that tagged PDF is a much better compromise?

Received on Tuesday, 28 March 2006 07:36:21 UTC