Re: XHTML 2.0 - dfn : Content model and usability

On Tue, 5 Jul 2005, Karl Dubost wrote:

> We had a discussion on a French Web developer mailing-list [pompeurs][1] 
> about dfn. The first comment was about the understanding of the definition in 
> the specification. The second comment was about usability and to know if it 
> was very useful.

The dfn element in the XHTML 2.0 draft is essentially the same as in
HTML 4 (and in HTML 3.2). It has been used very rarely.

Markup for definitions would be very useful in principle, though the 
usefulness depends on authors' willingness to use such markup and on 
different programs' capabilities of utilizing it. (Obviously, this is a 
chicken and egg problem: few people use markup that gives no practical 
benefit, and few software designers write code to deal with markup that is 
used by eccentric authors only.)

The dfn element, as well as the dl element, is a dead end, however.
It has no future even if enhanced with some extra features.

The real question is whether a markup language should have markup for 
definitions. If the answer is yes, then the markup should be rather 
elaborate and cover the fundamentally different ways of giving 
definitions. If the answer is no, it is pointless to introduce failures 
like dfn into something that is meant to be a new generation of HTML.

At the very minimum, useful markup should indicate what constitutes a 
definition, and normally it would contain an element that indicates the 
defined term(s).*) The dfn element only deals with the latter. I've 
written a more detailed analysis of definitions and their markup:
http://www.cs.tut.fi/~jkorpela/def.html

*) It's possible that a definition does not contain a term to be defined. 
A simple example is a two-column table, with one column containing terms, 
another column containing their definitions.
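
To make the table case concrete, here is a minimal Python sketch using only 
the standard library's html.parser. It reads term/definition pairs from rows 
with exactly two td cells, with no dfn markup involved. The names 
TwoColumnGlossary and read_glossary are made up for this illustration, and 
the sketch assumes explicit end tags, as XHTML requires.

```python
from html.parser import HTMLParser


class TwoColumnGlossary(HTMLParser):
    """Read (term, definition) pairs from rows with exactly two td cells."""

    def __init__(self):
        super().__init__()
        self.pairs = []
        self._row = None    # cell texts of the current tr, or None outside a row
        self._cell = None   # text chunks of the current td, or None outside a cell

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._cell is not None and self._row is not None:
            # Collapse runs of whitespace so line breaks in the source vanish.
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if len(self._row) == 2:
                self.pairs.append(tuple(self._row))
            self._row = None
            self._cell = None


def read_glossary(markup):
    """Return the (term, definition) pairs found in the given markup."""
    parser = TwoColumnGlossary()
    parser.feed(markup)
    parser.close()
    return parser.pairs
```

Note that the program has to guess that the table is a glossary at all; 
nothing in the markup says so.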

> The dfn element contains the defining instance of the enclosed term.

This is somewhat obscure; what it probably means is that the dfn element 
indicates its content as an occurrence of a term in its definition - 
_without_ indicating what constitutes the definition.

> * Example
>
> An <dfn id="def-acronym">acronym</dfn> is a word formed
> from the initial letters or groups of letters of words in a set
> phrase or series of words.

This is an unfortunate choice of an example: it's a debatable definition 
and therefore draws attention to the dispute, rather than the issue at 
hand: the structure of a definition.

> Maybe the first sentence should be something like:
>
> The dfn element contains a word (or a group of words) being defined by one or 
> more sentences.

My formulation above would express the same idea more generally. For 
example, a definition need not contain any sentences; it could consist of 
an equation or a reference (link).

> It may be good to give usability examples of this element. Why is it useful 
> to use this element?

Well, it basically isn't. As you point out, many of the potential benefits 
can be achieved even by using span markup - which might be _better_ since 
it implies no default effect on rendering. (It might, or it might not, be 
useful to make a browser display terms in a particular prominent manner.
Using dfn at present, we know that some browsers do so, and we may try to 
take some action against it, though we would not really know what to fight 
against.)

> Is dfn useful for a machine, a semantics analyzer agent or just a tool to 
> create a list of definitions, a glossary from one page or a series of 
> pages? If we take the example given in XHTML 2.0 right now, I would be 
> inclined to say no.

On the practical side, I tend to agree. But a browser _could_ conceivably 
construct a table of defined terms and make it accessible to the user, 
e.g. making each entry a link to the occurrence of the term in a dfn 
element. And sometimes it is useful to know just what terms are defined in 
a document. A search engine might conceivably be interested in such 
matters too; especially when using a common word as a search term, it is 
frustrating to get zillions of hits when you are really interested in 
knowing how different documents _define_ the word. (And sometimes it is 
useful to _exclude_ pages that contain the word in definitions only.)
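
As a rough illustration of such a table of defined terms, here is a minimal 
Python sketch using only the standard library's html.parser. It collects the 
content of each dfn element and, when the element has an id attribute, a 
fragment link to it. The names DfnIndexer and build_term_index are invented 
for this sketch, and it ignores nested dfn elements and everything else a 
real browser would have to cope with.

```python
from html.parser import HTMLParser


class DfnIndexer(HTMLParser):
    """Collect (term, fragment link) pairs for every dfn element."""

    def __init__(self):
        super().__init__()
        self.entries = []    # finished (term, link-or-None) pairs
        self._parts = None   # text chunks of the dfn being read, or None
        self._id = None      # id attribute of the dfn being read, if any

    def handle_starttag(self, tag, attrs):
        if tag == "dfn" and self._parts is None:
            self._parts = []
            self._id = dict(attrs).get("id")

    def handle_data(self, data):
        if self._parts is not None:
            self._parts.append(data)

    def handle_endtag(self, tag):
        if tag == "dfn" and self._parts is not None:
            # Collapse runs of whitespace inside the term.
            term = " ".join("".join(self._parts).split())
            link = "#" + self._id if self._id else None
            self.entries.append((term, link))
            self._parts = None
            self._id = None


def build_term_index(markup):
    """Return (term, link) pairs sorted by term, ignoring case."""
    parser = DfnIndexer()
    parser.feed(markup)
    parser.close()
    return sorted(parser.entries, key=lambda entry: entry[0].lower())


if __name__ == "__main__":
    sample = (
        '<p>An <dfn id="def-acronym">acronym</dfn> is a word formed from '
        "the initial letters of words in a set phrase.</p>"
    )
    for term, link in build_term_index(sample):
        print(term, link)
```

The sketch also shows the limitation: the index tells which terms are 
defined and where they occur, but not where each definition begins or ends.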

All of this would have much greater potential use if _definitions_ were 
marked up, with an element of their own that browsers, search engines, 
etc., could recognize.

> I propose either
> 	- to drop it from the specification
> 	- to add an element making it possible to use it for automatic purposes.

Dropping dfn is surely better than keeping it as it is. Replacing it by 
some useful markup for definitions would be interesting, but would it be 
practically useful? Could someone convince Google to pay attention to such 
markup, e.g. in its search for definitions? That search now probably uses 
some mixed heuristics when you search for define:foo. The description at
http://www.google.com/help/operators.html is probably intentionally vague, 
and probably the underlying assumption is that everyone uses English, so 
that the heuristics can play with words like "define" or "defined".

-- 
Jukka "Yucca" Korpela, http://www.cs.tut.fi/~jkorpela/

Received on Tuesday, 5 July 2005 16:40:19 UTC