Words vs. context (was Re: "ACRONYM")

Holger Wahlen (wahlen@ph-cip.Uni-Koeln.DE)
Tue, 29 Jul 1997 21:05:25 +0200


Message-Id: <199707291905.AA29741@jupiter.ph-cip.Uni-Koeln.DE>
To: www-html@w3.org

On Monday, 28 Jul 1997 08:30:21 -0400, Dave Raggett
<dsr@w3.org> wrote:

| Perhaps the HTML 4.0 spec should replace ACRONYM by a new
| attribute on SPAN, e.g.
| 
|   The <span spellout>BBC</span> tonight reported heavy
|   shelling on the Bosnian capital.

It has since been argued that this is presentational and
hence better suited to CSS, so that "CLASS=spellout" would be
the more appropriate way to handle it - okay. The question
I'd like to bring up is a different one, namely whether SPAN
is the element that should get such an attribute at all.
Maybe I'm just making things too complicated, but I'd suggest
distinguishing between information about the words of a
document as they function within that document and
information about the words, well, just as words.

Two examples.

(1) Look at a sentence like this, first with physical markup:
	That <I>really</I> must be taken
	<I>cum grano salis</I>.
The purpose of the italics in the first case is simply
emphasis, so this becomes EM in logical markup. At the end
they are used to indicate a foreign phrase - HTML doesn't
have a special element for that, so SPAN CLASS=foreign or
something like that would be appropriate.
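In logical markup, and taking "foreign" merely as an assumed
class name, the sentence might then read:
	That <EM>really</EM> must be taken
	<SPAN CLASS=foreign>cum grano salis</SPAN>.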

(2) Let me modify the example the 4.0 draft uses in the
section about DIV and SPAN: Imagine a database with only two
fields, people's last name and their homepage URL. To make a
document of that, DIV and SPAN can be used like this:
	<DIV ID=client-boyera CLASS=client>
	<SPAN CLASS=client-last-name>Last name:</SPAN>
	Boyera,
	<SPAN CLASS=client-url>Homepage URL:</SPAN>
	http://foo.com/~boyera/
	</DIV>
and so on. Now there's the abbreviation "URL", and I'd like
to indicate this is pronounced by spelling it out, so that I
help speech browsers that don't know this already - after
all, "URL" is a rather uncommon expression on the Web. ;-)
That could look like this:
	<SPAN CLASS=client-url>Homepage
	<SPAN CLASS=spellout>URL</SPAN>:
	</SPAN>
	http://foo.com/~boyera/
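On the style sheet side, a sketch of what that class might
map to, assuming an aural property along the lines of 'speak'
from the CSS2 aural style sheet work (an assumption on my
part, not something the 4.0 draft defines), placed in the
document's HEAD:
	<STYLE TYPE="text/css">
	/* assumed: spell-out reads the text one letter at a time */
	.spellout { speak: spell-out }
	</STYLE>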

Now, in my view there are two basic kinds of element content
involved in these examples: "really" in (1) and "Homepage
URL" in (2) only get that markup in these specific documents,
but "cum grano salis" is *always* a foreign phrase, and "URL"
is *always* an abbreviation of the `spell-out' kind, no
matter what context they appear in - this is information
about the words themselves. Wouldn't it be more logical to
have a separate element for that, and to use SPAN only to
convey information that makes sense just in the given
context? I don't have a good idea for a name yet, but if we
used "ET" (from "etymology"), for instance, things could look
this way:
	<ET CLASS=foreign LANG=la>cum grano salis</ET>,
	<ET CLASS=spellout>URL</ET>.

To continue this train of thought: There have been ideas for
an element like PERSON - that would be another example of
telling something about a word itself rather than about its
function in a certain document, wouldn't it? So maybe this
wouldn't have to be implemented as an element of its own;
a construction like <ET CLASS=person> might do already. Or
think about
	<ET CLASS=place>Washington <ET CLASS=spellout>DC</ET></ET> ...

Comments?

____  |__|   / Holger   //       mailto:wahlen@ph-cip.uni-koeln.de  ____
      |  |/|/  Wahlen  //  http://www.ph-cip.uni-koeln.de/~wahlen/