(marks, definitions, terms, abbreviations, and variables <m>, <dfn>, <abbr>, <term>) part of my review of 3.12 Phrase elements

Summary:

* Propose some small editorial changes to the draft.
* Propose new element: <term>
* For use mostly on <dfn>, but perhaps also on <abbr>, <term>, and
<var>, propose new attributes: scoped (boolean), defref (string;
required), casesensitive (boolean; referring to the defref value),
phonetic (or "pronunciation"; string), initialism (boolean;
referring to the defref value), asword (boolean; referring to the
defref value)

The proposed enhancements are meant to deal with the problem of  
providing a more rigid markup for terms, abbreviations and variables  
used in a document. The proposed enhancements should support auto- 
generation of a document index, a document glossary and interactive  
discovery of term, abbreviation and variable usage within an  
interactive UA.


Highlights, definitions, terms, abbreviations, and variables <m>,  
<dfn>, <abbr>, <term> (part of my review of 3.12 Phrase elements)

Highlight / mark (<m>):
This section looks good. I have no suggestions for improvement.

Definitions of terms, abbreviations and variables (relates to <dfn>,  
<abbr>, <var> and proposed <term>):

I like the more elaborate document conformance criteria the HTML5
draft provides for <dfn> and <abbr>. However, right now <dfn> either
encloses text nodes that are the definition of the term or, in
the case of an abbreviation, it (indirectly) includes a text node
that is the term being defined (with the @title value serving as the
definition of the term). There are places in the draft where the
distinction between a term and its elaborate definition is not kept
completely clear. I think it has been even more difficult for authors
to keep these uses separate. This confusion over exactly how to use a
<dfn> element is one of the prime use cases/problem statements for the
following proposals:

Consider adding a <term> element type:
I propose introducing a new <term> element that would act much like
<abbr> and <var> in relation to <dfn>, except that it is for
non-abbreviated terms.

I do not think it would break compatibility to introduce a <term>
element, though its default styling would only appear in an HTML5
conforming UA with a proper default stylesheet, or where the author
supplies an embedded/linked stylesheet (for example, "font-weight: bold"
might be appropriate).

proposed language/
Authors use the <term> element to contain important terms or key
words that might be important for an index in a printed version of
the document or that indicate greater importance for a search engine.
For example, imagine an author who wants to write an essay on the word
"and" and the ways that word has been used throughout English
literature. Marking the word up as <term>and</term> indicates that
the word is being used in a specialized way and not in
its more conventional way. As another example, consider the use of a
term like <term>service</term> where it is being used to refer to a
daemon running on a host hardware server. The term also has a more
colloquial meaning, and the same document may even use the term in
this colloquial sense: "providing a service to our customers by
installing our <term>service</term> on their hardware."
/end proposed language

Definition (<dfn>); consider adding:

proposed additions (for <dfn>)/

Element-specific attributes:
	scoped (boolean)
	defref (string; required)
	casesensitive (boolean)
	phonetic (string; a Unicode string of phonetic characters as a pronunciation hint)
	initialism (boolean; referring to the defref value)
	asword (boolean; referring to the defref value)

The definition element provides a mechanism to supply a precise and
elaborate definition for a term (<term>), abbreviation (<abbr>), or
variable (<var>). Authors need only use the dfn element for the initial
instance of the definition. UAs will provide a mechanism to indicate
the definition of a term throughout the rest of the document (e.g.,
by presenting a tooltip or changing the status bar when the pointing
device hovers over a term within scope). By including the scoped
attribute, the author restricts the scope of the definition to the
current sectioning element (explicit, or perhaps also implicit) only.
For example, the scoped attribute would facilitate a document
containing multiple essays by several authors all addressing the
meaning of the same term.

Authors must include the defref attribute on the dfn element to
indicate what term, variable or abbreviation the definition refers
to. The inclusion of the casesensitive boolean attribute indicates
that the matching of the defref value with terms, variables and
abbreviations must be done in a case-sensitive manner. Authors may use
CSS or another styling mechanism to display only the value of the
@defref attribute rather than the elaborated definition (for
example, to leave the elaborated definition for an automatically
generated glossary).

UA conformance: If the casesensitive attribute is false, UAs must
match the dfn element's defref attribute value in a case-insensitive
manner within the scope of the dfn element. If the casesensitive
attribute is true, UAs must match the dfn element's defref attribute
value in a case-sensitive manner within the scope of the dfn
element. UAs should provide a mechanism to let users quickly view the
definition for a term, variable or abbreviation while maintaining
presentation of the current context of the term (e.g., by presenting
a tooltip or changing the status bar when the pointing device hovers
over a term within scope).
/end proposed additions
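
To make the matching rule above concrete, here is a minimal sketch in
Python of how a UA might look up the definition for a term. It is only
an illustration under stated assumptions: the Definition record, the
find_definition function and the use of a section id to represent the
scoped attribute are all hypothetical names of mine, and the proposal
does not say which definition wins when several match, so the sketch
simply takes the first one in document order.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Definition:
    """One <dfn> element as the proposal describes it (hypothetical model)."""
    defref: str                   # value of the proposed defref attribute
    text: str                     # the elaborated definition inside <dfn>
    casesensitive: bool = False   # the proposed casesensitive attribute
    scope: Optional[str] = None   # id of the enclosing section when `scoped`
                                  # is present; None means the whole document


def find_definition(term: str, section: str,
                    definitions: Iterable[Definition]) -> Optional[Definition]:
    """Return the first definition that applies to `term` used in `section`."""
    for d in definitions:
        # A scoped definition only applies inside its own section.
        if d.scope is not None and d.scope != section:
            continue
        if d.casesensitive:
            matched = d.defref == term
        else:
            # casefold() gives a Unicode-aware case-insensitive comparison.
            matched = d.defref.casefold() == term.casefold()
        if matched:
            return d
    return None
```

With this sketch, a document-wide definition with defref "Zat" would be
found for the text "zat" in any section, while a scoped, case-sensitive
definition for "service" would be found only for the exact text
"service" inside the section that contains the definition.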

Abbreviation (<abbr>):
Consider adding:

Element-specific attributes:
	initialism (boolean)
	asword (boolean)
	phonetic (string)

The attributes specific to <abbr> are: 1) an @initialism boolean
attribute as a pronunciation hint that the abbreviation is read by
spelling out its letters; 2) an @asword attribute as a
pronunciation hint to read the abbreviation as a word rather than
spelling out the letters (what some variants of English call an
acronym); 3) a @phonetic attribute to contain specific phonetic
Unicode characters or character references as a pronunciation hint
(Unicode could use some improvement on phonetic characters, but this
mechanism should work today and be forward compatible even as Unicode
improves its organization of phonetic characters)[1].
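
As a rough illustration of how an aural UA might combine these three
hints, consider the following Python sketch. The function name
spoken_form and the order of precedence (phonetic first, then asword,
then initialism) are assumptions of mine; the proposal itself does not
define what happens when the attributes conflict.

```python
from typing import Optional


def spoken_form(text: str, initialism: bool = False, asword: bool = False,
                phonetic: Optional[str] = None) -> str:
    """Choose what to hand to a speech engine for an <abbr> (hypothetical)."""
    if phonetic:
        # An explicit phonetic string is the most specific hint.
        return phonetic
    if asword:
        # Read the abbreviation as an ordinary word.
        return text
    if initialism:
        # Spell the abbreviation out letter by letter.
        return " ".join(text)
    # No hint given: leave the choice to the speech engine.
    return text
```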

Consider adding the word "title" to the sentence: " If present, the  
__title__ attribute must only contain an expansion of the  
abbreviation." (for greater clarity)

The last example for <abbr> especially underscores the problem: the
"term" is doubly marked up with both <dfn> and <abbr>, while the
definition has no markup at all except for the surrounding paragraph
(and that only because the example is particularly contrived; the
paragraph could easily be longer). Under my proposal, the
definition would be marked up with the <dfn> element and the term
would be marked up with the <abbr> element.

The draft currently reads:

"In the example below, the word "Zat" is used as an abbreviation in  
the second paragraph. The abbreviation is defined in the first, so  
the explanatory title attribute has been omitted. Because of the way  
dfn elements are defined, the second abbr element in this example  
would be connected (in some UA-specific way) to the first.
<p>The <dfn><abbr>Zat</abbr></dfn>, short for Zat'ni'catel, is a  
weapon.</p>
<p>Jack used a <abbr>Zat</abbr> to make the boxes of evidence  
disappear.</p>"

My proposal would change this to:
<p>The <abbr>Zat</abbr>, <dfn defref='Zat'>short for Zat'ni'catel</dfn>,
is a weapon.</p>
<p>Jack used a <abbr>Zat</abbr> to make the boxes of evidence
disappear.</p>

With this approach, the association is clear and the UA is not left
to infer what the association is based on a paragraph that may
contain less or more than the precise definition.
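
Because the association is explicit, a glossary can be generated
mechanically. The following Python sketch shows one way to do that with
the standard library's html.parser module; the GlossaryBuilder class is
a hypothetical name of mine, defref is the attribute proposed above (not
part of any existing HTML specification), and nested <dfn> elements are
not handled.

```python
from html.parser import HTMLParser


class GlossaryBuilder(HTMLParser):
    """Collect {defref value: definition text} from <dfn defref=...> elements."""

    def __init__(self):
        super().__init__()
        self.glossary = {}
        self._defref = None   # defref of the <dfn> currently open, if any
        self._parts = []      # text chunks seen inside that <dfn>

    def handle_starttag(self, tag, attrs):
        if tag == "dfn":
            # A <dfn> without defref is ignored (the value stays None).
            self._defref = dict(attrs).get("defref")
            self._parts = []

    def handle_data(self, data):
        if self._defref is not None:
            self._parts.append(data)

    def handle_endtag(self, tag):
        if tag == "dfn" and self._defref is not None:
            # Collapse line breaks and runs of whitespace in the definition.
            text = " ".join("".join(self._parts).split())
            self.glossary[self._defref] = text
            self._defref = None


builder = GlossaryBuilder()
builder.feed("<p>The <abbr>Zat</abbr>, <dfn defref='Zat'>short for "
             "Zat'ni'catel</dfn>, is a weapon.</p>")
print(builder.glossary)
```

The same pass could also record every <abbr>, <term> and <var> whose
content matches a collected defref value, which would give the raw
material for a document index.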

Variables (<var>):

I propose that a variable be treated much like the proposed <term>,
except that it would be for the more particular use as a variable
denoting or pointing to a specific object (however narrowly or broadly
defined), specifically for use in mathematics, computer programming or
formal logic. Here the distinction from <term> is subtle, but I think
there are cultural reasons to support both elements. After all,
consider the term automobile as a term that denotes the class of "all
self-contained power-plant vehicles with facilities to transport
humans and cargo." We could instead think of that as a variable for
that class. However, I think the use of variable is more specialized
than that (often denoting a precise instance of a term). So I think
the distinction is useful to maintain in markup.

So we should supplement our description of variable by discussing  
more modern programming constructs like: "element", "attribute",  
"property", "class", "instance", "object", "math", "nativetype",  
"method", "function", etc.

Other modern graphical constructs, such as "view", "control", "cell",
"menu", "menuitem", "button", etc., could either be subsumed under
<var> or handled with another, new element. We may want to include
those too among the constructs that might appropriately be marked up
with <var>, as the <var> would be referring to a specific instance of
a button or control (on the screen, for example).

Consider adding a @type attribute to <var> so that authors can
express the precise kind of instance the variable points to. This
could be useful for <var> elements that are not associated with a
<dfn> element, that is, when the variable is used in a more abstract
way. For contrast, consider the example:

<p>Consider a circle with circumference <var type='realnumber'>x</var>
and circumscribe a circle with circumference <var
type='realnumber'>y</var> around …</p>

versus

<p>In the following we treat <var>myElement</var> as always equal to
<dfn>the element returned by getElementById("myVar")</dfn>.</p>

The first example, through its @type attribute, provides more detail
than it otherwise could, without needing to resort to a definition. It
could alternately be marked up with:

<p>Consider a circle with circumference <var>x</var>, <dfn
defref='x'>a real number</dfn>, and circumscribe a circle with
circumference <var type='realnumber'>y</var>, also <dfn defref='y'>a
real number</dfn>, around …</p>

If HTML could provide a keyword value list for many common types of
instances used in formal logic, mathematics and computer science,
authors would have a shortcut for expressing those types.

The <var> element makes mention of special meaning to @title when
used with <dfn>, but nothing else is said about that here. An example
and explanation would be helpful. (Is this meant to match the @title
of the defining dfn element for the variable?) In that case my
proposal would be to leave the @title alone and require HTML5
conforming UAs to provide the elaborate definition from any <var>
element by matching the single @defref attribute on the <dfn> element
with the contents of the <var> element.


[1]: Unicode currently provides little consistent data on the
phonetic properties of characters. Perhaps W3C could liaise with
Unicode over the issue of establishing semantic phonetic Unicode
characters. Alternatively, we could provide a <phonetic> or
<pronunciation> element to contain SMIL markup. However, Unicode
could be enhanced to support plain-text phonetics with some minor
changes.

[2]: For an example of current rendering of many of these elements  
see the attached:
This attachment demonstrates how many of these semantics are not  
necessarily supported through UA default stylesheets. They are  
nevertheless very useful to authors who will tend to author with both  
HTML and CSS in unison.

Received on Friday, 20 July 2007 05:19:50 UTC