- From: Robert Burns <rob@robburns.com>
- Date: Mon, 23 Jul 2007 11:24:18 -0500
- To: public-html WG <public-html@w3.org>
- Message-Id: <F72BF1FA-BC13-4484-BD51-A4AA3250F728@robburns.com>
HIGHLIGHTS/MARK, DEFINITIONS, TERMS, ABBREVIATIONS, AND VARIABLES:
M, DFN, TERM, ABBR (part of my review of 3.12 Phrase elements)
Preface
----------------------------------
My original review of these subsections included some mistakes in my
reading of the current draft (along with forgetting how DFN was
specified in HTML4). I encourage everyone to ignore that original
review and to focus on this one instead. The ensuing dialog has been
useful in reshaping this. Since my new version calls for some even
more sweeping changes than my last review, I'll do two things in this
review. First, I'll discuss editorial changes to improve the draft
for the existing approach with these elements. Second, I will
elaborate on how I think the introduction of some new elements in
favor of the old elements can improve the situation markedly.
Summary:
----------------------------------
* Propose some editorial changes to the draft to improve the
exposition for the current proposals.
* Propose an alternate approach with new elements: TERM and DEFINE
* For use mostly on DEFINE, but also on DFN, TERM, ABBR and VAR,
propose new attributes:
Matching attributes
• @term (string),
• @abbr (string),
• @scope (xpath),
• @casesensitive (boolean),
• @variantOf (string; used on terms,
abbreviations, and variables for
matching),
Phonetic attributes
• @aphonetic (string),
• @tphonetic (string),
• @phonetic (string),
• @asword (boolean),
• @spelt (boolean)
The proposed enhancements are meant to deal with the problem of
providing a more rigorous and consistent markup for terms,
abbreviations and variables used in a document. The proposed
enhancements should support auto-generation of a document index, a
document glossary and interactive discovery of terms, abbreviations
and variables used within an interactive UA.
Highlight/Mark (M):
----------------------------------
This section looks good. I have no suggestions for improvement.
Defining terms, abbreviations and
variables (DFN, ABBR, VAR) for the current draft:
----------------------------------
Consider changing the current draft for DFN to:
proposed new language/
Defining term:
A) If the dfn element has a title attribute, then the exact value of
that attribute is the term being defined. If the author includes a
value for the title attribute to indicate the term, the element must
be empty. [otherwise, we should specify what precisely the meaning of
the enclosed text is]
B) Otherwise, if it contains exactly one abbr element child, and that
child has a title attribute set to a non-empty string, then the exact
string value of that attribute is the term defined.
C) Otherwise, it is the exact textContent of the dfn element that
gives the term being defined.
Whether the term is contained in the title attribute of the dfn
element, the title attribute of the enclosed abbr element, or the
contents of the dfn element, it must only contain the term being
defined.
Also, there must only be one dfn element per document for each term
defined (i.e. there must not be any duplicate terms).
/end proposed new language
Replacing:
replaced text in current draft/
Defining term: If the dfn element has a title attribute, then the
exact value of that attribute is the term being defined. Otherwise,
if it contains exactly one element child node and no child text
nodes, and that child element is an abbr element with a title
attribute, then the exact value of that attribute is the term being
defined. Otherwise, it is the exact textContent of the dfn element
that gives the term being defined.
If the title attribute of the dfn element is present, then it must
only contain the term being defined.
There must only be one dfn element per document for each term defined
(i.e. there must not be any duplicate terms).
/ end replaced text in current draft
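The three cases in the proposed language (A, B and C), together with
the no-duplicates rule, can be sketched as follows. This is only an
illustration, not spec text: the function names defining_term and
duplicate_terms are hypothetical, the fragment is parsed as
well-formed XML rather than with an HTML parser, and case B is read
as "exactly one abbr child with a non-empty title".

```python
import xml.etree.ElementTree as ET


def defining_term(dfn):
    """Return the term a dfn element defines, following cases A, B and C."""
    # Case A: a title attribute on the dfn itself gives the term.
    title = dfn.get("title")
    if title is not None:
        return title
    # Case B: exactly one abbr child carrying a non-empty title.
    abbrs = [child for child in dfn if child.tag == "abbr"]
    if len(abbrs) == 1 and abbrs[0].get("title"):
        return abbrs[0].get("title")
    # Case C: fall back to the element's text content.
    return "".join(dfn.itertext())


def duplicate_terms(root):
    """List terms that are defined by more than one dfn element."""
    seen, duplicates = set(), []
    for dfn in root.iter("dfn"):
        term = defining_term(dfn)
        if term in seen and term not in duplicates:
            duplicates.append(term)
        seen.add(term)
    return duplicates


# Hypothetical document exercising all three cases plus one duplicate.
doc = ET.fromstring(
    "<body>"
    '<p>The <dfn><abbr title="Garage Door Opener">GDO</abbr></dfn> is a device.</p>'
    "<p>A <dfn>fundamental class process</dfn> is a process.</p>"
    '<p>A <dfn title="class"/> is a factory for objects.</p>'
    "<p>A <dfn>class</dfn> is also a social category.</p>"
    "</body>"
)
terms = [defining_term(d) for d in doc.iter("dfn")]
print(terms)
print(duplicate_terms(doc))
```

A conformance checker could report every entry returned by
duplicate_terms as a violation of the one-dfn-per-term rule.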
Consider adding the word "title" to the sentence: " If present, the
__title__ attribute must only contain an expansion of the
abbreviation." (for greater clarity)
Consider changing the note to read:
Note: __Except for the title attribute of a child abbr element, t__--
T--he title attribute of neither descendant nor ancestor elements
affects any dfn elements.
Consider changing the first (only) example to:
In the following fragment, the term "GDO" is first defined in the
first paragraph, then used in the second. A compliant UA must provide
a mechanism to present the definition from the immediately next
surrounding structural-inline ancestor or block ancestor of the dfn
element, whichever occurs first. In this example, the UA would
provide a user interacting with the abbr element in the second
paragraph a mechanism to view the entire first paragraph (perhaps by
scrolling to that p element or through an inspection panel).
<p>The <dfn><abbr title="Garage Door Opener">GDO</abbr></dfn>
is a device that allows off-world teams to open the iris.</p>
<!-- ... later in the document: -->
<p>Teal'c activated his <abbr title="Garage Door Opener">GDO</abbr>
and so Hammond ordered the iris to be opened.</p>
Consider providing examples for the other two states/cases mentioned:
In the following fragment, the term "fundamental class process" is
defined in the first paragraph, then used in the second. A compliant
UA must provide a mechanism to present the definition from the
immediately next surrounding structural-inline ancestor or block
ancestor of the dfn element, whichever occurs first. In this
example, the UA would provide a user interacting with the phrase
"fundamental class process" contained in the span element in the
second paragraph a mechanism to view the entire first paragraph
(perhaps by scrolling to that p element or through an inspection panel).
<p>A <dfn>fundamental class process</dfn>
is a process of performing and appropriating surplus labor. It occurs
at the point of production between a performer of surplus labor and
the appropriator of that surplus labor.</p>
<!-- ... later in the document: -->
<p>… The <span title='fundamental class process' >fundamental class
process</span> is therefore a condition of existence for the subsumed
class process. …</p>
In the following fragment, the term "class" is defined in the first
paragraph, then used in the second. A compliant UA must provide a
mechanism to present the definition from the immediately next
surrounding structural-inline ancestor or block ancestor of the dfn
element, whichever occurs first. In this example, the UA would
provide a user interacting with the term "class" contained in the
span element in the second paragraph a mechanism to view the entire
first paragraph (perhaps by scrolling to that p element or through an
inspection panel).
<p>A <dfn>class</dfn> is a factory for objects. It defines all of
the methods for accessing the instance variable data for an object as
well as defining the constructor that stamps out new instances of the
object represented by this class.</p>
<!-- ... later in the document: -->
<p>For a <span title='class' >class</span> to be highly reusable,
developers should focus
their attention on opportunities for polymorphism.</p>
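The UA behaviour described in these examples (find the dfn for a
term, then present its nearest enclosing ancestor) might be sketched
like this. It is a simplification under stated assumptions: the
BLOCK_TAGS set and the function name definition_context are
hypothetical, only block ancestors are considered (not
structural-inline ones), and the term is taken from the dfn's text
content alone.

```python
import xml.etree.ElementTree as ET

# Assumed set of block-level containers; the draft's actual categories differ.
BLOCK_TAGS = {"p", "div", "li", "dd", "blockquote", "section"}


def definition_context(root, term):
    """Return the nearest block ancestor of the dfn that defines `term`."""
    # ElementTree has no parent pointers, so build a child -> parent map.
    parents = {child: parent for parent in root.iter() for child in parent}
    for dfn in root.iter("dfn"):
        if "".join(dfn.itertext()) != term:
            continue
        node = dfn
        while node in parents:
            node = parents[node]
            if node.tag in BLOCK_TAGS:
                return node
    return None


doc = ET.fromstring(
    "<body>"
    "<p>A <dfn>class</dfn> is a factory for objects.</p>"
    "<p>For a <span title='class'>class</span> to be highly reusable, "
    "developers should focus on polymorphism.</p>"
    "</body>"
)
span = doc.find(".//span")
context = definition_context(doc, span.get("title"))
print("".join(context.itertext()))
```

An interactive UA would then scroll to, or show in an inspection
panel, the element returned for the span the user is interacting
with.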
The VAR element makes mention of special meaning to @title when used
with DFN, but nothing else is said about that here. An example and
explanation would be helpful. (Is this meant to match the @title of
the defining dfn element for the variable?)
I think those changes and added examples will help make the current
draft spec clearer. However, I think we have an opportunity to make
this clearer and cleaner for authors to use and UAs to process. I
turn toward that proposal now.
Defining terms, abbreviations and
variables (DEFINE, TERM, ABBR, VAR) proposed changes:
----------------------------------
I like the more elaborate document conformance criteria the HTML5
draft provides for DFN and ABBR. However, right now DFN either
encloses text nodes that are the definition of the term or, for
example, in the case of an abbreviation, it reflects the value
of the @title attribute. I think this has made it even more difficult
for authors to keep a defining instance of a term distinct from the
definition. Also, element names are important, at the very least for
mnemonic reasons. However, the element name "dfn" is
very much a misnomer. It is used for a newly introduced term, but its
name suggests it is for the definition (presumably dfn is a shorthand
for "defining instance of a term"). This confusion over exactly
how to use a DFN element is one of the prime use-case/problem
statements for the following proposals. However, I'm also attempting
to provide a more consistent approach to defining terms,
abbreviations and variables. Finally, I want to reduce the verbosity
of the syntax required for these semantic elements.
The proposal involves the introduction of two new elements (DEFINE
and TERM) as well as several new attributes (@term, @abbr,
@casesensitive, @variantOf, @scope, @tphonetic, @aphonetic, @phonetic,
@asword, and @spelt). Basically, a DEFINE element is a definition that
defines either a term (TERM), an abbreviation (ABBR) or a variable
(VAR). This is the consistent overarching approach. The attributes
@term, @abbr, @casesensitive, @scope, and @variantOf all help to match
either a definition (DEFINE) with its term, abbreviation or variable
or alternatively match a term (DFN) with its abbreviation (ABBR).
Since these attributes only appear once on the definition (DEFINE or
DFN), the terms, abbreviations and variables can be used any number
of times without adding needless repetitive syntax.
Because of the tightly integrated way these elements would work
together within this proposal (DEFINE, DFN, TERM, ABBR, and VAR), it
might make sense to introduce them all together before describing
each one individually.
/some proposed language
The definition element provides a mechanism to establish a precise
and elaborate definition and associate it with a term (TERM),
abbreviation (ABBR), or variable (VAR). Authors need only use the
DEFINE element to define a term, abbreviation or variable. UAs will
provide a mechanism to indicate the definition of a term throughout
the rest of the document (e.g., by presenting a tooltip or changing
the status bar when the pointing device hovers over a term within
scope). By including the @scope attribute, the scope of the
definition is restricted to the sectioning element (implicit? or
explicit) identified by the @scope attribute. For example, the @scope
attribute would facilitate a document containing multiple essays by
several authors all addressing the meaning of the same term.
Authors must include either the @term attribute or the @abbr
attribute on the DEFINE element to indicate what term/variable or
abbreviation the definition refers to respectively. The inclusion of
the casesensitive boolean attribute indicates that the matching of
the @term or @abbr attributes' value to the contents of DFN, TERM,
VAR and ABBR elements must be done in a case-sensitive manner.
Authors may use CSS or another styling mechanism to display only the
value of the @term attribute rather than the elaborated definition
(for example, to leave the elaborated definition for an automatically
generated glossary).
UA conformance: If the @casesensitive attribute is false, UAs must
match the DEFINE element's @term and @abbr attribute values in a
case-insensitive manner within the scope of the DEFINE element. If the
@casesensitive attribute is true, UAs must match the DEFINE element's
@term and @abbr attribute values in a case-sensitive manner within
the scope of the DEFINE element. UAs should provide a mechanism to
let users quickly view the definition for a term, variable or
abbreviation while maintaining presentation of the current context of
the term (e.g., by presenting a tooltip or changing the status bar
when the pointing device hovers over a term within scope, or providing
an inspector panel).
/end some proposed language
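The case-sensitivity rule in the UA conformance text could be
sketched as below. The helper names is_casesensitive and
term_matches are hypothetical, attributes are modelled as a plain
dict, and str.casefold() is only one assumed reading of
"case-insensitive"; the proposal does not say how non-ASCII text
should be compared.

```python
def is_casesensitive(attrs):
    """A boolean attribute is true when present, e.g. casesensitive='casesensitive'."""
    return "casesensitive" in attrs


def term_matches(define_attrs, content):
    """Compare a DEFINE element's @term or @abbr value with element content."""
    candidates = [define_attrs[name] for name in ("term", "abbr") if name in define_attrs]
    if is_casesensitive(define_attrs):
        return content in candidates
    # casefold() is an assumed interpretation of case-insensitive matching.
    return content.casefold() in [value.casefold() for value in candidates]


x_def = {"term": "x", "casesensitive": "casesensitive"}
class_def = {"term": "class"}
print(term_matches(x_def, "X"))          # False: variable names differ by case
print(term_matches(class_def, "Class"))  # True: prose terms ignore case
```

Case-sensitive matching matters most for variables, where x and X
may name different things; ordinary prose terms usually want the
case-insensitive default.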
DEFINE
----------------------------------
Consider adding a DEFINE element to provide greater structure to a
definition. In this way an author can define several terms in the
same paragraph or provide other exposition related to a newly
introduced term, that might not be considered part of the definition
proper.
So the primary use-case for proposing a new element is to provide
more explicit structure to a definition so that the UA or the user
does not have to interpret what part of the author's prose is the
definition. It also helps simplify the matching of definitions with
their terms, variables and abbreviations. The structural element
containing a defining instance of a term may also contain other
related terms or other prose that is not properly part of the term's
definition. The DEFINE element structures the definition in an
unambiguous way.
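Because each DEFINE element delimits exactly the definition text, a
glossary can be generated mechanically. The sketch below is a rough
illustration, assuming well-formed XML input; the function name
build_glossary and the sample document are hypothetical.

```python
import xml.etree.ElementTree as ET


def build_glossary(root):
    """Map each DEFINE element's @term (or @abbr) to its definition text."""
    glossary = {}
    for define in root.iter("define"):
        key = define.get("term") or define.get("abbr")
        if key is None:
            continue  # authors must supply @term or @abbr; skip invalid markup
        # Collapse runs of whitespace left over from source line breaks.
        glossary[key] = " ".join("".join(define.itertext()).split())
    # Sort alphabetically, ignoring case, as a printed glossary would.
    return dict(sorted(glossary.items(), key=lambda item: item[0].casefold()))


doc = ET.fromstring(
    "<body>"
    "<p><define term='service'>A <dfn>service</dfn> is a daemon running\n"
    "on a host.</define> Services are installed by our staff.</p>"
    "<p><define term='class'>A <dfn>class</dfn> is a factory for "
    "objects.</define></p>"
    "</body>"
)
for term, definition in build_glossary(doc).items():
    print(f"{term}: {definition}")
```

Note that the sentence about staff, which sits outside the DEFINE
element, does not leak into the glossary entry.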
DEFINE (definition), includes all of the
| information needed for
| matching terms (TERM or DFN),
| abbreviations (ABBR) or
| variables (VAR)
^ --------- TERM or VAR or
^ --------- DFN (abbreviation defining term) includes all the
| | information needed for
| @abbr | matching abbreviations
^ --------- ^ ------ abbreviation (ABBR)
The define element is a structural inline or strictly inline level
element.
Content model: strictly-inline or structure-inline level elements
Element-specific attributes:
Matching Attributes:
@term (string; required; the term or variable defined in the DEFINE
element and matched with all occurrences of DFN, VAR or TERM)
@abbr (string; required; indicates an abbreviation for
the specified term contents of the term element)
@casesensitive (boolean; indicates
that the string value of the @term attribute
is matched case-sensitively with the contents
of TERM, ABBR or VAR elements)
@scope (integer; indicates the definition only applies to
the current sectioning (1) or the parent section (2) or
higher level ancestor sections (n) and all descendant
sections of that scope section)
Pronunciation Attributes:
@tphonetic (string; provides a pronunciation hint
for the term in @term, using Unicode
phonetic characters[1] )
@aphonetic (string; provides a pronunciation hint
for the abbreviation in @abbr, using Unicode
phonetic characters[1] )
@spelt (boolean; indicates the @abbr value should be spelled out:
to assist with pronunciation)
@asword (boolean; indicates the @abbr value should be pronounced
as a word rather than being spelled out)
The first four attributes provide most of the information the UA
needs to match the definition with all associated instances of the
term (TERM or DFN), abbreviation (ABBR), or variable (VAR) within the
scope of the definition. Though the name "term" is used on the DEFINE
element, the definition can be associated with either a term or a
variable. The @tphonetic attribute provides phonetic information for
end-users or speaking UAs to pronounce the associated @term value.
The @aphonetic attribute does the same for the associated
abbreviation, which is established either directly, by setting @abbr
on the DEFINE element, or indirectly, by associating the DEFINE
element with a defining instance (DFN) whose @abbr attribute points
to the abbreviation.
Alternatively, authors can use @asword or @spelt to help provide
phonetic information. When the boolean @asword is true, a speaking
UA should attempt to pronounce the abbreviation as if it were a word
(e.g., with "CentCom" or "UNICEF"). The attribute @spelt provides a
hint that the abbreviation is to be spelled out (e.g., with "HTML" or
"XML"). It is an error for authors to set both @asword and @spelt to
true. UAs might make use of stylesheets, including user stylesheets, to
access pronunciation dictionaries that override the author's
pronunciation. For example, <dfn spelt='spelt'>SQL</dfn> might be
overridden by a user stylesheet to be spoken as "sequel".
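The integer form of @scope listed among the matching attributes
(1 for the current section, 2 for its parent, and so on) might be
resolved as in this sketch. Several things here are assumptions: the
SECTIONING tag set, the helper names scope_root and in_scope, and the
choice to treat a missing @scope as "whole document".

```python
import xml.etree.ElementTree as ET

# Assumed sectioning elements; implicit sections are not handled here.
SECTIONING = {"section", "article", "body"}


def scope_root(root, define):
    """Return the element whose subtree the definition applies to."""
    parents = {child: parent for parent in root.iter() for child in parent}
    level = int(define.get("scope", "0"))
    if level < 1:
        return root  # assumption: no @scope means document-wide
    node = define
    while node in parents:
        node = parents[node]
        if node.tag in SECTIONING:
            level -= 1
            if level == 0:
                return node
    return root  # fewer ancestor sections than requested


def in_scope(root, define, element):
    """True when `element` lies inside the definition's scope."""
    return any(node is element for node in scope_root(root, define).iter())


doc = ET.fromstring(
    "<body>"
    "<section id='essay1'>"
    "<section id='intro'><p><define term='class' scope='2'>a social "
    "grouping</define></p></section>"
    "<section id='argument'><p><term>class</term></p></section>"
    "</section>"
    "<section id='essay2'><p><term>class</term></p></section>"
    "</body>"
)
define = doc.find(".//define")
print(scope_root(doc, define).get("id"))
print([in_scope(doc, define, t) for t in doc.iter("term")])
```

With scope='2' the definition covers the whole first essay but not
the second, which is the multiple-essays use case described for
@scope.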
Terms (DFN and TERM)
----------------------------------
Consider adding a TERM element to both replace DFN (as a defining
instance of a term) and provide an explicit element instead of using
a SPAN element. After defining a term, an author may use the TERM
element any number of times within the same document (or section or
subsection for scoped definitions). DFN would remain as a special use
of TERM to introduce a term that also defines an associated
abbreviation (ABBR).
TERM element
Element-specific attributes:
New Element-specific attributes:
Matching Attributes:
@abbr (string; required; indicates an abbreviation for
the specified term contents of the term element)
@casesensitive (boolean; indicates that the
string value of the @abbr attribute
is matched case-sensitively with the contents
of ABBR elements)
@scope (integer; indicates the definition only applies to
the current sectioning (1) or the parent section (2) or
higher level ancestor sections (n) and all descendant
sections of that scope section)
@variantOf (string; used to match the
term to the definition for variants such
as plurals: e.g., <dfn variantOf='class'>classes</dfn>)
Pronunciation Attributes:
@phonetic (string; provides a pronunciation hint for the enclosed term
using Unicode phonetic characters[1])
@aphonetic (string; provides a pronunciation hint for the @abbr
attribute
using Unicode phonetic characters[1])
@spelt (boolean; indicates the @abbr value should be spelled out:
to assist with pronunciation)
@asword (boolean; indicates the @abbr value should be pronounced
as a word rather than being spelled out)
The TERM element is a phrase strictly-inline level element.
Content model: strictly-inline
DFN
----------------------------------
Consider adding attributes to the DFN element to provide a similar
matching mechanism as that with DEFINE and TERM, but for DFN and ABBR.
New Element-specific attributes:
Matching Attributes:
@abbr (string; required; indicates an abbreviation for
the specified term contents of the term element)
@casesensitive (boolean; indicates that the
string value of the @abbr attribute
is matched case-sensitively with the contents
of ABBR elements)
@scope (integer; indicates the definition only applies to
the current sectioning (1) or the parent section (2) or
higher level ancestor sections (n) and all descendant
sections of that scope section)
@variantOf (string; used to match the
term to the definition for variants such
as plurals: e.g., <dfn variantOf='class'>classes</dfn>)
Pronunciation Attributes:
@phonetic (string; provides a pronunciation hint for the @abbr
attribute
using Unicode phonetic characters[1])
@spelt (boolean; indicates the @abbr value should be spelled out:
to assist with pronunciation)
@asword (boolean; indicates the @abbr value should be pronounced
as a word rather than being spelled out)
I do not think it would break compatibility to introduce either a
TERM or DEFINE element. Even though its default styling would
only work with an embedded/linked stylesheet or an HTML5 conforming UA
with a proper default stylesheet, the default styling would likely
not need to differ from the surrounding text. Authors could, at their
own option, style terms, variables, abbreviations, definitions or the
first instance of a term in a special way, but there is no need to by
default.
proposed language/
Authors use the TERM element to contain important terms or key words
that might be important for a document or section index in a printed
version of the document or indicate greater importance for a search
engine. For example, imagine an author who wants to write an essay on
the word "and" and the ways that word has been used throughout
English literature. Marking up <term>and</term> inside a term element
indicates that this is a word being used in a specialized way
and not in its more conventional way. As another example, consider the
use of a term like <term>service</term> where it is being used to
refer to a daemon running on a host hardware server. The term has a
more colloquial meaning, and the same document may even use the term
in this colloquial sense, as in "providing a service to our customers
by installing our <term>service</term> on their hardware."
/end proposed language
In this way authors may still use DFN for the defining instance of a
term, occurring nearby the DEFINE element while the @term attribute
on the DEFINE element would provide a mechanism to match the DEFINE
with the first DFN occurrence and all subsequent TERM occurrences.
Authors could still use the @title attribute on abbr, var, and term
elements for legacy compatibility, but for HTML5 UAs, those @title
attributes would no longer be needed for this specialized purpose.
Instead, interactive UAs would provide a mechanism to access the
definition of a term or variable or the expansion of an abbreviation
from any instance of those elements throughout the document. The
@title attribute would therefore become just like the @title
attribute on any other element: available for general use.
With the DEFINE element, authors establish a definition to be
associated with every instance of the DFN, TERM or VAR element (and
optionally every ABBR element) containing the defined term. With the
DFN element, authors establish an expanded word or phrase to be
associated with every instance of the ABBR element containing the
referenced abbreviation. Typically, the matching and phonetics for
every instance in the document will be defined once for each term or
abbreviation in a single DEFINE or DFN element. However, there are
facilities to attach phonetics, independently, to the ABBR, TERM and
VAR elements. Also, through use of the @variantOf attribute, authors
can handle variants such as plurals, possessives and other variants
and still facilitate a UA match.
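The matching described here, including the @variantOf fallback for
plurals, might look like the following sketch. The names USAGE_TAGS,
lookup_key and link_usages are hypothetical, matching is
case-sensitive for brevity, and scoping is ignored.

```python
import xml.etree.ElementTree as ET

# Elements whose content is matched against DEFINE's @term.
USAGE_TAGS = {"dfn", "term", "var", "abbr"}


def lookup_key(element):
    """Prefer @variantOf; otherwise use the element's own text content."""
    return element.get("variantOf") or "".join(element.itertext()).strip()


def link_usages(root):
    """Pair each usage's text with the defined term it resolves to."""
    defined = {d.get("term") for d in root.iter("define") if d.get("term")}
    links = []
    for element in root.iter():
        if element.tag in USAGE_TAGS:
            key = lookup_key(element)
            if key in defined:
                links.append(("".join(element.itertext()).strip(), key))
    return links


doc = ET.fromstring(
    "<body>"
    "<p><define term='class'>A <dfn>class</dfn> is a factory for "
    "objects.</define></p>"
    "<p>Reusable <term variantOf='class'>classes</term> favour "
    "polymorphism; each <term>class</term> creates an "
    "<term>object</term>.</p>"
    "</body>"
)
print(link_usages(doc))
```

The plural "classes" resolves to the definition of "class" through
@variantOf, while "object" stays unlinked because no DEFINE element
declares it.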
Abbreviation (ABBR) consider adding attributes:
----------------------------------
New Element-specific attributes:
Matching Attributes:
• @variantOf (string; used to match the
term to the definition for variants such
as plurals: e.g., <abbr variantOf='Zat'>Zats</abbr>)
Pronunciation Attributes:
@phonetic (string; provides a pronunciation hint for the @abbr
attribute
using Unicode phonetic characters[1])
@spelt (boolean; indicates the @abbr value should be spelled out:
to assist with pronunciation)
@asword (boolean; indicates the @abbr value should be pronounced
as a word rather than being spelled out)
The attributes specific to ABBR are: 1) a @spelt boolean
attribute for a pronunciation hint to spell out the letters; 2) an
@asword attribute for a
pronunciation hint to read the abbreviation as a word rather than
spelling out the letters (what some variants of English call an
acronym); 3) a @phonetic attribute to contain specific phonetic
Unicode characters or character references as a pronunciation hint
(Unicode could use some improvement on phonetic characters, but this
mechanism should work today and be forward compatible even as Unicode
improves its organization of phonetic characters)[1]. However, these
attributes can simply be used on the DFN and DEFINE elements for the
first time the abbreviation is defined. In this way, the abbreviation
can be repeated throughout the document without needing to repeat
these attributes.
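How a speaking UA might combine these three hints can be sketched as
follows. The function name speech_hint and the returned
(strategy, text) pairs are hypothetical, and giving @phonetic
precedence over the two booleans is an assumption the proposal does
not state. The error for setting both @asword and @spelt follows the
rule given earlier for DEFINE.

```python
def speech_hint(text, phonetic=None, asword=False, spelt=False):
    """Choose how to speak an abbreviation from its pronunciation attributes."""
    if asword and spelt:
        # The proposal calls setting both an authoring error.
        raise ValueError("@asword and @spelt must not both be true")
    if phonetic:
        return ("phonetic", phonetic)  # assumed to take precedence
    if spelt:
        return ("spell", " ".join(text))  # letter by letter
    if asword:
        return ("word", text)
    return ("default", text)


print(speech_hint("HTML", spelt=True))
print(speech_hint("UNICEF", asword=True))
```

A user stylesheet or pronunciation dictionary could still override
whatever this returns, as in the SQL example.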
In the current draft, the last example for ABBR especially
underscores the problem where the "term" is doubly marked up with both
DFN and ABBR and the definition has no markup at all except for the
surrounding paragraph (which is only due to the particularly
contrived example; the paragraph could easily be longer). Based on my
proposal, the definition would be marked up with the DEFINE element
and the term would be marked up with the DFN element. UAs would then
have all the information needed to match abbreviation with term with
definition.
The draft currently reads:
"In the example below, the word "Zat" is used as an abbreviation in
the second paragraph. The abbreviation is defined in the first, so
the explanatory title attribute has been omitted. Because of the way
dfn elements are defined, the second abbr element in this example
would be connected (in some UA-specific way) to the first.
<p>The <dfn><abbr>Zat</abbr></dfn>, short for Zat'ni'catel, is a
weapon.</p>
<p>Jack used a <abbr>Zat</abbr> to make the boxes of evidence
disappear.</p>"
My proposal would change this to:
<p>The <abbr>Zat</abbr>, short for <define term="Zat'ni'catel"><dfn
abbr='Zat'>Zat'ni'catel</dfn>, is a weapon.</define> Weaponry, in
general, has long been an important export industry in this region.</p>
<p>Jack used a <abbr>Zat</abbr> to make the boxes of evidence
disappear.</p>
With this approach, the association is clear and the UA is not left
to infer what the association is based on a paragraph that may
contain less or more than the precise definition. With the current
draft, the information on the weapons industry that is unrelated to
the definition of Zat would be included by UAs in the definition.
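The abbreviation-to-term-to-definition chain in the proposed Zat
markup can be followed mechanically, as in this sketch. It is
illustrative only: resolve_abbreviation is a hypothetical name, the
fragment is wrapped in a div so it parses as XML, and matching is
exact string equality.

```python
import xml.etree.ElementTree as ET

FRAGMENT = """<div>
<p>The <abbr>Zat</abbr>, short for <define term="Zat'ni'catel"><dfn
abbr='Zat'>Zat'ni'catel</dfn>, is a weapon.</define> Weaponry, in
general, has long been an important export industry in this region.</p>
<p>Jack used a <abbr>Zat</abbr> to make the boxes of evidence
disappear.</p>
</div>"""


def resolve_abbreviation(root, abbr_text):
    """Follow ABBR -> DFN (@abbr) -> DEFINE (@term); return (term, definition)."""
    for dfn in root.iter("dfn"):
        if dfn.get("abbr") != abbr_text:
            continue
        term = "".join(dfn.itertext())
        for define in root.iter("define"):
            if define.get("term") == term:
                # Collapse whitespace left over from source line breaks.
                return term, " ".join("".join(define.itertext()).split())
        return term, None  # expansion known, but no definition found
    return None, None


root = ET.fromstring(FRAGMENT)
for abbr in root.iter("abbr"):
    print(resolve_abbreviation(root, "".join(abbr.itertext())))
```

Both occurrences of the abbreviation resolve to the same expansion
and definition, and the sentence about the weapons industry is left
out because it lies outside the DEFINE element.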
Variables (VAR):
----------------------------------
I propose variable be treated much like the proposed TERM element but
specifically for use in mathematics, computer programming or formal
logic. VAR would be for the more particular use as a variable denoting
or pointing to either a specific object (however narrowly or broadly
defined) or a class of objects. In the latter case, the
distinction with TERM is subtle, but I think there are cultural
reasons to support both elements. After all, consider the term
automobile as a term that denotes the class of "all self-contained
power-plant vehicles with facilities to transport humans and cargo."
We could instead think of that as a variable for that class. However,
I think the use of variable is more specialized, often relating to
more abstract and formal objects and classes. So I think the
distinction is useful to maintain in markup.
So we should supplement our description of variable by discussing
more modern programming constructs like: "element", "attribute",
"property", "class", "instance", "object", "realnumber",
"proposition", "float", "char", "method", "function", etc. Adding a
@type attribute to variable would also help support the VAR element's
use in this way. Similarly, discussing the subtle and not so subtle
differences between VAR and TERM will help orient authors to their
proper use.
Consider adding a @type attribute to VAR so that authors can express
the precise kind of instance the variable points to. For example:
<p>In the following we treat <var>myElement</var> as always equal to
<define>the element returned by
getElementByIdNS("myID", svg)</define>.</p>
This could be useful for VAR elements that are not associated with a
DEFINE element or for when the variable is used in a more abstract
way. For contrast consider the example:
<p>Consider a circle with circumference <var type='realnumber'>x</var>
and circumscribe a circle with circumference <var
type='realnumber'>y</var> around …</p>
versus
<p>Consider a circle with circumference <var>x</var> <define term='x'
casesensitive='casesensitive'>a real number</define> and
circumscribe a circle with circumference <var
type='realnumber'>y</var>, also <define term='y'
casesensitive='casesensitive'>a real number</define>, around …</p>
The first example provides more detail than it could otherwise
provide without needing to resort to a definition. It could
alternately be marked up with:
UI element
----------------------------------
It might be useful to introduce another element for other modern
graphical constructs that could either be subsumed under VAR or
handled with another element: "view", "control", "cell", "menu",
"menuitem", "button", etc. We may want to include those among
the constructs that might be appropriately marked up with VAR, as
the VAR would be referring to a specific instance of a button or
control (on the screen, for example). Alternatively, an element such
as UI might be appropriate, also with a @type attribute taking a
QNAME. This would be an element for discussing various UI and not an
element to include UI in the HTML document (unlike INPUT or BUTTON).
Such an element would better complement the SAMP and KBD elements as a
specialized form of
QNAMES for VAR @type
----------------------------------
If HTML could provide a keyword value list for many common types of
instances used in formal logic, mathematics and computer science,
authors would have a shortcut for expressing those types. It would
eliminate the need for a definition in many cases. It could also be
used by more specialized conformance checkers that might be able to
detect errors in the use of a VAR. Types might include:
"proposition", "set", "element", "attribute", "property", "class",
"instance", "object", "realnumber", "float", "char",
"method", "function". Authors making use of one or more types could
then apply author style sheets to differentiate variables of
different types.
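A specialized conformance checker of the kind mentioned above might
start from something like this sketch. KNOWN_TYPES simply copies the
keywords suggested in this section; the function name
check_var_types and the sample paragraph (including the misspelt
'elemnt') are made up for illustration.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Keywords taken from the list suggested above; not a normative set.
KNOWN_TYPES = {
    "proposition", "set", "element", "attribute", "property", "class",
    "instance", "object", "realnumber", "float", "char", "method",
    "function",
}


def check_var_types(root):
    """Group VAR names by @type and collect unrecognized type keywords."""
    by_type = defaultdict(list)
    unknown = []
    for var in root.iter("var"):
        var_type = var.get("type")
        if var_type is None:
            continue  # untyped variables are allowed
        name = "".join(var.itertext())
        if var_type in KNOWN_TYPES:
            by_type[var_type].append(name)
        else:
            unknown.append((name, var_type))
    return dict(by_type), unknown


doc = ET.fromstring(
    "<p>Consider <var type='realnumber'>x</var>, "
    "<var type='realnumber'>y</var>, "
    "<var type='elemnt'>myElement</var> and <var>z</var>.</p>"
)
typed, problems = check_var_types(doc)
print(typed)
print(problems)
```

The same grouping could drive author style sheets that render
variables of each type differently.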
Sample code for newly proposed facilities
----------------------------------
For sample code contrasting the current draft approach with the
proposed approach, see the attached[2] example document. This document
should be well-formed and has been tested on the live DOM viewer[3].
This sample shows that the parsing works with current browsers.
Authors would have to continue to use @title for tooltip
functionality until HTML5 compliant browsers provide proper matching
between definitions and their associated terms, variables and
abbreviations.
Notes
----------------------------------
[1]: Unicode currently provides little consistent data on the
phonetic properties of characters. Perhaps W3C could liaise with
Unicode over the issue of establishing semantic phonetic Unicode
characters. Alternatively, we could provide a PHONETIC or
PRONUNCIATION element to contain SSML markup. However, Unicode could
be enhanced to support plain-text phonetics with some minor changes.
[2]: For an example of current rendering of many of these elements
see the attached DefineExample.html. This attachment demonstrates how
many of these semantics are not necessarily supported through UA
default stylesheets. They are nevertheless very useful to authors who
will tend to author with both HTML and CSS in unison.
[3]: <http://software.hixie.ch/utilities/js/live-dom-viewer/>
Attachments
- text/html attachment: DefineExample.html
Received on Monday, 23 July 2007 16:24:30 UTC