- From: Robert Burns <rob@robburns.com>
- Date: Mon, 23 Jul 2007 11:24:18 -0500
- To: public-html WG <public-html@w3.org>
- Message-Id: <F72BF1FA-BC13-4484-BD51-A4AA3250F728@robburns.com>
HIGHLIGHTS/MARK, DEFINITIONS, TERMS, ABBREVIATIONS, AND VARIABLES: M, DFN, TERM, ABBR (part of my review of 3.12 Phrase elements)

Preface
----------------------------------
My original review of these subsections included some mistakes in my reading of the current draft (along with forgetting how DFN was specified in HTML4). I encourage everyone to ignore that original review and to focus on this one instead. The ensuing dialog has been useful in reshaping this. Since my new version calls for even more sweeping changes than my last review, I'll do two things in this review. First, I'll discuss editorial changes to improve the draft for the existing approach with these elements. Second, I will elaborate on how I think introducing some new elements in favor of the old elements can improve the situation markedly.

Summary:
----------------------------------
* Propose some editorial changes to the draft to improve the exposition for the current proposals.
* Propose an alternate approach with new elements: TERM and DEFINE
* For use mostly on DEFINE, but also on DFN, TERM, ABBR and VAR, propose new attributes:
  Matching attributes
  • @term (string),
  • @abbr (string),
  • @scope (xpath),
  • @casesensitive (boolean),
  • @variantOf (string; used on terms, abbreviations, and variables for matching),
  Phonetic attributes
  • @aphonetic (string),
  • @tphonetic (string),
  • @phonetic (string),
  • @asword (boolean),
  • @spelt (boolean)

The proposed enhancements are meant to provide a more rigorous and consistent markup for the terms, abbreviations and variables used in a document. They should support auto-generation of a document index and a document glossary, as well as interactive discovery of terms, abbreviations and variables within an interactive UA.

Highlight/Mark (M):
----------------------------------
This section looks good. I have no suggestions for improvement.

Defining terms, abbreviations and variables (DFN, ABBR, VAR) for the current draft:
----------------------------------
Consider changing the current draft for DFN to:

proposed new language/
Defining term:
A) If the dfn element has a title attribute, then the exact value of that attribute is the term being defined. If the author includes a value for the title attribute to indicate the term, the element must be empty. [otherwise, we should specify what precisely the meaning of the enclosed text is]
B) Otherwise, if its only child is a single abbr element with a title attribute set to a non-empty string, then the exact string value of that attribute is the term defined.
C) Otherwise, it is the exact textContent of the dfn element that gives the term being defined.

Whether the term is contained in the title attribute of the dfn element, the title attribute of the enclosed abbr element, or the contents of the dfn element, it must only contain the term being defined. Also, there must only be one dfn element per document for each term defined (i.e. there must not be any duplicate terms).
/end proposed new language

Replacing:

replaced text in current draft/
Defining term: If the dfn element has a title attribute, then the exact value of that attribute is the term being defined. Otherwise, if it contains exactly one element child node and no child text nodes, and that child element is an abbr element with a title attribute, then the exact value of that attribute is the term being defined. Otherwise, it is the exact textContent of the dfn element that gives the term being defined.

If the title attribute of the dfn element is present, then it must only contain the term being defined.

There must only be one dfn element per document for each term defined (i.e. there must not be any duplicate terms).
/end replaced text in current draft

Consider adding the word "title" to the sentence: "If present, the __title__ attribute must only contain an expansion of the abbreviation." (for greater clarity)

Consider changing the note to read:

Note: __Except for the title attribute of a child abbr element, t__-- T--he title attribute of neither descendant nor ancestor elements affects any dfn elements.

Consider changing the first (only) example to:

In the following fragment, the term "GDO" is first defined in the first paragraph, then used in the second. A compliant UA must provide a mechanism to present the definition from the nearest surrounding structural-inline ancestor or block ancestor of the dfn element, whichever occurs first. In this example, the UA would provide a user interacting with the abbr element in the second paragraph a mechanism to view the entire first paragraph (perhaps by scrolling to that p element or through an inspection panel).

  <p>The <dfn><abbr title="Garage Door Opener">GDO</abbr></dfn> is a device that allows off-world teams to open the iris.</p>
  <!-- ... later in the document: -->
  <p>Teal'c activated his <abbr title="Garage Door Opener">GDO</abbr> and so Hammond ordered the iris to be opened.</p>

Consider providing examples for the other two states/cases mentioned:

In the following fragment, the term "fundamental class process" is defined in the first paragraph, then used in the second. A compliant UA must provide a mechanism to present the definition from the nearest surrounding structural-inline ancestor or block ancestor of the dfn element, whichever occurs first. In this example, the UA would provide a user interacting with the phrase "fundamental class process" contained in the span element in the second paragraph a mechanism to view the entire first paragraph (perhaps by scrolling to that p element or through an inspection panel).

  <p>A <dfn>fundamental class process</dfn> is a process of performing and appropriating surplus labor. It occurs at the point of production between a performer of surplus labor and the appropriator of that surplus labor.</p>
  <!-- ... later in the document: -->
  <p>… The <span title='fundamental class process'>fundamental class process</span> is therefore a condition of existence for the subsumed class process. …</p>

In the following fragment, the term "class" is defined in the first paragraph, then used in the second. A compliant UA must provide a mechanism to present the definition from the nearest surrounding structural-inline ancestor or block ancestor of the dfn element, whichever occurs first. In this example, the UA would provide a user interacting with the term "class" contained in the span element in the second paragraph a mechanism to view the entire first paragraph (perhaps by scrolling to that p element or through an inspection panel).

  <p>A <dfn>class</dfn> is a factory for objects. It defines all of the methods for accessing the instance variable data for an object as well as defining the constructor that stamps out new instances of the object represented by this class.</p>
  <!-- ... later in the document: -->
  <p>For a <span title='class'>class</span> to be highly reusable, developers should focus their attention on opportunities for polymorphism.</p>

The VAR element mentions a special meaning for @title when used with DFN, but nothing else is said about that here. An example and explanation would be helpful. (Is this meant to match the @title of the defining dfn element for the variable?)

I think those changes and added examples will help make the current draft spec clearer. However, I think we have an opportunity to make this clearer and cleaner for authors to use and UAs to process. I turn toward that proposal now.

Defining terms, abbreviations and variables (DEFINE, TERM, ABBR, VAR) proposed changes:
----------------------------------
I like the more elaborate document conformance criteria the HTML5 draft provides for DFN and ABBR.
However, right now DFN either encloses text nodes that are the definition of the term or, for example, in the case of an abbreviation, it reflects the value of the @title attribute. I think it has been even more difficult for authors to keep a defining instance of a term distinct from the definition itself. Also, element names are important, at the very least for mnemonic reasons. However, the element name "dfn" is very much a misnomer. It is used for a newly introduced term, but its name suggests it is for the definition (presumably dfn is shorthand for "defining instance of a term"). This confusion over exactly how to use a DFN element is one of the prime use-case/problem statements for the following proposals. However, I'm also attempting to provide a more consistent approach to defining terms, abbreviations and variables. Finally, I want to reduce the verbosity of the syntax required for these semantic elements.

The proposal involves the introduction of two new elements (DEFINE and TERM) as well as several new attributes (@term, @abbr, @casesensitive, @variantOf, @scope, @tphonetic, @aphonetic, @phonetic, @asword, and @spelt). Basically, a DEFINE element is a definition that defines either a term (TERM), an abbreviation (ABBR) or a variable (VAR). This is the consistent overarching approach. The attributes @term, @abbr, @casesensitive, @scope, and @variantOf all help to match either a definition (DEFINE) with its term, abbreviation or variable, or alternatively a term (DFN) with its abbreviation (ABBR). Since these attributes only appear once, on the definition (DEFINE or DFN), the terms, abbreviations and variables can be used any number of times without adding needlessly repetitive syntax.

Because of the tightly integrated way these elements (DEFINE, DFN, TERM, ABBR, and VAR) would work together within this proposal, it might make sense to introduce them all together before describing each one individually.

/some proposed language
The definition element provides a mechanism to establish a precise and elaborate definition and associate it with a term (TERM), abbreviation (ABBR), or variable (VAR). Authors need only use the DEFINE element to define a term, abbreviation or variable. UAs will provide a mechanism to indicate the definition of a term throughout the rest of the document (e.g., by presenting a tooltip or changing the status bar when the pointing device hovers over a term within scope). By including the @scope attribute, the scope of the definition is restricted to the sectioning element (implicit? or explicit) defined by the @scope attribute. For example, the @scope attribute would facilitate a document containing multiple essays by several authors all addressing the meaning of the same term.

Authors must include either the @term attribute or the @abbr attribute on the DEFINE element to indicate what term/variable or abbreviation, respectively, the definition refers to. The inclusion of the casesensitive boolean attribute indicates that the matching of the @term or @abbr attributes' value to the contents of DFN, TERM, VAR and ABBR elements must be done in a case-sensitive manner. Authors may use CSS or another styling mechanism to display only the value of the @term attribute rather than the elaborated definition (for example, to leave the elaborated definition for an automatically generated glossary).

UA conformance:
If the @casesensitive attribute is false, UAs must match the DEFINE element's @term and @abbr attribute values in a case-insensitive manner within the scope of the DEFINE element. If the @casesensitive attribute is true, UAs must match the DEFINE element's @term and @abbr attributes' values in a case-sensitive manner within the scope of the DEFINE element. UAs should provide a mechanism to let users quickly view the definition for a term, variable or abbreviation while maintaining presentation of the current context of the term (e.g., by presenting a tooltip or changing the status bar when the pointing device hovers over a term within scope, or by providing an inspector panel).
/end some proposed language

DEFINE
----------------------------------
Consider adding a DEFINE element to provide greater structure to a definition. In this way an author can define several terms in the same paragraph, or provide other exposition related to a newly introduced term that might not be considered part of the definition proper. So the primary use-case for proposing a new element is to provide more explicit structure to a definition, so that the UA or the user does not have to interpret what part of the author's prose is the definition. It also helps simplify the matching of definitions with their terms, variables and abbreviations.

The structural element containing a defining instance of a term may also contain other related terms or other prose that is not properly part of the term's definition. The DEFINE element structures the definition in an unambiguous way.

  DEFINE (definition): includes all of the information needed for
    |                  matching terms (TERM or DFN), abbreviations (ABBR)
    |                  or variables (VAR)
    ^--------- TERM or VAR
    ^--------- DFN (abbreviation-defining term): includes all the
    |            |    information needed for matching abbreviations
    | @abbr      |
    ^---------   ^------ abbreviation (ABBR)

The define element is a structural inline or strictly inline level element.

Content model: strictly-inline or structure-inline level elements

Element-specific attributes:

Matching Attributes:
  @term (string; required; the term or variable defined in the DEFINE element and matched with all occurrences of DFN, VAR or TERM)
  @abbr (string; required; indicates an abbreviation for the specified term)
  @casesensitive (boolean; indicates that the string value of the @term attribute is matched case-sensitively with the contents of TERM, ABBR or VAR elements)
  @scope (integer; indicates the definition only applies to the current section (1), the parent section (2) or a higher-level ancestor section (n), and all descendant sections of that scope section)

Pronunciation Attributes:
  @tphonetic (string; provides a pronunciation hint for the term in @term, using Unicode phonetic characters[1])
  @aphonetic (string; provides a pronunciation hint for the abbreviation in @abbr, using Unicode phonetic characters[1])
  @spelt (boolean; indicates the @abbr value should be spelled out, to assist with pronunciation)
  @asword (boolean; indicates the @abbr value should be pronounced as a word rather than being spelled out)

The first four attributes provide most of the information the UA needs to match the definition with all associated instances of the term (TERM or DFN), abbreviation (ABBR), or variable (VAR) within the scope of the definition. The @tphonetic attribute provides phonetic information for end-users or speaking UAs to pronounce the associated @term value. Though the name "term" is used on the DEFINE element, this definition can be associated with either a term or a variable. The @aphonetic attribute provides phonetic information for the associated abbreviation (either directly, by setting @abbr on the DEFINE element, or indirectly, by associating a DEFINE element with a DFN element, the abbreviation-defining instance, whose @abbr points to an abbreviation).

The @tphonetic and @aphonetic attributes provide phonetic information for end-users or speaking UAs to pronounce the associated @term or @abbr values respectively. Alternatively, authors can use @asword or @spelt to help provide phonetic information. When the boolean @asword is true, a speaking UA should attempt to pronounce the abbreviation as if it were a word (e.g., with "CentCom" or "UNICEF"). The attribute @spelt provides a hint that the abbreviation is to be spelled out (e.g., with "HTML" or "XML"). It is an error for authors to set both @asword and @spelt to true. UAs might make use of stylesheets, including user stylesheets, to access pronunciation dictionaries that override the author's pronunciation. For example, <dfn spelt='spelt'>SQL</dfn> might be overridden by a user stylesheet to be spoken as "sequel".

Terms (DFN and TERM)
----------------------------------
Consider adding a TERM element, both to replace DFN (as a defining instance of a term) and to provide an explicit element instead of using a SPAN element. After defining a term, an author may use the TERM element any number of times within the same document (or section or subsection, for scoped definitions). DFN would remain as a special use of TERM: to introduce a term that also defines an associated abbreviation (ABBR).

TERM element

New Element-specific attributes:

Matching Attributes:
  @abbr (string; required; indicates an abbreviation for the term contained in the term element)
  @casesensitive (boolean; indicates that the string value of the @abbr attribute is matched case-sensitively with the contents of ABBR elements)
  @scope (integer; indicates the definition only applies to the current section (1), the parent section (2) or a higher-level ancestor section (n), and all descendant sections of that scope section)
  @variantOf (string; used to match the term to the definition for variants such as plurals: e.g., <dfn variantOf='class'>classes</dfn>)

Pronunciation Attributes:
  @phonetic (string; provides a pronunciation hint for the enclosed term using Unicode phonetic characters[1])
  @aphonetic (string; provides a pronunciation hint for the @abbr attribute using Unicode phonetic characters[1])
  @spelt (boolean; indicates the @abbr value should be spelled out, to assist with pronunciation)
  @asword (boolean; indicates the @abbr value should be pronounced as a word rather than being spelled out)

The TERM element is a phrase strictly-inline level element.

Content model: strictly-inline

DFN
----------------------------------
Consider adding attributes to the DFN element to provide a matching mechanism similar to that of DEFINE and TERM, but for DFN and ABBR.

New Element-specific attributes:

Matching Attributes:
  @abbr (string; required; indicates an abbreviation for the term contained in the dfn element)
  @casesensitive (boolean; indicates that the string value of the @abbr attribute is matched case-sensitively with the contents of ABBR elements)
  @scope (integer; indicates the definition only applies to the current section (1), the parent section (2) or a higher-level ancestor section (n), and all descendant sections of that scope section)
  @variantOf (string; used to match the term to the definition for variants such as plurals: e.g., <dfn variantOf='class'>classes</dfn>)

Pronunciation Attributes:
  @phonetic (string; provides a pronunciation hint for the @abbr attribute using Unicode phonetic characters[1])
  @spelt (boolean; indicates the @abbr value should be spelled out, to assist with pronunciation)
  @asword (boolean; indicates the @abbr value should be pronounced as a word rather than being spelled out)

I do not think it would break compatibility to introduce either a TERM or a DEFINE element: even though its default styling would only work with an embedded/linked stylesheet or in an HTML5 conforming UA with a proper default stylesheet, the default styling would likely not need to differ from the surrounding text. Authors could, at their own option, style terms, variables, abbreviations, definitions or the first instance of a term in a special way, but there is no need to do so by default.

proposed language/
Authors use the TERM element to contain important terms or key words that might be important for a document or section index in a printed version of the document, or to indicate greater importance for a search engine. For example, imagine an author who wants to write an essay on the word "and" and the ways that word has been used throughout English literature. Marking it up as <term>and</term> indicates that this is a word being used in a specialized way and not in its more conventional way. As another example, consider the use of a term like <term>service</term> where it is being used to refer to a daemon running on a host hardware server. The term has a more colloquial meaning, and the same document may even use the term in this colloquial sense, as in "providing a service to our customers by installing our <term>service</term> on their hardware."
/end proposed language

In this way authors may still use DFN for the defining instance of a term, occurring near the DEFINE element, while the @term attribute on the DEFINE element would provide a mechanism to match the DEFINE with the first DFN occurrence and all subsequent TERM occurrences.

Authors could still use the @title attribute on abbr, var, and term elements for legacy compatibility, but for HTML5 UAs those @title attributes would no longer be needed for this specialized purpose. Instead, interactive UAs would provide a mechanism to access the definition of a term or variable, or the expansion of an abbreviation, from any instance of those elements throughout the document. The @title attribute would therefore become just like the @title attribute on any other element: available for general use.

With the DEFINE element, authors establish a definition to be associated with every instance of the DFN, TERM or VAR element (and optionally every ABBR element) containing the defined term. With the DFN element, authors establish an expanded word or phrase to be associated with every instance of the ABBR element containing the referenced abbreviation. Typically, the matching and phonetics for every instance in the document will be defined once for each term or abbreviation in a single DEFINE or DFN element. However, there are facilities to attach phonetics independently to the ABBR, TERM and VAR elements. Also, through use of the @variantOf attribute, authors can handle plurals, possessives and other variants and still facilitate a UA match.
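
To make the matching described above a little more concrete, here is a rough sketch in Python of how a UA might compare a DEFINE element's @term or @abbr value with the contents of TERM, DFN, VAR and ABBR elements, honoring @casesensitive and @variantOf. This is only an illustration under my own assumptions, not proposed spec text: the Define and Usage classes, the matches function and the whitespace normalization are all hypothetical names and choices made for the sketch, and @scope is left out entirely.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Define:
    """Hypothetical model of a DEFINE element's matching attributes."""
    term: Optional[str] = None      # value of @term
    abbr: Optional[str] = None      # value of @abbr
    casesensitive: bool = False     # boolean @casesensitive


@dataclass
class Usage:
    """Hypothetical model of a TERM, DFN, VAR or ABBR occurrence."""
    element: str                    # "term", "dfn", "var" or "abbr"
    text: str                       # textContent of the element
    variant_of: Optional[str] = None  # value of @variantOf, if present


def _key(value: str, casesensitive: bool) -> str:
    # Assumption: runs of whitespace are collapsed before comparing.
    value = " ".join(value.split())
    return value if casesensitive else value.casefold()


def matches(define: Define, usage: Usage) -> bool:
    """Return True if the usage should be associated with the definition."""
    # @variantOf, when present, stands in for the element's own contents.
    candidate = usage.variant_of if usage.variant_of is not None else usage.text
    # ABBR occurrences are compared with @abbr, everything else with @term.
    target = define.abbr if usage.element == "abbr" else define.term
    if target is None:
        return False
    return _key(candidate, define.casesensitive) == _key(target, define.casesensitive)
```

With this sketch, a plural such as <term variantOf='class'>classes</term> would still match a definition whose @term is "class", while a case-sensitive definition of the variable "x" would not match <var>X</var>.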

Abbreviation (ABBR) consider adding attributes:
----------------------------------
New Element-specific attributes:

Matching Attributes:
  • @variantOf (string; used to match the term to the definition for variants such as plurals: e.g., <abbr variantOf='Zat'>Zats</abbr>)

Pronunciation Attributes:
  @phonetic (string; provides a pronunciation hint for the @abbr attribute using Unicode phonetic characters[1])
  @spelt (boolean; indicates the @abbr value should be spelled out, to assist with pronunciation)
  @asword (boolean; indicates the @abbr value should be pronounced as a word rather than being spelled out)

The attributes specific to ABBR are:
1) a @spelt boolean attribute as a pronunciation hint to spell out the letters.
2) an @asword attribute as a pronunciation hint to read the abbreviation as a word rather than spelling out the letters (what some variants of English call an acronym).
3) a @phonetic attribute to contain specific phonetic Unicode characters or character references as a pronunciation hint (Unicode could use some improvement on phonetic characters, but this mechanism should work today and be forward compatible even as Unicode improves its organization of phonetic characters)[1].

However, these attributes can simply be used on the DFN and DEFINE elements the first time the abbreviation is defined. In this way, the abbreviation can be repeated throughout the document without needing to repeat these attributes.

In the current draft, the last example for ABBR especially underscores the problem: the "term" is doubly marked up with both DFN and ABBR, while the definition has no markup at all except for the surrounding paragraph (and that only because the example is particularly contrived; the paragraph could easily be longer). Based on my proposal, the definition would be marked up with the DEFINE element and the term would be marked up with the DFN element. UAs would then have all the information needed to match abbreviation with term with definition.

The draft currently reads:

"In the example below, the word "Zat" is used as an abbreviation in the second paragraph. The abbreviation is defined in the first, so the explanatory title attribute has been omitted. Because of the way dfn elements are defined, the second abbr element in this example would be connected (in some UA-specific way) to the first.

  <p>The <dfn><abbr>Zat</abbr></dfn>, short for Zat'ni'catel, is a weapon.</p>
  <p>Jack used a <abbr>Zat</abbr> to make the boxes of evidence disappear.</p>"

My proposal would change this to:

  <p>The <abbr>Zat</abbr>, short for <define term="Zat'ni'catel"><dfn abbr='Zat'>Zat'ni'catel</dfn>, is a weapon.</define> Weaponry, in general, has long been an important export industry in this region.</p>
  <p>Jack used a <abbr>Zat</abbr> to make the boxes of evidence disappear.</p>

With this approach, the association is clear and the UA is not left to infer what the association is based on a paragraph that may contain less or more than the precise definition. With the current draft, the information on the weapons industry, which is unrelated to the definition of Zat, would be included by UAs in the definition.

Variables (VAR):
----------------------------------
I propose that variable be treated much like the proposed TERM element, but specifically for use in mathematics, computer programming or formal logic. VAR would be for the more particular use as a variable denoting or pointing to either a specific object (however narrowly or broadly defined) or a class of objects. In the latter case, the distinction from TERM is subtle, but I think there are cultural reasons to support both elements. After all, consider the term automobile as a term that denotes the class of "all self-contained power-plant vehicles with facilities to transport humans and cargo." We could instead think of that as a variable for that class. However, I think the use of variable is more specialized, often relating to more abstract and formal objects and classes.
So I think the distinction is useful to maintain in markup. So we should supplement our description of variable by discussing more modern programming constructs like: "element", "attribute", "property", "class", "instance", "object", "realnumber", "proposition", "float", "char", "method", "function", etc. Adding a @type attribute to variable would also help support the VAR element's use in this way. Similarly, discussing the subtle and not so subtle differences between VAR and TERM will help orient authors to their proper use.

Consider adding a @type attribute to VAR so that authors can express the precise kind of instance the variable points to. For example:

  <p>In the following we treat <var>myElement</var> as always equal to <define>the element returned by getElementByIdNS("myID", svg)</define>.</p>

This could be useful for VAR elements that are not associated with a DEFINE element, or for when the variable is used in a more abstract way. For contrast consider the example:

  <p>Consider a circle with circumference <var type='realnumber'>x</var> and circumscribe a circle with circumference <var type='realnumber'>x</var> around …</p>

versus

  <p>Consider a circle with circumference <var>x</var> <define term='x' casesensitive='casesensitive'>a real number</define> and circumscribe a circle with circumference <var type='realnumber'>x</var>, also <define term='y' casesensitive='casesensitive'>a real number</define> around …</p>

The first example provides more detail than it could otherwise provide, without needing to resort to a definition. It could alternately be marked up with:

UI element
----------------------------------
It might be useful to introduce another element for other modern graphical constructs, which could either be subsumed under VAR or handled with a separate element: "view", "control", "cell", "menu", "menuitem", "button", etc. But we may want to include those too among the constructs that might appropriately be marked up with VAR, as the VAR would be referring to a specific instance of a button or control (on the screen, for example). An element such as UI might be appropriate, also with a @type attribute taking a QNAME. This would be an element for discussing various UI, not an element for including UI in the HTML document (unlike INPUT or BUTTON). Such an element would better complement the SAMP and KBD elements as a specialized form of

QNAMES for VAR @type
----------------------------------
If HTML could provide a keyword value list for many common types of instances used in formal logic, mathematics and computer science, authors would have a shortcut for expressing those types. It would eliminate the need for a definition in many cases. It could also be used by more specialized conformance checkers that might be able to detect errors in the use of a VAR. Types might include: "proposition", "set", "element", "attribute", "property", "class", "instance", "object", "realnumber", "float", "char", "method", "function". Authors making use of one or more types could then apply author style sheets to differentiate variables of different types.

Sample code for newly proposed facilities
----------------------------------
For sample code contrasting the current draft approach with the proposed approach, see the attached[2] example document. This document should be well-formed and has been tested on the live DOM viewer[3]. This sample shows that the parsing works with current browsers. Authors would have to continue to use @title for tooltip functionality until HTML5 compliant browsers provided proper matching between definitions and their associated terms, variables and abbreviations.

Notes
----------------------------------
[1]: Unicode currently provides little consistent data on the phonetic properties of characters. Perhaps W3C could liaise with Unicode over the issue of establishing semantic phonetic Unicode characters. Alternatively, we could provide a PHONETIC or PRONUNCIATION element to contain SSML markup. However, Unicode could be enhanced to support plain-text phonetics with some minor changes.

[2]: For an example of the current rendering of many of these elements see the attached:
This attachment demonstrates how many of these semantics are not necessarily supported through UA default stylesheets. They are nevertheless very useful to authors, who will tend to author with both HTML and CSS in unison.

[3]: <http://software.hixie.ch/utilities/js/live-dom-viewer/>
Attachments
- text/html attachment: DefineExample.html
Received on Monday, 23 July 2007 16:24:30 UTC