Re: abbreviations in canonical HTML (thoughts & concrete suggestions)

Awesome.

Your article is very thorough. It contains examples. It tells a story.
It explains distinctions that others have found difficult to explain and/or 
understand.
Good work.

At 11:07 AM 4/2/2007 -0500, Gregory J. Rosmaita wrote:

>[composition date: 3/31/2007]
>
>since joining the HTML WG, i've been perusing the mail archive
>of, as well as receiving posts from, public-html@w3.org
>
>i read with great interest the brainstorming thread on
>abbreviation markup in canonical HTML which begins at:
>
>[http://lists.w3.org/Archives/Public/public-html/2007JanMar/0119.html]
>
>this is a topic in which i have a deep interest and one of the
>issues which i have been prodding the WAI PF (Protocols &
>Formats) WG to address for the better part of a year; so, by
>way of:
>
>(1) adding to the previous discussion and
>
>(2) introducing myself, as dan suggested, to the WG by
>sharing one of the ideas that i am interested in working
>on.
>
>what follows are the main discussion points i have raised
>within the Protocols & Formats (PF) working group,
>particularly through the PF mailing list.
>
>NOTE: whilst NOT endorsed in any way by the PF WG as a body,
>there was general agreement that these were the most urgent
>abbreviation issues which need to be addressed in canonical
>HTML slash XHTML.
>
>the issues i attempt to address below are issues for a wide
>variety of users, and have myriad implications, as has been
>repeatedly noted, for internationalization as well as
>accessibility, not to mention general usability.
>
>the main point in any discussion about abbreviation markup
>is that it provides the user with a level of granularity
>that makes documents more accessible to ALL users; the
>user may choose to ignore them, expand them automatically
>or on demand, but no matter how the abbreviation markup
>is ultimately rendered client-side, there are certain
>attributes which not only enhance human understanding, but
>enable machine differentiation between types of
>abbreviations and their individual characteristics,
>regardless of rendering or implementation.  moreover, a
>more robust for/id association mechanism would allow
>authors to reuse expansions by pointing to an initial
>expansion -- or, preferably, a site-wide
>
><link type="application/rdf+xml" rel="expansions.rdf" />
>
>thus making it far easier for authors to implement,
>especially if they can do it once and forget about it,
>until prompted by their ATAG compliant authoring tool
>[note 1] to add an association between text contained
>in an abbreviated element, and the site's global
>expansions list, which - ultimately - will lead to
>their wider use, as has been the case with LINKed
>stylesheets...
>
>---- Begin Proposals -----
>Canonical HTML/XHTML Needs Initialism Elements
>
>+ POINT 1: abbreviations are abbreviations are abbreviations:
>
><abbr title="Street">St.</abbr>
>versus
><abbr title="Saint">St.</abbr>
>
>is the classic example in english, as is Dr. - the abbreviation for
>both the words Doctor and Drive.
>
>another obvious example is the french abbreviation for mademoiselle,
>which to my ears sounds like "mwlee" when pronounced by
>a screen-reader that doesn't support natural language switching on
>the fly, or more often, due to the lack of a lang attribute which
>would trigger natural language switching on the fly:
>
><abbr lang="fr" title="Mademoiselle">Mlle.</abbr>
>
>+ CONCLUSION 1: abbreviations are therefore needed in canonical
>HTML/XHTML
>
>
>+ POINT 2: initialisms are initialisms are initialisms:
>
>there is a screaming need for an IABBR element, which would subsume
>the acronym element of HTML 4.x and XHTML 1.x
>
>no matter the rules governing the natural language expression of
>an initialism, initialisms can be sub-categorized by the following
>values of a REQUIRED "type" attribute -- those with a wider knowledge
>of non-western european languages, feel free to add to the list:
>
>type="acronym"
>type="initialism"
>type="camelcase-abbr"
>type="alpha-numeric"
>
>(i'm not sure we need "alpha-numeric", but will discuss that at a
>later point in this draft)
>
>IABBR would also require an "expressed-as" attribute, for example:
>
>expressed-as="characters" (originally, expressed-as="letters")
>expressed-as="word"
>expressed-as="phrase"
>
>
>+ IABBR EXAMPLES:
>
>IABBR would thus result - in its rudest form - in code such as:
>
><IABBR type="acronym" expressed-as="word"
>title="Visually Impaired Computer Users' Group"
> >VICUG</IABBR>
>
>or
>
><IABBR type="camelcase-abbr" expressed-as="word"
>title="SOund Navigation And Ranging">SONAR</IABBR>
>
>or
>
><IABBR type="camelcase-abbr" title="HyperText Markup Language"
>expressed-as="characters">HTML</IABBR>
>
>or
>
><IABBR type="initialism" expressed-as="characters"
>title="National Association for the Advancement of Colored Persons"
> >NAACP</IABBR>
>
>i suppose that W3C would fall under the "camelcase-abbr" typology,
>but am unsure - is there a need for an "alpha-numeric" type, or does
>changing the attribute name "letters" to "characters" cover such
>alpha-numeric initialisms as illustrated by the following example:
>
><IABBR type="alpha-numeric" expressed-as="characters"
>title="World Wide Web Consortium">W3C</IABBR>
>
>or
>
><IABBR type="alpha-numeric" expressed-as="characters"
>title="The Minnesota Mining and Manufacturing Company"
> >3M</IABBR>
>
>but on the other hand, i'm not so sure about antiquated initialisms
>such as WWW - would one want that expressed as letters or as
>reflective of the title, World Wide Web?  does this necessitate
>another value for the "expressed-as" attribute, namely, phrase?
>
><IABBR type="initialism" expressed-as="phrase"
>title="World Wide Web">WWW</IABBR>
>
>(open question: is "phrase" a synonym for "title", which is what one
>wants expressed in a case such as WWW, as discussed below; if so,
>why not just use the value "title" for "phrase" when coding the
>"expressed-as" attribute?)
>
>so, in summation, there would be an element IABBR which would include
>all known permutations of what we have, up until now, referred to as
>being subject to the ACRONYM element, which would contain
>REQUIRED attributes, "type", "expressed-as", and "title", to
>semantically distinguish the type of initialism being expanded,
>notated, and slash or pronounced slash displayed.
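
The three REQUIRED attributes summarized above are small enough to check mechanically. Here is a minimal Python sketch, assuming the draft values listed in this message are the complete sets. The name check_iabbr is hypothetical, and none of this is an agreed part of any specification.

```python
# Draft value sets taken from the proposal above; assumed complete for this sketch.
TYPES = {"acronym", "initialism", "camelcase-abbr", "alpha-numeric"}
EXPRESSED_AS = {"characters", "word", "phrase"}


def check_iabbr(attrs):
    """Return a list of problems with one IABBR's attributes (empty if none)."""
    problems = []
    # All three attributes are REQUIRED in the proposal.
    for name in ("type", "expressed-as", "title"):
        if not attrs.get(name):
            problems.append(f"missing required attribute: {name}")
    if attrs.get("type") and attrs["type"] not in TYPES:
        problems.append(f"unknown type: {attrs['type']}")
    if attrs.get("expressed-as") and attrs["expressed-as"] not in EXPRESSED_AS:
        problems.append(f"unknown expressed-as: {attrs['expressed-as']}")
    return problems


print(check_iabbr({"type": "initialism", "expressed-as": "letters"}))
```

An authoring tool could run a check like this each time IABBR is inserted and prompt the author for whatever is missing.
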
>
>
>+ OPEN QUESTIONS
>
>1. originally "expressed" was "pronounced", but there was discussion
>off-line and on the 2 august 2006 telecon about adding
>qname or another analogous, workable solution so as to provide
>REAL robust pronunciation guidance WITHIN the IABBR element,
>and it is expected that i, janina, lisa, dave pawson and others will
>take the lead in contributing to this as-yet-undeveloped aspect of
>the IABBR element;
>
>2. is there a need for type="camelcase" AND type="camelcase-abbr"?
>is SONAR a contraction of words that comprise a new single word
>formed of a camelcased phrase, or merely an abbreviation for
>"SOund Navigation And Ranging"?
>
>+ OPEN ISSUES:
>
>1. building more robust for/id associations for abbreviation
>elements
>
>no matter what form abbreviation and/or initialism elements take
>in canonical HTML/XHTML, single or multiple abbreviation markup
>needs a strong and elastic "for" slash "id" binding mechanism for
>reusability's (and the author's sanity's) sake.
>
>the simplest means of strengthening the ABBR element is to use
>the for/id model to associate repeated instances of an ABBR, by
>marking the first instance with the explicit explanation, using the
>title attribute, as well as a unique identifier, provided by the
>id attribute.  subsequent repetitions of an ABBR thus defined
>would allow an author or authoring tool to use the for attribute
>to point at the initial expansion for that ABBR, as in the
>following example:
>
>
><p>
><ABBR id="a1" title="Doctor">Dr.</ABBR> Suess
>wrote children's books.  He lived on Suess
><ABBR id="a2" title="Street">St.</ABBR>, which
>had been renamed in his honor; its previous name
>being <ABBR for="a1">Dr.</ABBR> Doolittle <ABBR
>id="a3" title="Drive">Dr.</ABBR>
></p>
>
><p>
>Suess <ABBR for="a2">St.</ABBR> should not be
>confused with Suess <ABBR for="a3">Dr.</ABBR>,
>formerly <ABBR id="a4" title="Saint">St.</ABBR>
>Patrick's <ABBR id="a5" title="Place">Pl.</ABBR>,
>which is the site of <ABBR for="a4">St.</ABBR>
>Harold's Methodist Church, whose pastor is the
><ABBR title="Reverend" id="a6">Rev.</ABBR>
><ABBR for="a1">Dr.</ABBR> Paul Bunyon, author
>of <CITE>This Pilgrim's Progress</CITE>.
>
><!-- OK, you get the point;
>     by the way, Saint Harold was Saint Patrick's younger brother -->
></p>
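
To make the binding concrete, here is a minimal Python sketch of how a user agent or authoring tool might resolve it. It assumes that "for" on ABBR simply names the id of the ABBR carrying the title; that attribute is only proposed here and is not part of HTML 4.x. The names AbbrResolver and resolve are hypothetical; the parser is Python's standard html.parser.

```python
from html.parser import HTMLParser


class AbbrResolver(HTMLParser):
    """Collect ABBR elements and the id -> title map they define."""

    def __init__(self):
        super().__init__()
        self.expansions = {}  # id -> title of the defining ABBR
        self.found = []       # [text, title-or-None, for-or-None], document order
        self._open = None     # entry for the ABBR currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "abbr":     # HTMLParser lowercases tag and attribute names
            a = dict(attrs)
            if "id" in a and "title" in a:
                self.expansions[a["id"]] = a["title"]
            self._open = ["", a.get("title"), a.get("for")]

    def handle_data(self, data):
        if self._open is not None:
            self._open[0] += data

    def handle_endtag(self, tag):
        if tag == "abbr" and self._open is not None:
            self.found.append(self._open)
            self._open = None


def resolve(markup):
    """Return (abbreviation text, expansion or None) pairs in document order."""
    parser = AbbrResolver()
    parser.feed(markup)
    parser.close()
    # Resolve only after the whole document is read, so "for" may point forwards.
    return [(text, title if title is not None else parser.expansions.get(ref))
            for text, title, ref in parser.found]


sample = ('<p><ABBR id="a1" title="Doctor">Dr.</ABBR> Suess lived on Suess '
          '<ABBR id="a3" title="Drive">Dr.</ABBR>, near <ABBR for="a1">Dr.</ABBR> '
          'Doolittle.</p>')
print(resolve(sample))
```

A "for" that points at an id with no title comes back with None, which is the case an authoring tool would want to flag.
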
>
>
>a similar for/id binding should be part of the IABBR
>element, also, so as to make sense of an article whose
>topic sentence is:
>
>The ADA has released an ADA-compliance recommendation
>for dentists and their patients with AIDS; a recommendation
>that grew out of the work of the AIDS' sub-committee on
>safety.
>
>in which the first instance of ADA equals "The American Dental
>Association", the second, "The Americans with Disabilities Act";
>whilst the first instance of AIDS expands to "Acquired
>Immunodeficiency Syndrome" (or, if you prefer, "Acquired
>immune deficiency syndrome"), and the second use of the
>initialism AIDS represents the "Association of Independent
>Dental Surgeons"
>
>
>through a robust and elastic definition of the for/id mechanism
>to provide bindings between the abbreviated text and its gloss, an
>expansion associated with a particular abbreviation can not only
>be reused, but provide a means of clarification slash
>differentiation in the case of homonymic (identically spelt or
>pronounced) abbreviations.  it would also facilitate a site-wide
>means of associating unique abbreviations with their expansion,
>building upon the example of using LINK to point to an RDF
>assertion document, containing explicit bindings between
>expansions and the abbreviations for which they stand, thereby
>allowing an author to define an abbreviation once and reuse the
>content of the for attribute to provide expansions which could
>then be easily applied site-wide.  and since the assumption seems
>to be that the ideal model is to provide authors with a way of
>constructing semantically sensible markup to contain their
>content, it would translate into a simple interface in an authoring
>tool - every time ABBR is invoked for a string of text, the author
>could be prompted to reuse a previously defined expansion, or
>provide a unique expansion, which would then be appended to the
>site-wide expansion resource.
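
The authoring-tool interaction described above could be sketched as follows. This is an in-memory Python stand-in for the site-wide expansion resource, under stated assumptions: the message does not specify the RDF format, so none is modelled, and the class name ExpansionRegistry and the "a1", "a2" id scheme are hypothetical.

```python
class ExpansionRegistry:
    """In-memory stand-in for a site-wide list of abbreviation expansions."""

    def __init__(self):
        self._by_id = {}  # id -> (abbreviation, expansion)
        self._next = 1

    def candidates(self, abbreviation):
        """Previously defined expansions the tool could offer for reuse."""
        return [(key, exp) for key, (abbr, exp) in self._by_id.items()
                if abbr == abbreviation]

    def define(self, abbreviation, expansion):
        """Return the id for this pair, appending it only if it is new."""
        for key, exp in self.candidates(abbreviation):
            if exp == expansion:
                return key
        key = f"a{self._next}"
        self._next += 1
        self._by_id[key] = (abbreviation, expansion)
        return key


registry = ExpansionRegistry()
ada_dental = registry.define("ADA", "American Dental Association")
ada_act = registry.define("ADA", "Americans with Disabilities Act")
print(registry.candidates("ADA"))
```

When the author marks up "ADA" again, the tool would list both candidates and write the chosen id into the for attribute, which keeps homonymic abbreviations such as the two ADAs apart.
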
>
>gregory.
>
>Notes:
>[note 1] for more about the Authoring Tool Accessibility Guidelines,
>consult:
>   * ATAG 1.0 http://www.w3.org/TR/ATAG10
>   * ATAG 2.0 (Working Draft) http://www.w3.org/TR/ATAG20
>
>---------------------------------------------------------------------
>A conclusion is simply the place where someone got tired of thinking.
>                                                       -- Arthur Bloc
>---------------------------------------------------------------------
>Gregory J. Rosmaita - Gregory.Rosmaita@gmail.com
>        Camera Obscura: http://www.hicom.net/~oedipus/
>---------------------------------------------------------------------

Received on Monday, 2 April 2007 17:45:39 UTC