- From: Gregory J. Rosmaita <oedipus@hicom.net>
- Date: Mon, 2 Apr 2007 11:07:09 -0500
- To: public-html@w3.org
[composition date: 3/31/2007]

since joining the HTML WG, i've been perusing the mail archive of, as well as receiving posts from, public-html@w3.org

i read with great interest the brainstorming thread on abbreviation markup in canonical HTML which begins at:

[http://lists.w3.org/Archives/Public/public-html/2007JanMar/0119.html]

this is a topic in which i have a deep interest and one of the issues which i have been prodding the WAI PF (Protocols & Formats) WG to address for the better part of a year; so, by way of (1) adding to the previous discussion and (2) introducing myself, as dan suggested, to the WG by sharing one of the ideas that i am interested in working on, what follows are the main discussion points i have raised within the PF working group, particularly through the PF mailing list.

NOTE: whilst NOT endorsed in any way by the PF WG as a body, there was general agreement that these were the most urgent abbreviation issues which need to be addressed in canonical HTML/XHTML.

the issues i attempt to address below are issues for a wide variety of users, and have myriad implications, as has been repeatedly noted, for internationalization as well as accessibility, not to mention general usability.

the main point in any discussion about abbreviation markup is that it provides the user with a level of granularity that makes documents more accessible to ALL users; the user may choose to ignore abbreviations, or to expand them automatically or on demand, but no matter how the abbreviation markup is ultimately rendered client-side, there are certain attributes which not only enhance human understanding, but enable machine differentiation between types of abbreviations and their individual characteristics, regardless of rendering or implementation.
moreover, a more robust for/id association mechanism would allow authors to reuse expansions by pointing to an initial expansion -- or, preferably, to a site-wide

<link type="application/rdf+xml" rel="expansions.rdf" />

thus making abbreviation markup far easier for authors to implement, especially if they can do it once and forget about it, until prompted by their ATAG compliant authoring tool [note 1] to add an association between text contained in an abbreviated element and the site's global expansions list, which - ultimately - will lead to wider use of abbreviation markup, as has been the case with LINKed stylesheets...

---- Begin Proposals -----

Canonical HTML/XHTML Needs Initialism Elements

+ POINT 1: abbreviations are abbreviations are abbreviations:

<abbr title="Street">St.</abbr>
versus
<abbr title="Saint">St.</abbr>

is the classic example in english, as is Dr. - the abbreviation for both the words Doctor and Drive.

another obvious example is the french abbreviation for mademoiselle, which to my ears sounds like "mwlee" when pronounced by a screen-reader that doesn't support natural language switching on the fly, or, more often, when the lang attribute which would trigger natural language switching on the fly is missing:

<abbr lang="fr" title="Mademoiselle">Mlle.</abbr>

+ CONCLUSION 1: abbreviations are therefore needed in canonical HTML/XHTML

+ POINT 2: initialisms are initialisms are initialisms:

there is a screaming need for an IABBR element, which would subsume the acronym element of HTML 4.x and XHTML 1.x

no matter the rules governing the natural language expression of an initialism, initialisms can be sub-categorized by the following values of a REQUIRED "type" attribute -- those with a wider knowledge of non-western european languages, feel free to add to the list:

type="acronym"
type="initialism"
type="camelcase-abbr"
type="alpha-numeric"

(i'm not sure we need "alpha-numeric", but will discuss that at a later point in this draft)

IABBR would also require an "expressed-as" attribute,
for example:

expressed-as="characters" (originally, expressed-as="letters")
expressed-as="word"
expressed-as="phrase"

+ IABBR EXAMPLES:

IABBR would thus result - in its rudest form - in code such as:

<IABBR type="acronym" expressed-as="word" title="Visually Impaired Computer Users' Group">VICUG</IABBR>

or

<IABBR type="camelcase-abbr" expressed-as="word" title="SOund Navigation And Ranging">SONAR</IABBR>

or

<IABBR type="camelcase-abbr" title="HyperText Markup Language" expressed-as="characters">HTML</IABBR>

or

<IABBR type="initialism" expressed-as="characters" title="National Association for the Advancement of Colored Persons">NAACP</IABBR>

i suppose that W3C would fall under the "camelcase-abbr" typology, but am unsure - is there a need for an "alpha-numeric" type, or does changing the attribute value "letters" to "characters" cover such alpha-numeric initialisms as those illustrated by the following examples:

<IABBR type="alpha-numeric" expressed-as="characters" title="World Wide Web Consortium">W3C</IABBR>

or

<IABBR type="alpha-numeric" expressed-as="characters" title="The Minnesota Mining and Manufacturing Company">3M</IABBR>

but on the other hand, i'm not so sure about antiquated initialisms such as WWW - would one want that expressed as letters or as reflective of the title, World Wide Web? does this necessitate another value for the "expressed-as" attribute, namely, phrase?

<IABBR type="initialism" expressed-as="phrase" title="World Wide Web">WWW</IABBR>

(open question: is "phrase" a synonym for "title", which is what one wants expressed in a case such as WWW, as discussed below; if so, why not just use the value "title" instead of "phrase" when coding the "expressed-as" attribute?)
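to make the attribute rules above concrete, here is a minimal sketch, in python, of how an authoring tool or checker might test one IABBR element and choose what to hand to a speech renderer. the value lists are copied from this draft; the function names check_iabbr and spoken_form, and the policy that "characters" means letter-by-letter while "phrase" means the title, are assumptions made for illustration only, not part of any specification.

```python
# assumption: these value sets are copied from the draft above; nothing here
# is part of any published HTML or XHTML specification.
IABBR_TYPES = {"acronym", "initialism", "camelcase-abbr", "alpha-numeric"}
EXPRESSED_AS = {"characters", "word", "phrase"}
REQUIRED = ("type", "expressed-as", "title")


def check_iabbr(attrs):
    """Return a list of problems found in one IABBR element's attributes."""
    problems = [f"missing required attribute: {name}"
                for name in REQUIRED if not attrs.get(name)]
    if attrs.get("type") and attrs["type"] not in IABBR_TYPES:
        problems.append(f"unknown type: {attrs['type']}")
    if attrs.get("expressed-as") and attrs["expressed-as"] not in EXPRESSED_AS:
        problems.append(f"unknown expressed-as: {attrs['expressed-as']}")
    return problems


def spoken_form(text, attrs):
    """Pick the string a speech renderer might be given (hypothetical policy)."""
    mode = attrs["expressed-as"]
    if mode == "characters":
        return " ".join(text)   # spell it out, one character at a time
    if mode == "word":
        return text             # pronounce the abbreviation as a word
    if mode == "phrase":
        return attrs["title"]   # speak the full expansion instead
    raise ValueError(f"unknown expressed-as: {mode}")


if __name__ == "__main__":
    www = {"type": "initialism", "expressed-as": "phrase",
           "title": "World Wide Web"}
    print(check_iabbr(www), spoken_form("WWW", www))
```

under this sketch the open question about "phrase" versus "title" is only a naming question: either value would select the same branch.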
so, in summation, there would be an element IABBR which would include all known permutations of what we have, up until now, referred to as being subject to the ACRONYM element, and which would carry the REQUIRED attributes "type", "expressed-as", and "title", to semantically distinguish the type of initialism being expanded, notated, and/or pronounced/displayed.

+ OPEN QUESTIONS

1. originally "expressed" was "pronounced", but there was discussion off-line and on the 2 august 2006 telecon about adding qname or another analogous, workable solution so as to provide REAL robust pronunciation guidance WITHIN the IABBR element, and it is expected that i, janina, lisa, dave pawson and others will take the lead in contributing to this as-yet-undeveloped aspect of the IABBR element;

2. is there a need for type="camelcase" AND type="camelcase-abbr"? is SONAR a contraction of words that comprises a new single word formed from a camelcased phrase, or merely an abbreviation for "SOund Navigation And Ranging"?

+ OPEN ISSUES:

1. building more robust for/id associations for abbreviation elements

no matter what form abbreviation and/or initialism elements take in canonical HTML/XHTML, single or multiple abbreviation markup needs a strong and elastic "for"/"id" binding mechanism for reusability's (and the author's sanity's) sake.

the simplest means of strengthening the ABBR element is to use the for/id model to associate repeated instances of an ABBR, by marking the first instance with the explicit explanation, using the title attribute, as well as a unique identifier, provided by the id attribute. for subsequent repetitions of an ABBR thus defined, an author or authoring tool could use the for attribute to point at the initial expansion for that ABBR, as in the following example:

<p>
<ABBR id="a1" title="Doctor">Dr.</ABBR> Suess wrote children's books.
He lived on Suess <ABBR id="a2" title="Street">St.</ABBR>, which had been renamed in his honor; its previous name being <ABBR for="a1">Dr.</ABBR> Doolittle <ABBR id="a3" title="Drive">Dr.</ABBR>
</p>

<p>
Suess <ABBR for="a2">St.</ABBR> should not be confused with Suess <ABBR for="a3">Dr.</ABBR>, formerly <ABBR id="a4" title="Saint">St.</ABBR> Patrick's <ABBR id="a5" title="Place">Pl.</ABBR>, which is the site of <ABBR for="a4">St.</ABBR> Harold's Methodist Church, whose pastor is the <ABBR title="Reverend" id="a6">Rev.</ABBR> <ABBR for="a1">Dr.</ABBR> Paul Bunyon, author of <CITE>This Pilgrim's Progress</CITE>.
<!-- OK, you get the point; by the way, Saint Harold was Saint Patrick's younger brother -->
</p>

a similar for/id binding should be part of the IABBR element, also, so as to make sense of an article whose topic sentence is:

The ADA has released an ADA-compliance recommendation for dentists and their patients with AIDS; a recommendation that grew out of the work of the AIDS' sub-committee on safety.

in which the first instance of ADA equals "The American Dental Association" and the second, "The Americans with Disabilities Act"; whilst the first instance of AIDS expands to "Acquired Immunodeficiency Syndrome" (or, if you prefer, "Acquired immune deficiency syndrome"), and the second use of the initialism AIDS represents the "Association of Independent Dental Surgeons".

through a robust and elastic definition of the for/id mechanism to provide bindings between the abbreviated text and its gloss, an expansion associated with a particular abbreviation can not only be reused, but can also provide a means of clarification/differentiation in the case of homonymic (identically spelt or pronounced) abbreviations.
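the for/id lookup described above can be sketched in a few lines of python using only the standard library's html.parser module. this is an illustration under stated assumptions, not an implementation of any existing user agent: the "for" attribute on ABBR is the proposal made in this message and is not part of HTML 4.x, and the names AbbrCollector and resolve_abbreviations are invented for the example. nested abbreviation elements are not handled.

```python
from html.parser import HTMLParser


class AbbrCollector(HTMLParser):
    """Collects every ABBR element plus an id -> title table."""

    def __init__(self):
        super().__init__()
        self.titles = {}   # id value -> expansion taken from title
        self.uses = []     # one dict per ABBR, in document order
        self._open = None  # the ABBR currently being read, if any

    def handle_starttag(self, tag, attrs):
        if tag != "abbr":  # html.parser lowercases tag and attribute names
            return
        a = dict(attrs)
        if a.get("id") and a.get("title"):
            self.titles[a["id"]] = a["title"]
        self._open = {"text": "", "title": a.get("title"), "for": a.get("for")}

    def handle_data(self, data):
        if self._open is not None:
            self._open["text"] += data

    def handle_endtag(self, tag):
        if tag == "abbr" and self._open is not None:
            self.uses.append(self._open)
            self._open = None


def resolve_abbreviations(markup):
    """Return (abbreviation, expansion) pairs; expansion is None if unbound."""
    collector = AbbrCollector()
    collector.feed(markup)
    collector.close()
    pairs = []
    for use in collector.uses:
        expansion = use["title"]
        if expansion is None and use["for"] is not None:
            # follow the proposed for -> id binding to the first expansion
            expansion = collector.titles.get(use["for"])
        pairs.append((use["text"], expansion))
    return pairs


if __name__ == "__main__":
    sample = ('<p><ABBR id="a1" title="Doctor">Dr.</ABBR> Suess lived on Suess '
              '<ABBR id="a2" title="Street">St.</ABBR>, formerly '
              '<ABBR for="a1">Dr.</ABBR> Doolittle '
              '<ABBR id="a3" title="Drive">Dr.</ABBR></p>')
    for abbreviation, expansion in resolve_abbreviations(sample):
        print(abbreviation, "=>", expansion)
```

because the bindings are resolved only after the whole fragment has been read, a "for" that points forward to a later id resolves just as well as one that points back, and two identically spelt abbreviations stay distinct as long as each points at its own id.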
it would also facilitate a site-wide means of associating unique abbreviations with their expansions, building upon the example of using LINK to point to an RDF assertion document containing explicit bindings between expansions and the abbreviations for which they stand, thereby allowing an author to define an abbreviation once and reuse the content of the for attribute to provide expansions which could then be easily applied site-wide.

and since the assumption seems to be that the ideal model is to provide authors with a way of constructing semantically sensible markup to contain their content, it would translate into a simple interface in an authoring tool - every time ABBR is invoked for a string of text, the author could be prompted to reuse a previously defined expansion, or to provide a unique expansion, which would then be appended to the site-wide expansion resource.

gregory.

Notes:

[note 1] for more about the Authoring Tool Accessibility Guidelines, consult:
* ATAG 1.0 http://www.w3.org/TR/ATAG10
* ATAG 2.0 (Working Draft) http://www.w3.org/TR/ATAG20

---------------------------------------------------------------------
A conclusion is simply the place where someone got tired of thinking.
-- Arthur Bloc
---------------------------------------------------------------------
Gregory J. Rosmaita - Gregory.Rosmaita@gmail.com
Camera Obscura: http://www.hicom.net/~oedipus/
---------------------------------------------------------------------
Received on Monday, 2 April 2007 16:07:53 UTC