Re: classification of Braille [signs] in Unicode from Al Gilman on 2003-09-23 (wai-xtech@w3.org from September 2003)

From: Al Gilman <asgilman@iamdigex.net>
Date: Tue, 23 Sep 2003 11:40:34 -0400
To: wai-xtech@w3.org
Message-Id: <5.1.0.14.2.20030923101854.020811b0@pop.iamdigex.net>
At 10:12 AM 2003-09-23, Al Gilman wrote:

>The following public comment issue in Unicode has come to our attention.

<quote
cite="http://www.unicode.org/review/">

   15 Changing General Category of Braille Patterns to "Letter Other"
   2003.10.27
   The UTC has received requests to change the general category of the
   Braille characters to be "Letter other" (Lo) rather than "Symbol
   other" (So), and is seeking comments and information on the Braille
   processing model and existing implementations to help with this
   decision.

   The Braille pattern symbols are encoded from U+2800 through U+28FF,
   and are discussed in the Unicode Standard 4.0, chapter 14 section
   9. The presumption until now in Unicode has been that the Braille
   characters are essentially "final form" characters; that the source
   text would be in other scripts, and these would be used for
   presentation of that source text. Under that model, the characters
   would be better characterized as symbols; in particular, they would
   not be suitable for program identifiers.

   The effect of the proposed change would be for implementations to
   treat the Braille pattern symbols as letters rather than symbols for
   various textual processes. There is a particular interaction with the
   proposed XML 1.1 categorizations for element names that the
   committee is concerned with, and is especially interested in feedback
   regarding related issues.

</quote>
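
For readers who want to see what the current categorization looks like in
practice, here is a minimal sketch, not part of the Unicode notice.  It
assumes Python's bundled unicodedata module as the source of character
properties, and it uses Python's own identifier rules as a stand-in for
"program identifiers"; both choices are illustrative assumptions, and the
result depends on the Unicode data version shipped with the interpreter.

```python
import unicodedata

# Braille Patterns block as given in the review notice: U+2800 through U+28FF.
BRAILLE_FIRST, BRAILLE_LAST = 0x2800, 0x28FF


def braille_categories():
    """Return the set of General_Category values found in the Braille block."""
    return {
        unicodedata.category(chr(cp))
        for cp in range(BRAILLE_FIRST, BRAILLE_LAST + 1)
    }


def usable_as_identifier(text):
    """Report whether Python itself would accept `text` as an identifier."""
    return text.isidentifier()


if __name__ == "__main__":
    # Show which Unicode data version the interpreter carries, then the
    # categories it assigns to the Braille block.
    print(unicodedata.unidata_version, sorted(braille_categories()))
    # Two Braille cells (dots-1, dots-12) tried as an identifier.
    print(usable_as_identifier("\u2801\u2803"))
```

A language that derives its identifier syntax from the general category
would be one of the "textual processes" affected by a change from So to Lo.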

The relevant provisions in XML 1.1 are:

<quote
cite="http://www.w3.org/TR/xml11/#sec2.3"
aka="Extensible Markup Language (XML) 1.1">


    Change production [4], and add new production [4a]:
  [4]     NameStartChar := ":" | [A-Z] | "_" | [a-z] |
          [#xC0-#x2FF] | [#x370-#x37D] | [#x37F-#x1FFF] |
          [#x200C-#x200D] | [#x2070-#x218F] | [#x2C00-#x2FEF] |
          [#x3001-#xD7FF] | [#xF900-#xEFFFF]
  [4a]    NameChar := NameStartChar | "-" | "." | [0-9] | #xB7 |
          [#x0300-#x036F] | [#x203F-#x2040]

</quote>

The 'interaction', as best I can see it, is that people who wish to create
element and attribute names in verbatim Braille are prevented from doing so by
the exclusion of the range [#x2800-#x28FF] from the NameChar production.
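
The exclusion can be checked mechanically.  The following is a minimal
sketch that transcribes the ranges from productions [4] and [4a] exactly as
quoted above (the Candidate Recommendation text, which may differ from later
versions of the specification).  The function names are hypothetical and
chosen only for illustration.

```python
# Ranges transcribed from productions [4] and [4a] as quoted in this message.
NAME_START_RANGES = [
    (0x3A, 0x3A),      # ":"
    (0x41, 0x5A),      # [A-Z]
    (0x5F, 0x5F),      # "_"
    (0x61, 0x7A),      # [a-z]
    (0xC0, 0x2FF),
    (0x370, 0x37D),
    (0x37F, 0x1FFF),
    (0x200C, 0x200D),
    (0x2070, 0x218F),
    (0x2C00, 0x2FEF),
    (0x3001, 0xD7FF),
    (0xF900, 0xEFFFF),
]

# Additional characters allowed after the first position by production [4a].
NAME_EXTRA_RANGES = [
    (0x2D, 0x2D),      # "-"
    (0x2E, 0x2E),      # "."
    (0x30, 0x39),      # [0-9]
    (0xB7, 0xB7),
    (0x300, 0x36F),
    (0x203F, 0x2040),
]


def _in_ranges(code_point, ranges):
    return any(low <= code_point <= high for low, high in ranges)


def is_name_start_char(ch):
    return _in_ranges(ord(ch), NAME_START_RANGES)


def is_name_char(ch):
    return is_name_start_char(ch) or _in_ranges(ord(ch), NAME_EXTRA_RANGES)


def is_name(text):
    """True if `text` matches NameStartChar followed by zero or more NameChar."""
    return (
        bool(text)
        and is_name_start_char(text[0])
        and all(is_name_char(ch) for ch in text[1:])
    )


# Every Braille pattern code point that the quoted productions would admit.
braille_allowed = [cp for cp in range(0x2800, 0x2900) if is_name_char(chr(cp))]
```

Under these productions braille_allowed comes out empty, so no Braille
pattern can appear anywhere in an element or attribute name, while an
ordinary name such as "showin" is accepted.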

Changing the "general category" in the Unicode specification would be scant
comfort if people really wish to use Braille idioms as XML element names
or XML attribute names.  Unless we go back and modify the XML 1.1 specification
between Candidate Recommendation and Proposed Recommendation, the capability
will not be there in XML.

<finding
class="draft ofFact">

1.  Braille signs are generally letters, not symbols, although some are
punctuation and control codes.

Braille signs include letters, punctuation, and control characters.
Most Braille signs are letters.  So from a pure 'accuracy in description'
point of view, if the general category is to mean the category to which
most of the code points in this range belong, it would be most accurate
to describe it as 'letters' as opposed to 'symbols.'

2.  There are use cases for authoring in literal Braille and not accepting
runtime conversion from standard spelling in the base language.

<quote
cite="">

  The DTBook element set has considerable application outside of the
  digital talking book as well. It was designed to enable the production
  of documents in a variety of accessible formats. At least one U.S.
  Braille translation software package has implemented a facility that
  imports DTBook documents and automatically translates and formats them
  in Grade 2 Braille. It is expected that similar automated processes
  will be developed for converting properly marked-up documents into
  large print and for rendering DTBook documents in Braille, synthetic
  speech, and large print "on the fly." Finally, an attribute called
  "showin" is incorporated in the DTBook element set to control the
  display of selected segments of a DTBook document. For example,
  descriptions of a graph might vary between Braille and large print
  editions; "showin" could allow only the appropriate version to show in
  each edition, although both would be present in the DTBook document.

</quote>

See also:

  http://www.loc.gov/nls/z3986/v100/dtbook110.dtd
  http://www.loc.gov/nls/z3986/v100/index.html

3.  There are use cases for tokens that are idiomatic in common Braille use
as program names.  One example is a JAWS script that is used by deaf-blind
individuals to accelerate access to a text chat application.  This is a
program created by a Braille user for Braille users.  Strict social
equality would suggest that an idiomatic mnemonic is appropriate, and the
appropriate context in which to assess whether the identifier is mnemonic
is its presentation in Braille.

4.  The damage to XML access and usability incurred by Braille users as a
result of excluding Braille signs is slight.

Braille is an alternate script for languages that have an oral life and lots
of Braille-illiterate users.  Most interactive use of Braille on the
computer is in uncontracted Braille, in which the letters are carried over
one for one from the standard spelling as used in print.  So "tokens that
are mnemonic in the base language" offers the Braille-using person reviewing
or editing XML or computer code a rich vocabulary of mnemonic tokens, and
the size of the available set of good mnemonic tokens is not contracted much
at all by excluding the use of literal Braille as programmatic identifiers
or XML element and attribute names.

A person with a disability who is cut off from functional access to
programmatic identifiers or XML element and attribute names would perhaps
have to be all of deaf, blind, and semantic-pragmatic.  There will plausibly
be a measurable increase in human error in interpreting the tokens if users
are barred from inserting Braille idioms, but this will be small because the
user of the code is in constant practice interpreting the uncontracted
spelling of the base language through the rest of their computer-using life.

5.  Note that the [no-status draft] XML Accessibility Guidelines caution
against relying on the mnemonic qualities of XML element and attribute names:

<quote
cite="http://www.w3.org/TR/xag#cp4_9">

       4.9 Do not assume that element or attribute names provide any
               information about element semantics.

</quote>

It is at least the consensus of those who have participated thus far in
preparing that draft that machine-processable associations between [element
and attribute names] and very-human-understandable descriptions are a better
strategy than trying to make the tokens themselves standalone mnemonics.

http://www.w3.org/TR

</finding>


>Please see also the earlier discussion on the XML 1.1 considerations at
>  http://lists.w3.org/Archives/Public/wai-xtech/2002Jun/thread.html#17
>
>
>>Date: Mon, 22 Sep 2003 17:44:23 -0400
>>From: Martin Duerst <duerst@w3.org>
>>Mime-Version: 1.0
>>Subject: Fwd: New Public Review Issues posted
>
>The Unicode Technical Committee has sent the I18N WG the
>following review request. I think the Braille issue is of
>interest to WAI. Please send comments back to the I18N IG
>or directly use the report form on the Unicode Web site.
>Also, please feel free to forward as appropriate.
>(the review is public)
>
>Regards,    Martin.
>
>>Subject: New Public Review Issues posted
>>Date: Fri, 19 Sep 2003 16:38:44 -0700
>>From: Rick McGowan <rick@unicode.org>
>
>>The Unicode Technical Committee has posted several new issues for public
>>review and comment. Details are on the following web page:
>>
>>         http://www.unicode.org/review/
>>
>>Review periods for the new items close on October 27, 2003.
>>
>>Please visit the page for links to discussion and relevant documents.
>>
>>One new issue is #15 Changing General Category of Braille Pattern Symbols
>>to "Letter Other". The UTC has received requests to change the general
>>category of the Braille characters to be "Letter other" (Lo) rather than
>>"Symbol other" (So), and is seeking comments and information on the Braille
>>processing model and existing implementations to help with this decision.
>>Please refer to the review page for the rest of the details on this issue.
>>
>>Five other new issues are also included for proposed updates and proposed
>>drafts of Unicode Technical Reports, Technical Standards, and Unicode
>>Annexes. They are:
>>
>>#16   Update to UAX #29 Text Boundaries
>>#17   UTS #18 Unicode Regular Expressions (change from UTR to UTS)
>>#18   Draft UTR #23 The Unicode Character Property Model
>>#19   Proposed Draft UTR #30 Character Foldings
>>#20   Proposed Draft UTR #31 Identifier and Pattern Syntax
>>
>>If you have comments for official UTC consideration, please post them by
>>submitting your comments through our feedback & reporting page:
>>
>>     http://www.unicode.org/reporting.html
>>
>>If you wish to discuss issues on the Unicode mail list, then please
>>use the following link to subscribe (if necessary). Please be aware
>>that discussion comments on the Unicode mail list are not automatically
>>recorded as input to the UTC. You must use the reporting link above
>>to generate comments for UTC consideration.
>>
>>     http://www.unicode.org/consortium/distlist.html
>>
>>Regards,
>>         Rick McGowan
>>         Unicode, Inc.
>
Received on Tuesday, 23 September 2003 11:45:37 UTC