Last Call comment Re: [XML 1.1] characters allowable in names from Al Gilman on 2002-07-08 (www-xml-blueberry-comments@w3.org from July 2002)

From: Al Gilman <asgilman@iamdigex.net>
Date: Sun, 07 Jul 2002 20:56:01 -0400
To: www-xml-blueberry-comments@w3.org
Cc: wai-liaison@w3.org
Message-Id: <5.1.0.14.2.20020707203930.020df850@pop.iamdigex.net>

The Protocols and Formats Working Group <http://www.w3.org/WAI/PF/> would like
to offer comment on the matter of what characters may be used in name tokens
in XML 1.1.

With humble apologies for being late.

</notesInTransmittal>

** name characters in XML 1.1 and access to content creation for people with disabilities **

** summary

[expansions of these three points appear in a following 'details' section]

* The use of arbitrary Unicode characters as name characters in XML is quite likely to impose
serious hardships on people with visual disabilities wishing to create document instances
and application-specific dialects in XML 1.1. Pardon the double negative, but one of
our conclusions on reviewing this point is that it is not a non-issue. Call this "odd
characters may make inaccessible names."

* At the level of creating a markup vocabulary or 'dialect' of XML, a good practice to follow
would be to adopt some actual natural language as the base for symbol creation, and as
symbols in the vocabulary either actual real words from that language or plausible
abbreviations and agglutinations from the natural vocabulary of that language. Call this
"good-symbols BCP at dialect level."

* The discussion was inconclusive, however, as to what if any _character level constraints_
were appropriate to apply against _name characters_, globally in all XML 1.1. Call this
"no clear cut."

** discussion of details

* Odd characters may make inaccessible names:

The use case is for a person who is blind or has seriously low vision to be able to edit
XML document instances, DTDs and schemas. The presumed level of automation is an editor
which internalizes the rules of well-formed XML and is otherwise transparent to the text.
So symbols in the XML used as names of element types and attribute types come through the
XML-recognition of the editor as verbatim sequences of characters from the document
character set. For these symbols to function as symbols in the editing of document instances,
they should ideally be speakable as words for the speech-output user, and
transliteratable into braille characters for the braille-output user.

There is a fall-back to 'spell' mode in the text-to-speech, but this is significantly
more tedious for symbols that are long enough, and could become an ability-to-do-job
make-or-break consideration. There will be some of each in our working model of the
users for this "walkthrough test case." While some very popular XML dialects such as
XHTML Basic will have editors available with higher levels of recognition built in,
XML as a language-building technology is not whole unless this level of editing is
available. In the life cycle of every dialect it will be needed in some stage of
the workflow, and this stage of the workflow should be open to participation
by people with these visual-ability conditions.
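
To make the cost of the 'spell' fall-back concrete, here is a rough sketch in Python.
It is not a model of any particular screen reader: the function name spell_out and the
rule "ASCII letters and digits are read as themselves, everything else is read by its
Unicode character name" are assumptions made only for illustration.

```python
import unicodedata

def spell_out(name):
    """Return one spoken token per character, roughly as a 'spell' mode might.

    Assumption for illustration: ASCII letters and digits are read as
    themselves; any other character is read by its Unicode character name,
    or by its code point if it has no name.
    """
    tokens = []
    for ch in name:
        if ch.isascii() and ch.isalnum():
            tokens.append(ch)
        else:
            tokens.append(unicodedata.name(ch, "U+%04X" % ord(ch)))
    return tokens

# A word-like name is spelled letter by letter.
print(spell_out("para"))
# A hypothetical name using the music flat sign (U+266D) needs a full
# character name for a single symbol.
print(spell_out("B\u266d"))
```

Even in this toy version, one non-letter character costs the listener several spoken
words, and a name built from many such characters multiplies that cost.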

People with disabilities tend to be operating off a technology base that is one generation behind. People with two disabilities are often two generations behind.

Note that the standard practice in Braille transcription is to 'bleep' out un-transliteratable
characters with some wild card expression such as [***]. All bleeps so generated will
appear the same, so the distinction among symbols may be totally lost unless one adds
a schema-aware editor that substitutes from schema annotations on the fly. It is not
clear that this extra level of investment in editor internals is a reasonable expectation
for editors for such a small market.
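
The loss of distinction can be sketched as follows, again in Python. The set
TRANSLITERABLE is a deliberately crude stand-in (ASCII letters, digits and a few
punctuation marks) and does not reflect any real Braille table; the two element names
are hypothetical. Only the [***] wild card comes from the practice described above.

```python
import string

# Assumption for illustration only: characters a transcriber can render.
TRANSLITERABLE = set(string.ascii_letters + string.digits + "-_.:")
BLEEP = "[***]"

def bleep_transcribe(name):
    """Replace every character outside TRANSLITERABLE with the wild card."""
    return "".join(ch if ch in TRANSLITERABLE else BLEEP for ch in name)

# Two distinct hypothetical names: "for all" + x and "there exists" + x.
names = ["\u2200x", "\u2203x"]
rendered = [bleep_transcribe(n) for n in names]

print(rendered)
# Two names in the source, but only one distinct form after transcription.
print(len(set(names)), len(set(rendered)))
```

Both names come out as the same string, so a reader working from the transcription
alone cannot tell which element type is meant.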

* Good-symbols BCP at dialect level:

Dialects which use real words and abbreviations or agglutinations of real words will
transform gracefully under text-to-speech and Braille transliteration.

It might appear good to assert this guideline at the level of standards or guidelines
for dialect definition, roughly at the level of the XML Accessibility Guidelines.

However, the status of such guidelines in the W3C opus is unclear. The XAG may be put on
a Recommendation Track in the re-chartering of the PF Working Group, but this may not
be assumed.

So things that are properly done at the lower level in XML itself with the 1.1 revision
*should be done there* and not wait for a "maybe we will publish something" at a higher
level.

* No clear cut:

In some applications, mathematical symbols and music notes may indeed be apt mnemonics
for element types, where the element semantics lines up with a single character symbol
or frequently encountered cluster such as B-flat. In this sense the definition of a
naturally occurring vocabulary has to be regarded as an extant domain of discourse and
not strictly a language which is natural in the sense that it is the first language
of some speaker group.

Control characters would seem to be pretty bad for a broad base of applications.

But we can't necessarily eliminate all punctuation. If we did we could miss the
opportunity to do agglutinations in some languages. Camel Case only works in caseful
languages. IIRC there are languages where word boundaries require explicit word-break
characters, and these would be required to use that language as a base and a phrase as
a symbol for an element or attribute type.

There are natural languages that don't have natural orthographic spellings. These
include sign languages and spoken languages for which writing is not common among the
speaker group. These are languages for which the WAI seeks equal access but in these
cases it does not appear that the well-formed-XML-editor use-case can be achieved.
In this case natural language expressions will _have_ to be associated with the XML
symbols by formalized indirection, such as through the annotation facilities in XSD.

This is a long-winded explanation of why, in our consideration of this issue, no
consensus emerged for any given name-character admissible set.

** Background:

Please use the following references to review the discussion behind this comment:

http://lists.w3.org/Archives/Public/wai-xtech/2002Jun/thread.html#17

http://lists.w3.org/Archives/Public/wai-xtech/2002Jul/subject.html#10

http://www.w3.org/Search/Mail/Public/search?type-index=wai-xtech&index-type=t&keywords=XML+1.1+element+names&search=Search

(Member access only)
http://lists.w3.org/Archives/Member/w3c-wai-pf/2002AprJun/thread.html#251

</comment>

Received on Sunday, 7 July 2002 20:56:04 UTC