Re: Relationships between different XML applications

Sorry this is so long.  

There is a connected thread through accessibility performance,
authoring in text, and tag soup parsing.

But I don't understand the thread well enough to tell the story
of the thread in separate installments, and I don't understand
any of the subtopics well enough to explain them in twenty-one
words or less.  So, absent the time to write a short letter, here
is the long one.

To follow up on what Greg Sherwin said:

> At 11:15 AM 5/30/98 -0400, Al Gilman wrote:
> >                                            The point of
> >accessibility is to get HTML back to being a robust medium which
> >captures what you have to say and not just what you expect your
> >reader to see.
> 
> I concur with your first point, but you're losing me on the relevance of
> the second. 

OK, let me try to explain the connection more clearly.

Early in the Web, people really wrote content much as you would
write a magazine article, with a thread of narrative as the
primary structure, decorated with exhibits, but the exhibits were
clearly organized as appendages of the textual linear story.

This was a medium where access by text-to-speech produced a very
lively rendition of what the author was up to.

Nowadays people compose web pages as you would a magazine
advertisement, as a graphical montage in which there may be a
linear textual story line or there may not.  The difference is
partly a matter of finish (GIFs of key words instead of text) but
it is also a matter of structure and flow.  Sometimes there is no
logical linear flow to the graphical composition.  In a
significant number of other cases the use of WYSIWYG graphic
design modes results in HTML code which fails to capture a linear
structure which is actually there in the message, but requires
visual review of the layout to comprehend.

One of the generalizations that I carry around (working draft
synthesis) from my exposure to what has transpired in the WAI is
that "orientation and navigability" is a priority and that "a
textual sub-web which is sufficient by itself for orientation and
navigation" is a requirement.  Orientation means it is easy to
maintain a concept of where you are and where you might go next.
Navigability is how easy and predictable it is to get where you
want.  If you can't reconstruct the logical linear order for
the text, orientation and navigability through an aural display
mode is gone.  Broken beyond use.

An aural display environment will be presenting things
linearly, with hooks for branches such as to pause to describe a
scene, but the flow model is branch options hung on a linear
thread, and the linear thread probably populated with speech.

We need at the foundation a view of the markup as forming a mesh
that can be flowed out in either audio or screen-io.  It's a web,
not a tree.  The rules we impose on markup to make it easy to
replace the linear text wholesale with a tree data structure
turn the markup into a code which demands marks for the tree's
convenience, marks which are not needed when the application is
linear discourse with attributes.  So the importance of serving
a linear-discourse view of the content bears directly on the
tradeoff: should interdigitated tags be outlawed by the markup
language, or should the parser be required to cope with them?
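
To make the "linear discourse with attributes" picture concrete, here
is a minimal sketch in Python.  It is only an illustration, not any
existing parser: it assumes a toy markup with attribute-free tags, and
the names TAG and parse_spans are made up for the example.  The text
stays a single linear string, and each tag pair becomes a span of
offsets into it, so interdigitated tags are just overlapping spans.

```python
import re

# Toy tag syntax: <name> or </name>, no attributes (an assumption of this sketch).
TAG = re.compile(r"<(/?)([A-Za-z][A-Za-z0-9]*)>")

def parse_spans(markup):
    """Return (text, spans); spans are (tag, start, end) offsets into text.

    Each close tag is paired with the most recent unclosed open tag of
    the same name, whether or not the tags nest properly.
    """
    parts = []
    length = 0            # length of the linear text emitted so far
    open_tags = {}        # tag name -> stack of start offsets
    spans = []
    pos = 0
    for m in TAG.finditer(markup):
        chunk = markup[pos:m.start()]
        parts.append(chunk)
        length += len(chunk)
        pos = m.end()
        closing, name = m.group(1), m.group(2).lower()
        if not closing:
            open_tags.setdefault(name, []).append(length)
        elif open_tags.get(name):
            spans.append((name, open_tags[name].pop(), length))
        # A close tag with no matching open tag is ignored.
    tail = markup[pos:]
    parts.append(tail)
    length += len(tail)
    # Tags never closed by the author run to the end of the text.
    for name, starts in open_tags.items():
        for start in starts:
            spans.append((name, start, length))
    return "".join(parts), sorted(spans, key=lambda s: (s[1], s[2]))
```

Given "a <b>bold <i>both</b> italic</i> z", this yields the text
"a bold both italic z" with the spans ("b", 2, 11) and ("i", 7, 18).
A speech renderer can read the text straight through and consult the
spans for emphasis, and nothing in the data structure forces a tree.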

If we had a clearer mental picture of the semantic _mesh_ that
the HTML analyzer needs to reconstruct from the transmission
encoding of the document, we might be more ready to agree to make
the parser's job harder and the html-author's job easier.  I
don't mean we can ignore parse difficulty.  I just think that
HTML should consider making more concessions to preserving
hand-writeability than XML has.  Or, as I said, that we come up
with an invertible tag-suppression doctrine.
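
As a rough sketch of what "making the parser's job harder" could
look like, the Python below accepts interdigitated tags and rewrites
them into properly nested ones by closing and reopening the inner
elements.  It is an assumption-laden illustration, not the
tag-suppression doctrine itself and not any shipping parser's
algorithm: tags carry no attributes, and TOKEN and normalize_nesting
are hypothetical names.

```python
import re

# Toy tag syntax: <name> or </name>, no attributes (an assumption of this sketch).
TOKEN = re.compile(r"<(/?)([A-Za-z][A-Za-z0-9]*)>")

def normalize_nesting(markup):
    """Rewrite overlapping tags so that every element nests properly."""
    out = []
    stack = []            # names of currently open tags, outermost first
    pos = 0
    for m in TOKEN.finditer(markup):
        out.append(markup[pos:m.start()])
        pos = m.end()
        closing, name = m.group(1), m.group(2).lower()
        if not closing:
            stack.append(name)
            out.append(f"<{name}>")
        elif name in stack:
            # Position of the most recent open tag with this name.
            idx = len(stack) - 1 - stack[::-1].index(name)
            inner = stack[idx + 1:]
            # Close everything opened after it, innermost first ...
            for tag in reversed(inner):
                out.append(f"</{tag}>")
            out.append(f"</{name}>")
            # ... then reopen those tags so they continue past the close.
            for tag in inner:
                out.append(f"<{tag}>")
            del stack[idx]
        # A close tag with no matching open tag is dropped.
    out.append(markup[pos:])
    # Close whatever the author left open.
    for tag in reversed(stack):
        out.append(f"</{tag}>")
    return "".join(out)
```

For "<b>bold <i>both</b> italic</i>" this returns
"<b>bold <i>both</i></b><i> italic</i>".  The author writes the
overlap by hand and the tool pays the cost of producing a tree.  Note
that the rewrite is lossy as written: the single italic phrase comes
out as two elements, which is exactly the kind of information an
invertible scheme would have to preserve.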

> Machine-generation based on a set of user-defined rules does
> not necessarily prohibit accessibility. 

This is where we get into discussions of technical risk.  I am
all for growing in the direction of operating on abstractions.
But if we make the markup language totally private to the tools,
it might as well be a binary data format, and there is a risk
(well demonstrated in history) that some small flaw will cause
the alternate modes not to work, with no recourse via "escape to
the code level" workarounds.  This is mostly an
argument for incrementalism and a degree of both-and rather than
either/or.

> Under most circumstances, I can take a document with
> publishing-quality layout and export it as raw ASCII if I
> wish. "FTP with pictures," as we've all heard it once
> described.

How can I put it?  I want to examine "under most circumstances."
The present rate at which this fails, in terms of reconstructing
something comprehensible, with regard to existing on-the-web
literature, was characterized by Tim Berners-Lee at the WAI
Launch as unacceptably bad and getting worse.  If one had tools
that implement the latest W3C recommendations and authors that
implement the WAI guidelines, it is true that one accepts very
few limitations on graphical control to present a document which
is lucid in audio as well.  

> >As far as I know, the best current practice for blind people
> >wishing to create Web pages, database applications, and similar
> >lightweight programming is to use text interfaces.
> 
> Certainly. But the invention of word processing bloatware didn't
> necessarily make ASCII documents written up in vi or emacs obsolete or
> illegible.

It made people unemployable because organizations standardized on
the bloatware and did not open their business processes to
writing in those modes.

I believe we need to aim higher for the WWW, in terms of openness
to alternative I/O modes, than the standard set by current office
apps.

> >Writeability of HTML is a value.  It contributes to the
> >democracy, the universality of the Web.  If we make HTML not
> >writeable as text, we lose.  You have to have a story how you are
> >going to make that up to sell this.

> I completely agree that writability is a critical
> issue. Without it, the Web would have nowhere near the
> proliferation it has today. Yet continued support of simplicity
> does not require us to limit ourselves to a Byzantine document
> authoring system for a heterogeneous, networked environment.
> Furthermore, insistence upon such a primitive baseline would
> put the whole standards process in jeopardy of public
> irrelevance.

I am not sure which connotations of Byzantine you mean to invoke.
I suspect it is simply "extremely antiquated."  

I am very concerned about these issues, too.  I am not convinced
that treating authoring in text as a non-concern at this time is
responsible, in the context of an orderly migration to a system
that functions across heterogeneity in the UI modality.

You have probably had the same kind of experience that I have,
being able to repair exchanges of attachments between office
desktop operators, because you had low-level access to the message
content including wrappers and encodings.  That is the operating
point we are at with HTML.  People who consider themselves HTML
writers are writing much more accessible pages than those
capitalizing on the training-saving features of the maximally
automated page development tools.  I think we need to tread
carefully as we try to bridge this gap.

> Despite my continued use of mailx in a world enthralled with
> souped-up mail clients that support text/enhanced and other
> nefarious (from my perspective) MIME types, my cries for
> archaic simplicity here will do nothing to stop the public
> migration to richer e-mail authoring formats.  Like it or not,
> to the rest of the e-mail-enabled universe my choice of mail
> clients puts me in the same class as cephalopods. My choice is
> to either adapt or perish in a sea of unintelligible
> attachments.

Not quite.  The application of peer pressure within mailing lists
gives you a little leverage for obtaining modest accommodations in
terms of the character set and wrapping properties that people
use for the base message.  Universal interoperability can be
enhanced if you don't simply insist that people stop using
attachments.

The mailx-friendly technology salient is the minimalist-markup
frontier where object classes such as HTTP URLs are treated as
sufficiently distinct from plain text and hence processed into
action opportunities without waiting for anchor tags.
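
A minimal sketch of that idea in Python, for illustration only: the
pattern below is a deliberately crude assumption about what counts as
a URL, not a full parser for the URL syntax, and find_urls is a
made-up name.  It picks HTTP URLs out of plain message text so a mail
reader could offer them as links with no anchor tags in sight.

```python
import re

# Crude assumption: a URL runs from "http://" or "https://" up to the
# next whitespace, angle bracket, or double quote.
URL = re.compile(r"https?://[^\s<>\"]+")

def find_urls(text):
    """Return the URLs found in plain text, in order of appearance."""
    urls = []
    for m in URL.finditer(text):
        # Trailing punctuation usually belongs to the sentence, not the URL.
        urls.append(m.group(0).rstrip(".,;:!?)"))
    return urls
```

Run over "See http://www.w3.org/WAI/ for details, or
https://example.org/a." (the second address is a placeholder), this
returns the two URLs with the sentence's final period removed.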

> Similarly, we have witnessed the growth of document authoring environments
> progress from primordial ASCII to word processing bloatware with
> ransom-note-like font support and layout features most of us don't even
> understand, let alone use.
> 
> To say these truths are a fluke and not representative of the advancement
> of document authoring in general would be naive, to say the least. While
> some of us still follow the TeX gospel according to Knuth, there are grand
> reasons why everyone knows of MS Word, WordPerfect, etc. -- and uttering
> the phrase 'LaTeX' or even 'troff' only draws blank stares from every
> software sales clerk at the local Office Max.

This is too unidimensional a view of progress.  It is possible to fix
what got broken in the last wave of advance without breaking everything
we gained.

> >One of the writeability problems of XML is that it assumes tags
> >are there to form elements, and that elements nest.  This is an
> >unnatural restriction on verbal expression.
> 
> Case-[in]-point why these rules and regulations of syntax, while powerful
> for rendering and other purposes, should be abstracted from any sane author.

I am not against syntax.  I am against over-reliance on syntax
not balanced against appropriate semantics, and I am against
old-fashioned syntax that hasn't learned the OO lessons of the
GUI revolution.  Document-element classes can beneficially have
more independence or asynchrony than mere tree branches.

The text author's problems with the tree-myopic parse rules are
just a "canary in the mineshaft" hint of the more fundamental
ways that the semantic web is being unwittingly mutilated by the
assertion of language rules solely for the benefit of legacy
parse strategies.

Al

Received on Sunday, 31 May 1998 12:17:56 UTC