Re: Proposed definitions for content, document object, etc.

At 05:53 PM 2000-04-18 -0400, Ian Jacobs wrote:
>Hello,
>
>As part of resolving issues 207 [1], 211 [2], 226 [3], and 233 [4],
>please consider the following definitions. Compare with the
>Proposed Recommendation [0].
>
[...]

>3) Content
>
>   <BLOCKQUOTE>
>   In this specification, the term "content" refers to the
>   document object. Some content is designed (by specification)
>   for "human consumption". For an HTML document, this includes
>   what appears between the start and end tags of elements, and
>   the values of some attributes (e.g., alt, title, summary).
>   Other content is meant for machines, including the markup
>   itself (e.g., element and attribute names), some attribute
>   values (e.g., class, id, lang, src), style sheets, scripts,
>   etc.
>   </BLOCKQUOTE>

The definition here is fine.  The commentary is dangerous.  Some
attributes are usually consumed in processing (e.g. stylesheet processing)
and do not get included in the displayed view.  But the choice of which
properties are processed in that way is an artifact of the chosen view.  It
only seems intrinsic because the association with the dominant view is so
strong in what we are used to.

The show/hide properties of [logical equivalents of DOM node properties]
are given in the HTML specification only as defaults and mostly by custom
(show content of unrecognized markup) rather than by specification.  In the
unified view of markup languages that the WAI should be sticking to,
unstructured content is content and structured content provided as
attribute values is content.  All are the basis for constructing views.  

Compare with the XML InfoSet document.  Since this abstract view hides
syntactic details and parsing, the document applies with very minor changes
to HTML as well as to XML.  In other words, the _content_ is even better
described in the InfoSet and exposed for programmatic access via the
interfaces specified in the DOM.  When the InfoSet document is official, it
will be a friendly amendment to the DOM document as the definition of the
'content' of the document.

Even where the specifications contain language that reads like "this
content is for display, this for control," that is residual "view
chauvinism" from the dominant GUI view of the content.  All content has to
be regarded equally as the basis for generating transformed views, because
in generating those views properties will flow between "display" and
"control" roles.

A simple example is the shift characters (for numbers and uppercase) in
Braille.  What is control in print becomes inline content in Braille.  The
status of all other markup language attributes is similar once one considers
the full range of transformed views within the goal of "documents that
transform gracefully to adapt to user strengths and weaknesses."
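
A minimal sketch of the Braille point, in Python.  The token names CAP and
NUM and the function name are illustrative stand-ins, not taken from any
Braille code or W3C specification, and the rule is simplified (one capital
indicator per uppercase letter, one number indicator per run of digits).
It shows a property that print carries implicitly being moved into the
inline content stream:

```python
CAP = "[CAP]"  # hypothetical stand-in for a Braille capital indicator
NUM = "[NUM]"  # hypothetical stand-in for a Braille number indicator


def make_shifts_inline(text: str) -> list:
    """Emit case and digit-run information as explicit inline tokens."""
    tokens = []
    in_number = False
    for ch in text:
        if ch.isdigit():
            # Mark the start of each run of digits exactly once.
            if not in_number:
                tokens.append(NUM)
                in_number = True
            tokens.append(ch)
            continue
        in_number = False
        if ch.isupper():
            # Case stops being a glyph property and becomes content.
            tokens.append(CAP)
            tokens.append(ch.lower())
        else:
            tokens.append(ch)
    return tokens
```

For the input "Al 42" this yields a capital token before "a" and a single
number token before "42"; the same information is present in both views,
only its role has changed.
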
>
[...]
>
>6) Source view
>
>   <BLOCKQUOTE>
>   A source view renders all or part of the document
>   object in a way that reveals the document object
>   model. Often, a source view presents the document
>   object using the syntax of the source markup 
>   languages.
>   </BLOCKQUOTE>
>

_Only_ the presentation in the syntax of the markup language used as
transfer encoding should be described by the term "source view."  Any other
use of this term will simply introduce confusion and distance the document
from its readership.

[...]
>
>8) I propose changing the definition of "view" to be:
>
>     The term "view" is used in this document
>     to describe the purpose of a particular rendering  (e.g.,
>     "outline view", "table of contents view", "links view").
>

This is a standard concept from data engineering and needs to be applied in
the sense which is current in that community, because this is the theory
that we need to be applying to presentation diversity.  A view of some
information is a particular presentation of a subset of the information
selected by a view-filter rule.  The concept of views as selection plus
presentation of some larger and more abstract information type or instance
is standard data engineering and needs to be used in the standard sense here.
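
The "selection plus presentation" reading can be sketched in a few lines of
Python.  Everything here (the Node and View classes, the sample document,
the two views) is a hypothetical illustration, not an API from the DOM, the
InfoSet, or the Guidelines.  A view is modeled as a filter rule paired with
a presentation rule over the same underlying content:

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Node:
    """Hypothetical flattened stand-in for one item of document content."""
    name: str
    attrs: dict = field(default_factory=dict)
    text: str = ""


@dataclass
class View:
    select: Callable[[Node], bool]   # the view-filter rule
    present: Callable[[Node], str]   # how each selected item is presented

    def render(self, document):
        return [self.present(n) for n in document if self.select(n)]


document = [
    Node("h1", text="Definitions"),
    Node("p", text="See the draft."),
    Node("a", {"href": "http://www.w3.org/", "title": "W3C home"}, "W3C"),
    Node("h2", text="Content"),
    Node("img", {"src": "logo.png", "alt": "W3C logo"}),
]

outline_view = View(
    select=lambda n: n.name in ("h1", "h2"),
    present=lambda n: ("  " if n.name == "h2" else "") + n.text,
)

links_view = View(
    select=lambda n: n.name == "a",
    # The href attribute, "machine" content in the GUI view, is shown here.
    present=lambda n: f"{n.text} <{n.attrs['href']}>",
)
```

Calling outline_view.render(document) and links_view.render(document)
produces two different presentations of subsets of one document; neither
view changes what the content is.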

If reference to the Model/View/Controller literature is too political
because the documents are Java-centric, then go back to some standard data
engineering text (Ullman?).  But don't change the definition of this term.
It needs to be applied here according to what it means, not according to
something we redefine it to mean.

>NOTES:
>
> - The same terms (e.g., "content") appear in other W3C 
>   Recommendations and have different meanings. It's ok to
>   define their meaning in our specification to fit our needs.
>
> - "Content" has been defined so that we don't have to
>   use two terms throughout the document. I need to verify
>   its usage throughout the document.
>
> - We should also define "element" and "attribute" separately, 
>   rather than as part of a definition of "content".
>
> - This definition of content would not change the meaning
>   of checkpoint 2.1. We still need to resolve the scope of 
>   2.1 in issue 207.

If this means that "what information has to show through the UI" is a
separate question from the definition, I fully agree.  The WG has been
unclear on this point. But the document that has been shared with the AC
says "all content" which on the face of it would say "all the above per
[InfoSet] interpretation of the document."

Any retreat from that would constitute a change.

Al

>
> - Ian
>
>[0] http://www.w3.org/TR/2000/PR-UAAG10-20000310/#terms
>[1] http://cmos-eng.rehab.uiuc.edu/ua-issues/issues-linear.html#207
>[2] http://cmos-eng.rehab.uiuc.edu/ua-issues/issues-linear.html#211
>[3] http://cmos-eng.rehab.uiuc.edu/ua-issues/issues-linear.html#226
>[4] http://cmos-eng.rehab.uiuc.edu/ua-issues/issues-linear.html#233
>-- 
>Ian Jacobs (jacobs@w3.org)   http://www.w3.org/People/Jacobs
>Tel:                         +1 831 457-2842
>Cell:                        +1 917 450-8783
> 

Received on Tuesday, 18 April 2000 19:38:34 UTC