- From: Al Gilman <asgilman@iamdigex.net>
- Date: Sat, 17 Jun 2000 14:10:17 -0500
- To: abrahams@acm.org
- Cc: xml-uri@w3.org
At 02:34 PM 2000-06-16 -0400, Paul W. Abrahams wrote:
>Al Gilman wrote:
>
>> **Summary
>>
>> The problem is both sides are assuming there is a 1:1 relationship and are
>> arguing over how to define it.  There is no answer in that space.  There is
>> no 1:1 relationship between namespaces and languages, between Qnames and
>> element types.
>
>Pardon a possibly naive question, but what do you mean by a language in this
>context?  Do languages have a 1:1 relationship to anything else such as
>applications?
[The only dumb question is the one you were too embarrassed to ask.  I
doubt your question is more naive than my answer.]
What do I mean by a language?
An XML language is an invertible linearization of an ontology used in
constructing messages or streams passed as a means of communicating, that
follows XML rules in the final stages of tree-to-stream encoding.
Languages are to schema-defined graph types as object classes are to
structured datatypes.
Did I say "a language?"  I hope I said that upper layers have to be more
aware than the lower levels of "the language."  While there may be
identified formal things that we call languages, the more important sense
of 'language' is not as _an identified thing_ but as _some recognizable
stuff_.  This is the stuff of the message, it is the territory, relative to
namespaces, not the namespace-level map.  We can define a set of
progressively more detailed maps that tell more and more about the
territory.  And progressively higher levels of processing that respond to
the distinctions in each incremental map refinement.  [I speak as a fool.
I don't really know what Cowan means by a map:territory 'error.'  Actually,
I would like to know.]
XML has proper methods: string to InfoSet, InfoSet to string.  Starting and
ending with an InfoSet, the pair of transformations is an identity: you get
back the same InfoSet.  After passing through InfoSet once, you thereafter
get back a canonical string.
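
The round trip described above can be sketched roughly as follows.  This
is only an approximation: Python's ElementTree stands in for the InfoSet
(it does not model every information item), and the helper names
string_to_tree and tree_to_string are made up for illustration.

```python
import xml.etree.ElementTree as ET

def string_to_tree(text):
    # "string to InfoSet": parse the character stream into a tree.
    return ET.fromstring(text)

def tree_to_string(tree):
    # "InfoSet to string": serialize the tree back to characters.
    return ET.tostring(tree, encoding="unicode")

# The raw input uses single quotes and extra whitespace inside the tag.
raw = "<doc  kind='memo'><p>hello</p><br/></doc>"

first_pass = tree_to_string(string_to_tree(raw))
second_pass = tree_to_string(string_to_tree(first_pass))

# After one pass through the tree the string no longer changes,
# even though it differs from the raw input.
print(first_pass == second_pass)
print(first_pass == raw)
```

The first comparison holds and the second does not: the string settles
into one fixed form after its first trip through the tree.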
I would expect "a processor" to implement a language class, a group of
methods that are proper methods for languages of that class.
I would hope that most languages would serve as a means of interoperation
among multiple processors.  There may be some processors dedicated to a
specific language, but I hope there aren't too many languages dedicated to
a specific processor.
What I am actually trying to establish is not "what, exactly, is a
language?" because I don't think an upper limit on that in bottom-up terms
is appropriate.  I think that the family of XML languages should be
continually growing, built bottom-up toward a vision which is not stated in
bottom-up terms.
I just want to establish that there are valid concerns which may properly
be addressed in the upper levels of processing for XML languages, and which
will make distinctions between InfoSet nodes in different contexts that are
not visible in the Qnames of those nodes.  You have to check more
language-definition stuff to make sure you are processing them right, so
you can't assume from the Qname alone that you are going to process them the
same way at arbitrary levels of altitude in the processing layer stack.
One example of this is the two-dimensional structure of tables.  One can
create a laboratory example language where we just treat tables and do not
bother with natural language connotations.  Here it takes more schema than
just what the syntax provides to have the column be a cell-collection in
which one can query "What are the header cells that appear before me [the
data cell myself] in my column?"  Languages we need to build in XML deal
with this kind of common knowledge.  That is to say knowledge that should
be common between different processors of the same language.
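
A minimal sketch of that column query, assuming a made-up laboratory
table language with row, head and cell elements and no spanning cells.
The element names and the headers_above function are hypothetical.  The
point is that "my column" is not in the syntax tree; the processor has to
supply it from knowledge of the language.

```python
import xml.etree.ElementTree as ET

# Hypothetical laboratory table language: rows of head or cell elements.
TABLE = """
<table>
  <row><head>Name</head><head>Mass</head></row>
  <row><cell>iron</cell><cell>55.8</cell></row>
  <row><head>Name</head><head>Charge</head></row>
  <row><cell>electron</cell><cell>-1</cell></row>
</table>
""".strip()

def headers_above(table, row_index, col_index):
    """Return the text of header cells in the same column in earlier rows."""
    found = []
    for row in list(table)[:row_index]:
        cells = list(row)
        # Column membership is inferred from position within the row,
        # which is a rule of this language, not of XML syntax.
        if col_index < len(cells) and cells[col_index].tag == "head":
            found.append(cells[col_index].text)
    return found

table = ET.fromstring(TABLE)
print(headers_above(table, 1, 1))  # ['Mass']
print(headers_above(table, 3, 1))  # ['Mass', 'Charge']
```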
Another reasonably simple example of a purely relative rule, which applies
in the context of the HTML 4.01 + WCAG language is "the color of text
should contrast with the color of the background of the text."  This
doesn't tell you what color the text should be, or what color the
background should be.  But it does constrain the range of what the tuple
{text color, background color} should be.  There are better and worse pairs
as far as contrast is concerned.  This is the kind of rule that would
appear in the definition of an "accessible Web document language."
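
One way to make such a relative rule concrete is sketched below.  It uses
the contrast-ratio formula from the later WCAG 2.x guidelines, which is
an assumption on my part and not something this message or HTML 4.01
defines.  The function names are hypothetical.  Note that the check takes
the pair as its input; neither color can be judged alone.

```python
def _linear(channel):
    # Convert an 8-bit sRGB channel to linear light (WCAG 2.x formula).
    c = channel / 255.0
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4

def luminance(rgb):
    # Relative luminance of an (R, G, B) triple with 0-255 channels.
    r, g, b = (_linear(v) for v in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b

def contrast_ratio(text_rgb, background_rgb):
    # The ratio depends only on the pair and is symmetric in its arguments.
    lighter, darker = sorted(
        (luminance(text_rgb), luminance(background_rgb)), reverse=True
    )
    return (lighter + 0.05) / (darker + 0.05)

# Better and worse pairs: black on white versus grey on grey.
print(contrast_ratio((0, 0, 0), (255, 255, 255)))
print(contrast_ratio((128, 128, 128), (120, 120, 120)))
```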
Assertions of schema compliance in a document can tell you whether the
language used has been checked by the originator against this rule or not.
This would not need to change the namespace used, but it could alter the
way the document was processed by the user's processor.  The user could
have a switch set that said that documents that declare conformance to the
color contrast rule would be presented in author colors, but all others
would be re-colored per a stylesheet of the user's devising.
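
The user's switch could be sketched like this.  Everything here is
hypothetical: the Document record, the "color-contrast" token, and the
way a document would declare conformance are illustrative assumptions,
not an existing mechanism.

```python
from dataclasses import dataclass

# Hypothetical token a document might use to assert it was checked.
CONTRAST_RULE = "color-contrast"

@dataclass
class Document:
    declared_conformance: frozenset  # rules the originator claims to meet
    author_colors: tuple             # (text color, background color)

def presentation_colors(doc, user_colors, honor_declarations=True):
    # With the switch on, a declared check lets author colors through;
    # every other document is re-colored per the user's own stylesheet.
    if honor_declarations and CONTRAST_RULE in doc.declared_conformance:
        return doc.author_colors
    return user_colors

checked = Document(frozenset({CONTRAST_RULE}), ("navy", "white"))
unchecked = Document(frozenset(), ("yellow", "white"))
user = ("black", "white")

print(presentation_colors(checked, user))    # ('navy', 'white')
print(presentation_colors(unchecked, user))  # ('black', 'white')
```

Both documents could use the same namespace and the same Qnames; only the
declaration differs, and that is what the processor branches on.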
The point that I am trying to make is that the upper layers of processing
may need to make finer type-like distinctions among the nodes in the
InfoSets in different contexts than are identified by comparing the Qnames
corresponding to their respective element-type-name tokens.
Al
>
>Paul Abrahams
> 
Received on Saturday, 17 June 2000 13:53:17 UTC