- From: Al Gilman <asgilman@iamdigex.net>
- Date: Sat, 17 Jun 2000 14:10:17 -0500
- To: abrahams@acm.org
- Cc: xml-uri@w3.org
At 02:34 PM 2000-06-16 -0400, Paul W. Abrahams wrote:
>Al Gilman wrote:
>
>> **Summary
>>
>> The problem is both sides are assuming there is a 1:1 relationship and are
>> arguing over how to define it. There is no answer in that space. There is
>> no 1:1 relationship between namespaces and languages, between Qnames and
>> element types.
>
>Pardon a possibly naive question, but what do you mean by a language in this
>context? Do languages have a 1:1 relationship to anything else such as
>applications?

[The only dumb question is the one you were too embarrassed to ask. I doubt your question is more naive than my answer.]

What do I mean by a language.

An XML language is an invertible linearization of an ontology used in constructing messages or streams passed as a means of communicating, that follows XML rules in the final stages of tree-to-stream encoding.

Languages are to schema-defined graph types as object classes are to structured datatypes.

Did I say "a language?" I hope I said that upper layers have to be more aware than the lower levels of "the language." While there may be identified formal things that we call languages, the more important sense of 'language' is not as _an identified thing_ but as _some recognizable stuff_. This is the stuff of the message, it is the territory, relative to namespaces, not the namespace-level map. We can define a set of progressively more detailed maps that tell more and more about the territory. And progressively higher levels of processing that respond to the distinctions in each incremental map refinement.

[I speak as a fool. I don't really know what Cowan means by a map:territory 'error.' Actually, I would like to know.]

XML has proper methods: string to InfoSet, InfoSet to string. Starting and ending with an Infoset, the pair of transformations is an identity, you get back the same InfoSet. After passing through InfoSet once, you thereafter get back a canonical string.
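
[A rough sketch of the round-trip property just described, not a definitive rendering of it. It assumes Python 3.8+ and uses the standard library's xml.etree.ElementTree.canonicalize (C14N 2.0) as a stand-in for "string to InfoSet, InfoSet to string." ElementTree is not an InfoSet implementation, and the helper name round_trip and the sample documents are hypothetical.]

```python
from xml.etree.ElementTree import canonicalize  # available in Python 3.8+


def round_trip(xml_text: str) -> str:
    """Parse a string and write it back out in canonical form.

    Assumption: C14N 2.0 serialization approximates the
    string -> InfoSet -> string pair of transformations.
    """
    return canonicalize(xml_text)


# Two different strings that carry the same information:
# attribute order, quote style, and empty-element syntax differ.
a = "<doc  y=\"2\" x='1'><item/></doc>"
b = '<doc x="1" y="2"><item></item></doc>'

canon_a = round_trip(a)
canon_b = round_trip(b)

# Both inputs collapse to one canonical string, and a further
# pass through the round trip leaves that string unchanged.
same_canonical_form = canon_a == canon_b
stable_after_first_pass = round_trip(canon_a) == canon_a
```

[Once a string has been through the pair of transformations, repeating them changes nothing further, which is the sense of "thereafter get back a canonical string."]
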
I would expect "a processor" to implement a language class, a group of methods that are proper methods for languages of that class. I would hope that most languages would serve as a means of interoperation among multiple processors. There may be some processors dedicated to a specific language, but I hope there aren't too many languages dedicated to a specific processor.

What I am actually trying to establish is not "what, exactly, is a language?" because I don't think an upper limit on that in bottom-up terms is appropriate. I think that the family of XML languages should be continually growing, built bottom-up toward a vision which is not stated in bottom-up terms.

I just want to establish that there are valid concerns which may properly be addressed in the upper levels of processing for XML languages, and which will make distinctions between InfoSet nodes in different contexts that are not visible in the Qnames of those nodes. You have to check more language-definition stuff to make sure you are processing them right; so you can't assume from the Qname alone that you are going to process them the same to arbitrary levels of altitude in the processing layer stack.

One example of this is the two-dimensional structure of tables. One can create a laboratory example language where we just treat tables and do not bother with natural language connotations. Here it takes more schema than just what the syntax provides to have the column be a cell-collection in which one can query "What are the header cells that appear before me [the data cell myself] in my column?"

Languages we need to build in XML deal with this kind of common knowledge. That is to say, knowledge that should be common between different processors of the same language.

Another reasonably simple example of a purely relative rule, which applies in the context of the HTML 4.01 + WCAG language, is "the color of text should contrast with the color of the background of the text."
This doesn't tell you what color the text should be, or what color the background should be. But it does constrain the range of what the tuple {text color, background color} should be. There are better and worse pairs as far as contrast is concerned.

This is the kind of rule that would appear in the definition of an "accessible Web document language." Assertions of schema compliance in a document can tell you whether the language used has been checked by the originator against this rule or not. This would not need to change the namespace used, but it could alter the way the document was processed by the user's processor. The user could have a switch set that said that documents that declare conformance to the color contrast rule would be presented in author colors, but all others would be re-colored per a stylesheet of the user's devising.

The point that I am trying to make is that the upper layers of processing may need to make finer type-like distinctions among the nodes in the InfoSets in different contexts than is identified by comparing the Qnames corresponding to their respective element-type-name tokens.

Al

>
>Paul Abrahams
>
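
[A minimal sketch of the user switch described in the color contrast example, under stated assumptions. The Document class, its field names, the USER_COLORS pair, and the function colors_to_present are all hypothetical illustrations; nothing here is an existing API or a format defined by HTML 4.01 or WCAG.]

```python
from dataclasses import dataclass
from typing import Tuple

ColorPair = Tuple[str, str]  # (text color, background color)


@dataclass
class Document:
    # Hypothetical: True when the document asserts that it was checked
    # against the color contrast rule by its originator.
    declares_contrast_conformance: bool
    author_colors: ColorPair


# Hypothetical pair taken from a stylesheet of the user's devising.
USER_COLORS: ColorPair = ("black", "white")


def colors_to_present(doc: Document,
                      honor_declared_conformance: bool = True,
                      user_colors: ColorPair = USER_COLORS) -> ColorPair:
    """Choose the color pair to present a document with.

    With the switch on, documents that declare conformance keep the
    author's colors; all others are re-colored with the user's pair.
    """
    if honor_declared_conformance and doc.declares_contrast_conformance:
        return doc.author_colors
    return user_colors


checked = Document(True, ("navy", "ivory"))
unchecked = Document(False, ("yellow", "white"))

shown_checked = colors_to_present(checked)
shown_unchecked = colors_to_present(unchecked)
```

[Both documents could use the same namespace and the same Qnames. Only the conformance assertion differs, and that is enough for the upper layer to process them differently.]
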
Received on Saturday, 17 June 2000 13:53:17 UTC