- From: W. E. Perry <wperry@fiduciary.com>
- Date: Tue, 16 May 2000 11:35:39 -0400
- To: "Clark C. Evans" <cce@clarkevans.com>, xml-uri@w3.org
Hi Clark. I apologize that this was so high-flown. I tried to pitch it to the level of TimBL's posting; sorry I missed.
The syntax vs. semantics problem is pervasive in the XML community, is only barely understood, and has littered the specs with nasty booby traps--the "coordination hassle" and the 'empty' URI as a namespace reference, both of which I mentioned in my posting, being only two of the simplest and most obvious. XML 1.0 is, except for a very few lapses, a specification of syntax. That, coupled with its original introduction of the radical concept of well-formedness (WF), makes it almost 'intent-free', particularly as compared to SGML, which is bound to the content-model intent expressed in the DTD, and to HTML, which is tangled in the intent of presentation. XML reversed the relationship of model and instance: as syntax, it described the possibilities of the instance; the model could be inferred, if it were necessary to know it at all.
Even more remarkably, XML implies an instance processing model: as XML is extensible through markup (and in my unorthodox view, only through markup), an XML processor--something more elaborate and powerful than a parser--need only follow the markup wherever it leads (I have some opinions about that in a slightly dated whitepaper at http://www.uniqueness.net/whitepaper.html). The thing is, this processing model cannot be applied a priori, but only in the instance, at the moment the 'semantics' of a document are derived, for that reader, in that particular context, through the exercise of that reader's own unique capabilities. In an Internet topology of autonomous, largely anonymous peer nodes, those capabilities and that unique instance environment cannot be known, much less presumed, by an outsider transmitting a message. Therefore those transmissions can be properly treated only as messages--the communication of content within a known syntax--not as requests, procedure calls, or semantic baggage of any sort.
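As a rough illustration of the point that a well-formed instance can be read without any DTD and that a model can be inferred from the instance itself, here is a minimal Python sketch. The sample document, the element names, and the helper functions `is_well_formed` and `infer_model` are all hypothetical, invented for this illustration; they are not taken from the posting or from any specification.

```python
import xml.etree.ElementTree as ET

# Hypothetical instance document: no DTD or schema accompanies it.
DOC = """<order id="17">
  <customer>Acme</customer>
  <item sku="a1"><qty>2</qty></item>
  <item sku="b2"><qty>5</qty><note>rush</note></item>
</order>"""


def is_well_formed(xml_text):
    """Return True if the text parses as well-formed XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


def infer_model(xml_text):
    """Infer a crude content model from one instance.

    Returns a dict mapping each element name to the set of child
    element names observed under it in this particular document.
    """
    root = ET.fromstring(xml_text)
    model = {}
    for elem in root.iter():
        children = model.setdefault(elem.tag, set())
        for child in elem:
            children.add(child.tag)
    return model


if __name__ == "__main__":
    print(is_well_formed(DOC))
    for name, children in sorted(infer_model(DOC).items()):
        print(name, sorted(children))
```

The inferred model describes only what this one instance happens to contain, which is the sense in which the model follows from the instance rather than constraining it in advance.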
Because the enforceable definition of XML is strictly syntactic, these past 18 months of specifications conditioned by semantic intent have given us an ungainly philosophical mismatch. Specs are being written to imply processing and to effect a particular semantic outcome at the node in the instance. Implementing those specs forces too much of the unique extensibility of XML to be curtailed, in the service of a bogus semantic intent. At this point, it is not useful that we confine ourselves to debating--and, I hope, solving--the particular problem of the nature of namespace specifiers. This is one symptom, among many, of the confusion of syntax and semantics in the definition, and the subsequent implementation, of XML specifications.
On a project for which I have struggled to shape a workable design over the past several months, I have had to deal with what I believe is the general case of this problem. I have learned that the interoperable part, the content of the message exchanged, the physical expression of the signifier, if you will, must be, and must remain in some unalterable form, simply text--specified, that is, solely by its syntax. At the same time, the reason for using that signifier is inescapably semantic, and that inevitably semantic identity and content is precisely what TimBL is struggling with. What the XML community has not yet clearly focussed on is when, where and how that semantic content is understood--is, in fact, derived. The answer is that uniquely local semantics are derived at each node in the moment of processing the instance. Those are true semantics in precisely the sense that TimBL understands as requisite building blocks of the semantic web, but such semantics are absolutely local and beyond the power of the agreed syntax of XML 1.0 to communicate. I have had some months to try out the variations of this processing and believe that I have found the only really workable approach.
Essentially it is this: the transmitter of a message is free (indeed, should be encouraged) to indicate the semantics he would associate with the content he provides. In practice, that can be done in XML 1.0 syntax only by referencing the process by which he would derive those semantics from that content. If he had no concern for cross-platform interoperability, he might simply indicate an executable which, given the configuration of his local system, and with access to some of the same resources, would process that content to yield the same semantics he understands. To work in a cross-platform world, that process would have to be specified (we assume in XML 1.0 syntax) as algorithmic logic which would derive the expected semantic results from the given content. In all likelihood, that process would also require access to some of the same reference resources on which the transmitter of the message relied for his own semantic understanding of that message content.
In other words, the answer to TimBL's question is not a choice: first, because none of the apparent alternatives fully satisfies the (very real!) need for both syntactic and semantic signification and, second, because there is no single choice to be made, but a process which must be performed in each instance, which in each instance will yield a unique result, uniquely suited to the circumstances of that instance. There are other problems in the internetworked world which have had to be solved in just this way--with a process working, perhaps iteratively, on the specifics of the instance rather than with a single definitive choice: DNS, for example.
The greater point is that this particular 'coordination hassle' is indeed a showstopper and must in fact be solved now, but it is only one such symptom among many, and the solution we choose should be one generally applicable to the problem.
"Clark C. Evans" wrote:
> Walter,
>
> This completely went over my head. Is there an abstract I could request?
>
> Thanks,
>
> Clark
Received on Tuesday, 16 May 2000 11:35:46 UTC