- From: W. E. Perry <wperry@fiduciary.com>
- Date: Wed, 17 May 2000 11:10:10 -0400
- To: Tim Berners-Lee <timbl@w3.org>, xml-uri@w3.org, xml-dev@xml.org
May I respectfully disagree with these conclusions, but in so doing hope to remind the Director of some of his own better ideas.

Tim Berners-Lee wrote:

> In a distributed system, the semantics must be carried by the message.

No. Content is carried by the message, expressed in the agreed syntax. The only case in which a message may be said to carry semantics is where there is agreement (if only implicit) beforehand that particular syntactic constructs shall be treated as conveying particular (agreed and fixed) semantics. This abuse of agreement over syntax is what I refer to as 'intent'. Intent is predicative, not nominative, and thereby violates the expected neutrality, or disinterest, of XML markup as to function. The purpose of conveying intent, after all, is to shape the receiving node's local interpretation of a message toward a predictable semantic outcome. My objection to SOAP, for example, is that it is premised on conveying precisely this sort of intent.

Permitting particular syntactic constructs to be hijacked to convey specific semantics not only disregards the inherent anonymity and autonomy of the receiving node; it greatly curtails the purely syntactic possibilities, and by implication the extensibility itself, of XML. The eXtensible Markup Language is expected to be extensible through markup, not through pre-ordained assignment of syntactic constructs to semantic outcomes. In a world of such pre-ordained vocabularies (abundant examples are available in the vertical industry markup languages, not to mention SOAP), XML is beggared: it is permitted a few syntactic constructs of defined semantic intent, while the infinite remaining possibilities are reduced to NOPs. Realize that the expected or intended semantic resolution of defined syntactic constructs is a particularly pernicious form of presentation insinuated into what should be ontological markup.

> True, in most cases today semantics are best defined by what program slurps
> it up to the right effect.

Yes.
A quibble: semantics, thus understood, are not strictly speaking 'defined', but are realized through the operation of a process.

> Hence "a quicken input file" defines the semantics of a bank statement file.

No. A quicken input (i.e., data) file exhibits a syntactic form expected by the quicken executable, and when processed by that executable it yields (the quicken program's understanding of) the semantics of a bank statement. (The 'file', as a concrete expression of those semantics, is functionally otiose.)

> However, on the internet, the semantics of messages are defined in the
> specifications of the languages.

No. The implementation of particular language processing (natural or otherwise) at a particular Internet node algorithmically determines the semantic outcome of particular syntactic input against those processes.

> They are not arbitrary.

Arbitrariness is neither a useful description nor an easily measured characteristic of the semantic understanding local to a particular node. Those semantics are certainly idiosyncratic; they may be entirely private; they may or may not be useful, to that node or any other, depending on the availability of further processing capable of dealing with them in their locally-elaborated and locally-expressed form.

> The message conveys a meaning between two agents operating on behalf of two
> social entities.

No. A message is not, nor does it convey, meaning. Meaning is the product of each interpretation of the message content. I must insist on this point: you cannot simply wish it away. It is why we must write processing code to perform that interpretation. Let us be very clear about what we are trying to do. Does anyone seriously believe that functional code is, or will soon become, unnecessary for processing the instance data of each XML document or message?
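The claim that meaning is the product of each interpretation, not a property of the message, can be sketched in a few lines of Python. This is only an illustration under assumed names: the element names and both handlers are hypothetical. The same syntactic message yields two entirely different semantic outcomes under two different local processes.

```python
import xml.etree.ElementTree as ET

# One message, one agreed syntax; the vocabulary here is hypothetical.
message = "<transfer><amount>100.00</amount><account>12345</account></transfer>"

def node_a_process(doc: str) -> float:
    # Node A is a debit routine: it realizes the message as a reduction
    # of a balance. That semantics lives here, not in the markup.
    root = ET.fromstring(doc)
    return -float(root.findtext("amount"))

def node_b_process(doc: str) -> str:
    # Node B is an audit logger: it realizes the very same bytes as a
    # line of descriptive text. Neither interpretation is canonical.
    root = ET.fromstring(doc)
    return (f"transfer of {root.findtext('amount')} "
            f"against account {root.findtext('account')}")

print(node_a_process(message))  # -100.0
print(node_b_process(message))
```

Nothing in the message itself selects between these two outcomes; only the code run at each node does.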
If not, how can so many apparently accept that we will somehow come to wield markup so well that it will of itself provide, in every conceivable instance, both the semantic outcome which the sender intends and that which is specific to the unique and private capabilities and environment of the receiver? That result is possible only if we curtail the syntactic possibilities to those few to which we have assigned or mapped specific semantic outcomes beforehand.

> The semantics of HTML tags are not defined in a mathematical way but the
> semantics of a bank transfer are.

The semantics of HTML 'tags' are realized in each instance through local processing by the browser on the receiving node. That processing is algorithmic, which presumably qualifies as 'in a mathematical way'. Whether my reading of XML prevails in the end, or whether the premises of SOAP and the vertical industry data vocabularies do, there will in either case be algorithmic processing performed locally on each receiving node against each XML instance. The question is whether that process simply elaborates each syntactic structure encountered into an instance of its pre-assigned semantics, or whether that processing is truly local to the capabilities of the receiving node, to its unique understanding of the instance data, and to the unique content supplied in the instance.

> In the future, we will be able to define the semantics of a new language by
> relating it to things like quicken input files, and also by specifying
> mathematical properties of the protocols - such as the relationship between
> a check and a bank statement. In the meantime, we still use English in
> specifications. But the crucial thing is to recognize that the namespace
> identifier identifies the language of the message and so indirectly its
> meaning. The namespace identifier has to be the hook onto which I can hang
> semantic information. I don't see any other philosophical basis for XML
> messages having any meaning.
> I don't see how any alternative world would work, how you would prove that
> anyone owes you money or that the weather in Amsterdam is rainy.

This is the crux. There is no single relationship between a check and a bank statement (I have been writing code to process both since 1982). There are only relationships of an instance of one to an instance of the other, within the understanding of the entity which processes that relationship at a given moment. Each step of any such process must directly model the role--the function--of whoever or whatever is doing the processing. As the account holder, you see a different relationship between an instance of a check and a statement than the debit-processing procedure at your bank does, than the clearing house does, or than the loan officer considering your transactional history with the bank does. None of these viewpoints is canonical: they are all instance interpretations of the instance correspondence between instances of checks and instances of statements.

From these instance relationships can we infer a class, to define and code appropriate processing in each case? Yes, but we must predicate that processing on the role, the capabilities, and the viewpoint of the node which is to perform it. If you are not a checking debit processing routine, what business do you have telling a node which is one how to perform its own unique job--expressing, that is, a processing intent to that node? And if you are a checking debit processing routine, why are you handing off this work instead of doing it yourself? This is the essence of a distributed system. It can be harnessed into a pipeline of processing, but only by allowing each node to perform its own particular task in its own unique way. At each node the outcome of each such task reflects the semantic understanding of that particular node--what other understanding could it reflect?
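The point about instance relationships can be made concrete in a short Python sketch. The record layout and the three role functions are hypothetical illustrations, not any bank's actual processing: one instance of a check, realized three ways by three roles, with no canonical view among them.

```python
# One instance of a check; field names are hypothetical.
check = {"number": 1041, "amount": 250.00, "payee": "Acme Corp", "cleared": True}

def account_holder_view(chk, statement_balance):
    # The holder relates check to statement as reconciliation of a balance.
    return statement_balance - chk["amount"] if chk["cleared"] else statement_balance

def debit_processor_view(chk):
    # The bank's debit routine relates the same instance as a posting
    # instruction against the account.
    return ("POST", chk["number"], chk["amount"])

def loan_officer_view(chk):
    # The loan officer relates it as one datum of transactional history.
    return {"counterparty": chk["payee"], "outflow": chk["amount"]}

print(account_holder_view(check, 1000.00))  # 750.0
print(debit_processor_view(check))
print(loan_officer_view(check))
```

Each function models the role of the node performing the processing; the "relationship between a check and a bank statement" is whatever the executing role computes it to be.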
There will be other nodes--some known to that node and some not--which will want to take the output of that node's process and perform further processing upon it--to attach themselves, that is, to a pipeline of processing at that point. That pipeline of process is forked by the action of new nodes attaching themselves and consuming completed work. It cannot be the responsibility of the node whose task is completed to restate and present that work product in the form each of those nodes would find best suited to its own unique processes. In the first place, the prior node may not know which nodes are consuming its product, nor for what purpose. In the second place, the prior node cannot--because it is not uniquely specialized in the tasks of the later nodes--know which of its output they might require, or in what form. Generally, those later nodes will require other data to be combined into their processes, from sources of which the prior node likely knows nothing.

The point is that this pipeline of process *is* the Semantic Web. The semantics are local and presumed unique to each node. The web is the exchange of messages in a syntactically agreed form. Extensibility is unlimited and achieved through new constructions of the accepted syntax. Processing is local, where the specific unique expertise for the locally unique understanding of the problem is to be found.

Respectfully,

Walter Perry
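P.S. The pipeline described above can be sketched in Python; every name here is a hypothetical illustration. The producer emits its completed work once, in the agreed syntactic form, and new nodes fork the pipeline by attaching themselves and consuming that work in their own unique way, combining it with local data the producer knows nothing of.

```python
# A sketch of the forked pipeline; all names are hypothetical.

def producer():
    # The producing node emits its work product in one agreed form;
    # it does not restate that work for any particular consumer.
    return {"kind": "debit", "amount": 42.0, "account": "12345"}

consumers = []

def attach(fn):
    # A new node forks the pipeline simply by attaching itself.
    consumers.append(fn)
    return fn

@attach
def ledger_node(msg):
    # Combines the message with local data (a fee schedule) of which
    # the producer knows nothing.
    local_rates = {"12345": 0.01}
    return msg["amount"] * (1 + local_rates.get(msg["account"], 0.0))

@attach
def alert_node(msg):
    # Realizes the very same message as a human-readable notice.
    return f"debit of {msg['amount']} on account {msg['account']}"

work = producer()
results = [consume(work) for consume in consumers]
print(results)
```

The producer never learns who consumed its output or to what semantic end; each consumer's result reflects only that consumer's local understanding.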
Received on Wednesday, 17 May 2000 11:10:12 UTC