
Re: Welcome to the XML-URI list

From: W. E. Perry <wperry@fiduciary.com>
Date: Tue, 16 May 2000 11:35:39 -0400
Message-ID: <39216ACB.99542D72@fiduciary.com>
To: "Clark C. Evans" <cce@clarkevans.com>, xml-uri@w3.org

Hi Clark.

I apologize that this was so high-flown. I tried to pitch it to the level of TimBL's
posting; sorry I missed.

The syntax vs. semantics problem is pervasive in the XML community, is only barely
understood, and has littered the specs with nasty booby traps--the "coordination
hassle" and the 'empty' URI as a namespace reference which I mentioned in my posting
being only two of the simplest and most obvious.

XML 1.0 is, except for a very few lapses, a specification of syntax. That, coupled
with its original introduction of the radical concept of well-formedness (WF), makes it almost
'intent-free', and particularly so as compared to SGML, bound to the content-model
intent expressed in the DTD, and HTML, tangled in the intent of presentation. XML
reversed the relationship of model and instance:  as syntax, it described the
possibilities of the instance; the model could be inferred, if it were necessary to
know it at all. Even more remarkably, XML implies an instance processing model:  as
XML is extensible through markup (and in my unorthodox view, only through markup), an
XML processor--something more elaborate and powerful than a parser--need only follow
the markup wherever it leads (I have some opinions about that in a slightly dated
whitepaper at http://www.uniqueness.net/whitepaper.html). The thing is, this
processing model cannot be applied a priori, but only in the instance, at the moment
the 'semantics' of a document are derived, for that reader, in that particular
context, through the exercise of that reader's own unique capabilities. In an Internet
topology of autonomous, largely anonymous peer nodes, those capabilities and that
unique instance environment cannot be known, much less presumed, by an outsider
transmitting a message. Therefore those transmissions can be properly treated only as
messages--the communication of content within a known syntax--not as requests,
procedure calls, nor semantic baggage of any sort.
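
As a rough illustration of the point that the model can be inferred from the instance, here is a minimal Python sketch. It parses a document that is merely well-formed, with no DTD, and records which child elements were actually seen under each element. The function name `infer_model` and the sample document are hypothetical and are not taken from any spec discussed here.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def infer_model(xml_text):
    """Map each element name to the set of child element names seen under it.

    Only well-formedness is required: no DTD or schema is consulted.
    """
    root = ET.fromstring(xml_text)  # raises ParseError if not well-formed
    model = defaultdict(set)
    for parent in root.iter():  # the root itself and every descendant
        model[parent.tag]  # touch the key so leaf elements are recorded too
        for child in parent:
            model[parent.tag].add(child.tag)
    return dict(model)


# Hypothetical instance document, used only for illustration.
SAMPLE = (
    "<order>"
    "<item><sku>A1</sku></item>"
    "<item><sku>B2</sku><qty>3</qty></item>"
    "</order>"
)

if __name__ == "__main__":
    for name, children in sorted(infer_model(SAMPLE).items()):
        print(name, sorted(children))
```

The inferred model describes only what this one instance happened to contain, which is the sense in which the model follows from the instance rather than preceding it.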

Because the enforceable definition of XML is strictly syntactic, these past 18 months
of specifications conditioned by semantic intent have given us an ungainly
philosophical mismatch. Specs are being written to imply processing and to effect a
particular semantic outcome at the node in the instance. Implementing those specs
forces too much of the unique extensibility of XML to be curtailed, in the service of
a bogus semantic intent. At this point, it is not useful that we confine ourselves to
debating--and, I hope, solving--the particular problem of the nature of namespace
specifiers. This is one symptom, among many, of the confusion of syntax and semantics
in the definition, and the subsequent implementation, of XML specifications.

On a project for which I have struggled to shape a workable design over the past
several months, I have had to deal with what I believe is the general case of this
problem. I have learned that the interoperable part, the content of the message
exchanged, the physical expression of the signifier, if you will, must be, and must
remain in some unalterable form, simply text--specified, that is, solely by its syntax.
At the same time, the reason for using that signifier is inescapably semantic, and
that inevitably semantic identity and content is precisely what TimBL is struggling
with. What the XML community has not clearly focussed on yet is when, where and how
that semantic content is understood--is, in fact, derived. The answer is that uniquely
local semantics are derived at each node in the moment of processing the instance.
Those are true semantics in precisely the sense that TimBL understands as requisite
building blocks of the semantic web, but such semantics are absolutely local and
beyond the power of the agreed syntax of XML 1.0 to communicate.

I have had some months to try out the variations of this processing and believe that I
have found the only really workable approach. Essentially it is this:  the transmitter
of a message is free (indeed, should be encouraged) to indicate the semantics he would
associate with the content he provides. In practice, that can be done in XML 1.0
syntax only by referencing the process by which he would derive those semantics from
that content. If he had no concern for cross-platform interoperability, he might
simply indicate an executable which, given the configuration of his local system, and
with access to some of the same resources, would process that content to yield the
same semantics he understands. To work in a cross-platform world, that process would
have to be specified (we assume in XML 1.0 syntax) as algorithmic logic which would
derive the expected semantic results from the given content. In all likelihood, that
process would also require access to some of the same reference resources on which the
transmitter of the message relied for his own semantic understanding of that message.
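
The approach described above might be sketched roughly as follows, in Python. This is a simplification under stated assumptions, not the design of the project mentioned: the `process` attribute, the `urn:example:sum-line-items` reference, the `Node` class and both handlers are hypothetical. The sketch also substitutes a local lookup table for a process specified as algorithmic logic in XML 1.0 syntax. What it tries to show is only that the message carries content plus a reference to a process, while each node derives its own semantics at the moment it processes the instance.

```python
import xml.etree.ElementTree as ET

# Hypothetical message: plain XML 1.0 content plus a reference to the
# process the transmitter would use to derive semantics from it.
MESSAGE = """<message process="urn:example:sum-line-items">
  <line amount="10.50"/>
  <line amount="4.25"/>
</message>"""


def sum_line_items(root):
    """One possible local reading: the total of the line amounts."""
    return sum(float(line.get("amount")) for line in root.iter("line"))


def count_line_items(root):
    """A different local reading of the same content: how many lines."""
    return len(list(root.iter("line")))


class Node:
    """A receiving peer with its own table of processes (an assumption)."""

    def __init__(self, processes):
        # Maps a process reference to whatever this node chooses to run.
        self.processes = processes

    def receive(self, xml_text):
        root = ET.fromstring(xml_text)  # syntax is all that is guaranteed
        handler = self.processes.get(root.get("process"))
        if handler is None:
            return None  # the suggestion is not binding on the receiver
        return handler(root)


if __name__ == "__main__":
    node_a = Node({"urn:example:sum-line-items": sum_line_items})
    node_b = Node({"urn:example:sum-line-items": count_line_items})
    print(node_a.receive(MESSAGE), node_b.receive(MESSAGE))
```

Both nodes receive identical text, yet each result depends on what that node does with the referenced process, and a node with no matching entry derives nothing at all.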

In other words, the answer to TimBL's question is not a choice:  first, because none
of the apparent alternatives fully satisfies the (very real!) need for both syntactic
and semantic signification and, second, because there is no single choice to be made,
but a process which must be performed in each instance, which in each instance will
yield a unique result, uniquely suited to the circumstances of that instance. There
are other problems in the internetworked world which have had to be solved in just
this way--with a process working, perhaps iteratively, on the specifics of the
instance rather than with a single definitive choice:  DNS, for example. The greater
point is that this particular 'coordination hassle' is indeed a showstopper and must
in fact be solved now, but it is only one such symptom among many, and the solution we
choose should be one generally applicable to the problem.

"Clark C. Evans" wrote:

> Walter,
> This completely went over my head.  Is there an abstract I could request?
> Thanks,
> Clark
Received on Tuesday, 16 May 2000 11:35:46 UTC