Re: Action item on syntax-based interoperability from Michael Champion on 2003-10-24 (www-tag@w3.org from October 2003)

From: Michael Champion <mc@xegesis.org>
Date: Fri, 24 Oct 2003 11:11:12 -0400
To: www-tag@w3.org
Message-Id: <4DC5B944-0634-11D8-B71A-000A95CCC59E@xegesis.org>

Tim Bray writes:

"We disagree profoundly.  In my career I saw many ambitious API-centric 
attempts at smooth network interoperability prove to have lousy 
cost-effectiveness, including various RPC stacks, Corba, DCOM, and so 
on and so on.

The Web succeeded in many arenas where they failed, and
one important reason is that it never subscribed to the myth of the 
interoperable data model.  If the Webarch doc is not to be used to 
document the reasons for the success of the Web, then it is useless and 
we should stop working on it. "

I really hope the disagreement isn't as profound as it appears on the 
surface. Clearly the Web has succeeded in ways that RPC stacks, CORBA, 
etc. failed.  Most thoughtful observers seem to agree that it's not 
simply a matter of the Web "standards" being widely supported rather 
than fought over, and there are deeper architectural principles at work 
here.  Many analyses of the success of the Web point to the "loose 
coupling" among its components as the most fundamental architectural 
principle, and I for one agree that the textual format of Web protocols 
and the "view source principle" are an important aspect of the Web's 
loose coupling.  But I still find the proposed draft text --
"The general success of Web software is evidence that interoperability 
in networked information systems is best achieved by specifying 
interfaces at the level of concrete syntax rather than abstract data 
models or APIs." --
quite overstated.  Maybe it would help to be more concrete about what I 
fear that it recommends and rejects by implication.

Looking back, it suggests that one should look mainly at the syntax, and 
not the implied data model and processing model of HTTP and HTML in 
accounting for the success of the Web.  In my understanding, the 
"textuality" (grin) of HTML was indeed a necessary condition for the 
Web's success, but it also required a shared conceptual model of what a 
Web page "is" to succeed in the fashion it actually did.  That is, to 
the best of my knowledge (having never actually looked at Web browser 
code), the way a real browser makes sense of "tag soup" is to try to 
fit it into an abstract or concrete data model of a Web page.  
Logically the Web COULD have worked if browser developers had worked 
from the "concrete syntax" of the HTML specs and applied rules for how 
to display it, but AFAIK no one had much success working forward from 
the pure grammar to build a viable web browser.  We can all lament tag 
soup, but any analysis of the success of the Web that doesn't account 
for its pervasiveness seems a bit sterile to me.  I think one can 
account for its success by postulating that both the human-readable 
text and a shared abstract conception of what a Web page consists of 
were necessary conditions.  That seems quite inconsistent with the 
proposed text about the primacy of syntax.
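
As a rough illustration of the "fit it into a data model" idea (a toy 
sketch only, not drawn from any real browser's code), the Python below 
uses the standard library's tolerant html.parser to turn markup into a 
small tree. The Node class, the AUTO_CLOSE and VOID repair rules, and 
the to_model helper are all hypothetical names invented for this 
example; real browsers apply far more elaborate rules.

```python
from html.parser import HTMLParser


class Node:
    """One element in a toy page model (hypothetical, for illustration)."""

    def __init__(self, tag, parent=None):
        self.tag = tag
        self.parent = parent
        self.children = []  # child Node objects or text strings


class SoupTreeBuilder(HTMLParser):
    # Assumed repair rules: a new <p> or <li> implicitly closes an open
    # element of the same name; void elements never get children.
    AUTO_CLOSE = {"p", "li"}
    VOID = {"br", "hr", "img"}

    def __init__(self):
        super().__init__()
        self.root = Node("#document")
        self.current = self.root

    def handle_starttag(self, tag, attrs):
        if tag in self.AUTO_CLOSE:
            self._close(tag)
        node = Node(tag, self.current)
        self.current.children.append(node)
        if tag not in self.VOID:
            self.current = node

    def handle_endtag(self, tag):
        self._close(tag)

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.current.children.append(text)

    def _close(self, tag):
        # Walk up to the nearest open element with this name, closing
        # anything left open inside it; stray end tags are ignored.
        node = self.current
        while node is not self.root and node.tag != tag:
            node = node.parent
        if node is not self.root:
            self.current = node.parent


def outline(node):
    """Render the tree as nested tuples so two models can be compared."""
    kids = [outline(c) if isinstance(c, Node) else c for c in node.children]
    return (node.tag, kids)


def to_model(markup):
    builder = SoupTreeBuilder()
    builder.feed(markup)
    builder.close()
    return outline(builder.root)


soup = "<p>one<p>two<b>bold</p>"           # unclosed <p> and <b>
clean = "<p>one</p><p>two<b>bold</b></p>"  # well-formed equivalent
print(to_model(soup) == to_model(clean))
```

Under these assumed rules the sloppy and the well-formed markup land on 
the same tree, which is the sense in which a shared model of the page, 
and not the grammar alone, lets differing syntax interoperate.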

Looking ahead, I fear that this implies that XQuery will not be seen by 
the TAG as a viable platform for interoperability over the Web, and 
that IMHO is the whole POINT of XPath and XQuery in many real world 
situations.  I'm not by any means speaking officially for my employer, 
but we have essentially reconciled the conflicting demands of 
efficiently querying and faithfully representing XML by saying 
something like "we store and query an Infoset representation of input 
XML and return a serialization of that Infoset; some information will 
be returned in a logically identical way that has a different concrete 
syntax."  XQuery takes this further and explicitly builds on a 
reference data model that is sufficiently abstract to describe data 
that has never been wrapped in an angle bracket, e.g. an RDBMS table.  
I see this as profoundly important to the Web, because it allows 
concrete syntax in XML files, XML information in XML databases, and 
non-XML data in Object-Relational databases to be processed and 
integrated within a common framework over the Web. Tim's reply 
addresses the case "where the data being processed has been pulled out 
of an RDBMS (this is a good thing, but it's not a Web thing)."  Really? 
Maybe I'm missing something here.
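
To make "logically identical ... different concrete syntax" concrete, 
here is a minimal Python sketch. It uses Canonical XML 
(xml.etree.ElementTree.canonicalize, available from Python 3.8) as a 
stand-in for "same Infoset"; that substitution is an assumption of this 
example, since canonicalization and the Infoset are related but 
distinct specifications. The sample documents and the same_information 
helper are hypothetical.

```python
from xml.etree.ElementTree import canonicalize

# Hypothetical documents: quoting style, attribute order, and the
# empty-element form differ, but elements, attributes, and text match.
doc_a = "<book id='1' lang='en'><title>Web</title><note/></book>"
doc_b = '<book lang="en" id="1"><title>Web</title><note></note></book>'
# Same shape as doc_b, but the id attribute carries a different value.
doc_c = '<book lang="en" id="2"><title>Web</title><note></note></book>'


def same_information(left: str, right: str) -> bool:
    """Compare two XML strings after normalizing their concrete syntax."""
    return canonicalize(left) == canonicalize(right)


print(doc_a == doc_b)                   # byte-level comparison
print(same_information(doc_a, doc_b))   # comparison after normalization
print(same_information(doc_a, doc_c))   # genuinely different content
```

A store that round-trips doc_a as doc_b has changed the bytes but not 
the information, whereas doc_c differs in substance; a comparison 
defined purely on concrete syntax cannot tell those two cases apart.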

So, I agree with Tim that API-alone, datamodel-alone architectures 
don't support interoperability, but the Web, past and future, requires 
more than JUST concrete syntax to support interoperability. I hope the 
TAG adds some words along these lines. If not, I would like to hear an 
explanation of why XML Infoset databases and RDBMS systems that are 
queried via XQuery over HTTP should not be considered to be 'a Web 
thing' :-)
Received on Friday, 24 October 2003 11:11:14 UTC