- From: Yaron Goland <yarong@Exchange.Microsoft.com>
- Date: Thu, 24 Feb 2000 00:13:24 -0800
- To: "'Eric Sedlar'" <esedlar@us.oracle.com>
- Cc: w3c-dist-auth@w3.org, "'yaron@goland.org'" <yaron@goland.org>
- Message-ID: <7DE119D3D0E15543874F7561EECBDBED02619DFA@BEG.platinum.corp.microsoft.com>
(Note: I found this in my drafts directory, figured I would send it out.) See below.

[WARNING: O.k. I admit it, Eric's letter sent me down memory lane. So this e-mail is seriously long. However, if you are interested in some of the history behind how XML ended up the way it is, how XML ended up in WebDAV, and some of the work I and others at MS did to make WebDAV actually happen, read on!]

-----Original Message-----
From: Eric Sedlar [mailto:esedlar@us.oracle.com]
Sent: Mon, January 03, 2000 4:33 PM
To: Yaron Goland
Cc: w3c-dist-auth@w3.org
Subject: Re: Your proposal

Now we get into our areas of disagreement.

* I mentioned HTTP as having lousy performance WITHIN A SINGLE PROCESS. I guarantee you that a function call is much cheaper than formatting the request into a stream, writing it into a loopback socket, reading the result back, parsing it, and then doing the reverse. Another example is SQL access to the properties of a resource. If I want to query the resource data efficiently, I need to be able to treat properties of the resource as "virtual data", not make a series of function calls on each item in the query set. If I want to write a servlet that takes an HTML form input and changes WebDAV properties, I definitely want a functional API (in Java).

Everyone knows that SMB sucks and is incredibly chatty. As far as HTTP performance goes, it mostly depends on the amount of functionality you can pack into one request. If I have to issue a GETLOCKEDNAME request followed by a bunch of other requests to get all of the state of a particular resource, HTTP will suck. My point was that the WebDAV model is actually bigger than HTTP--other data access systems (like SQL and XML technologies) should also be able to manipulate the data easily.

<Yaron> I certainly wouldn't want someone accessing a DB on their local machine to have to run their requests through DAV. What a waste, a full trip through the protocol stack just for a local request? Of course they should be able to work through SQL and other mechanisms.

In fact, in fulfilling my role as the lead author of the WebDAV specification and the chief WebDAV evangelist at Microsoft, I worked hand-in-hand with the ODBC/OLE DB/ADO teams at Microsoft to ensure that the model we were putting together would be easily accessible through their APIs. That is why, for example, OLE DB 2.5 has a native WebDAV provider. This is what was used to implement the WebDAV support in Web Folders.

I also worked hand-in-hand with the SMB folks at Microsoft to make sure that the WebDAV model was such that we could run WebDAV directly over the file system and at the same time access it through the Win32 file APIs and SMB without any conflicts. That is why Win 2000 is able to ship a native WebDAV implementation that works directly off the file system, allowing for simultaneous access through FTP/SMB/WebDAV/Win32.

At the same time I also coordinated with the Exchange group at Microsoft to ensure that the WebDAV model was consistent with POP/IMAP/MAPI. That is why Exchange 2000 (the upcoming release of Exchange) has a full WebDAV implementation that enables you to access your e-mail box simultaneously through POP/IMAP/MAPI/WebDAV. I also coordinated with our Active Directory team to ensure that the WebDAV model was consistent with LDAP/ADSI. And I coordinated with the SQL team to ensure that WebDAV could be easily layered on top of SQL, so that one could do direct SQL queries as well as normal work.
There are a number of other groups that I also coordinated with, but unfortunately I can't release any data about that. Look forward to some interesting uses of WebDAV coming down the line. My favorite current example is pointing Office 2000 at Exchange 2000. Office sees Exchange as just another file store, while Exchange sees Office as just another WebDAV client. The result is that you can save your files in your e-mail directories, thus having your files automatically replicated and backed up for you. Very cool use of DAV. I think we will see even cooler uses of DAV when DASL starts to get deployed. With DAV/DASL you can control/search your e-mail store, file store, directory or database with the same protocol using the same data model. I have been working for over 3 years to bring this vision to fruition. So obviously I am deeply interested in the issues of enabling access to stores that support WebDAV through many APIs and protocols.

That having been said, I disagree with your statement that having to make multiple requests to describe a resource will destroy performance. A properly implemented HTTP server will support pipelining which, as repeated experiments have demonstrated, gives performance equivalent to having one big message but with much cleaner protocol semantics. So long as the connection is kept open (the default in HTTP/1.1) and the process is not spun down after each request, there is no difference between processing one huge message and processing a string of messages that are pipelined in. In fact, in the WebDAV Book of Why I discuss in detail how Depth came about and why we used Depth instead of just pipelining. It is an interesting story; you will see that while performance was involved, it had nothing to do with the cost of parsing messages. You might want to read the section entitled "This is another fine protocol you've gotten me into!", available at http://lists.w3.org/Archives/Public/w3c-dist-auth/1998OctDec/0303.html.

Keep in mind, btw, that the server does need to be a bit smart. For example, it should wait a little bit before processing a message in order to see if the next message can be bundled into a single API call to the underlying system. There are lots of tricks, but these tricks are all well known. So it isn't a big deal. </Yaron>

* Another reason for making all information about a resource available as properties is using XSL to format an HTML page giving access to all of this data. There is a lot of support for not requiring a lot of programming (a la JSPs or ASPs) to format data within a page.

<Yaron> No one I'm aware of is arguing against making the information available through XML. The argument is simply this: do we get the information through a PROPFIND, or do we get it through multiple requests, each of which is free to return XML? My response is that we must do the latter because of the need to support content negotiation. In order to keep the WebDAV property model as simple as possible it does not support any form of content negotiation. This means that I can't say "I want the name property in French" or "I want the picture you have stored in this property returned as JPEG." The only way to get this sort of negotiation is through individual method requests. If we only provide these properties through PROPFIND then we dumb DAV down to the level of a static database table and prevent people from developing more intelligent systems. (A rough sketch of what a couple of these individual, pipelined requests look like on the wire follows.)
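Here is a minimal sketch of that idea, assuming a hypothetical server at dav.example.com: GETLOCKEDNAME is the example extension method mentioned later in this thread, GETTHUMBNAIL is made up here for illustration, and both requests are written back-to-back on one HTTP/1.1 connection so the server is free to process them as a batch.

    // Sketch only: two individually negotiated requests pipelined on one
    // HTTP/1.1 connection. dav.example.com and GETTHUMBNAIL are hypothetical;
    // GETLOCKEDNAME is the example method discussed later in this thread.
    import * as net from "net";

    const socket = net.connect(80, "dav.example.com", () => {
      // Write both requests back-to-back without waiting for the first response.
      socket.write(
        "GETLOCKEDNAME /docs/report.doc HTTP/1.1\r\n" +
          "Host: dav.example.com\r\n" +
          "Accept-Language: fr\r\n" +      // "give me the name in French"
          "\r\n" +
          "GETTHUMBNAIL /docs/report.doc HTTP/1.1\r\n" +
          "Host: dav.example.com\r\n" +
          "Accept: image/jpeg\r\n" +       // "return the stored picture as JPEG"
          "Connection: close\r\n" +
          "\r\n"
      );
    });

    let raw = "";
    socket.on("data", (chunk) => { raw += chunk.toString("latin1"); });
    socket.on("end", () => {
      // Both responses come back in order on the same connection; a real
      // client would split and parse them rather than dumping the raw stream.
      console.log(raw);
    });

As long as the connection stays open and the server waits a beat so that adjacent requests can be bundled into a single call to the underlying store, the per-message overhead is essentially the same as for one big request.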
The only real counter-argument I see against using multiple methods is the possible performance ramifications, but as I argued above, the introduction of pipelining into HTTP/1.1 makes this largely a non-issue. </Yaron>

* I don't understand why you think it will be easier to handle protocol extensions on the client side by using additional HTTP methods rather than resources. XML applications are used to ignoring elements that they don't care about--extensibility is one of the key wins of XML vs. traditional OO languages. It allows DOM objects to be passed around multiple areas of the system better. For example, let's say that I have some client library which implements the client side of the HTTP protocol, and allows plugins to that client to access the XML DOM for each resource. I can install a new client plugin to handle new properties that appear in that DOM, but I won't be able to extend the HTTP protocol. HTTP protocol implementation is typically at a lower level of the client or server than properties.

<Yaron> The main reason I believe that adding new methods is easy is that I have implemented a number of XML/DAV systems and it has repeatedly proven simpler to support new methods than new elements. This is mostly because the HTTP model is flat, so I can access things like the method name or a header directly, whereas with XML I have to navigate through a tree. This is the same logic that leads people to use attributes in XML, which, as I argue in the WebDAV Book of Why, is a really stupid thing to do. But I digress.

As for ignoring elements, it is funny to hear someone say that XML is used to ignoring elements. I first became involved with XML before it was called XML. I had helped write a proposal called Web Collections that would later be put together with a bunch of other proposals in order to become what we now know as XML. In fact, my boss at the time was Thomas Reardon, who was MS's representative to the W3C and the leading proponent, both at MS and the W3C, for XML. Nowadays only the old timers remember the work he did in making XML real, but in my estimation without him XML never would have taken off. As such I have been involved in helping with the XML design since day one.

One of the problems we ran into with XML when it was first introduced was the question of handling unrecognized elements. At the time the SGML heads had total control over XML, much to the anger and dismay of those of us born after the turn of the last century. The SGML heads were arguing hard and fast against having a definition of well-formedness. What they wanted to do was to require that a DTD be available for all XML documents. These were the same geniuses who argued against the introduction of namespaces. In fact it was their sheer cussedness that prevented us from being able to just name an XML element as a URL. They refused to allow in the forward slash and other characters that would have been necessary to allow URLs in the names of elements. That is where the namespace hack came from. Anyway, I'm digressing.

So the SGML heads were arguing that you always had to have a DTD; therefore, naturally, if you had an "unrecognized" element it was an error and you should fail the entire XML document! Those of us from the post-Diluvium age were aghast. Unfortunately Tim Berners-Lee (TBL) was convinced that we had to keep the support of the SGML community.
I remember getting lectured by Andrew Layman (who hated what the SGML guys were doing as well) about the "millions" of SGML documents out there and the need to work with them. Of course now nobody knows or cares about those ***BLEEP*** SGML documents, but a lot of the really awful crap you see in XML is the result of the deals that TBL made with the SGML guys to keep their support. TBL touches on this topic a little bit in his book.

Anyway, the SGML guys were finally argued into accepting the idea of "well-formedness" and thus we were freed of needing a DTD. However, this left open the question of what to do when you hit an element you didn't recognize. Oh man, were there flame fests on that topic. People in the protocol business tend to think of XML as just a tree, so for us it is natural to just prune the XML tree at an unrecognized node. But the SGML guys don't see a tree when they see XML, they see a markup language. So, for example, imagine I have the following entry in my XML document: "<p>He was a <bold>grand</bold> man!</p>". Now let us imagine that your XML parser supports the <p> tag but doesn't support <bold>. If you just pruned the tree you would end up with "<p>He was a man!</p>". That is clearly not what you want.

The result was just a real mess. Some people demanded that you always prune. Some demanded that you only remove the tags but not the text. Some argued that you should have an attribute you would put on each element that would specify what you should do. There never was much of a resolution to the issue. Eventually I just got really pissed off and added what is now the first paragraph of section 14 of RFC 2518 to WebDAV, declaring that we would prune any sub-tree whose root we didn't recognize. Man, did we ever catch hell for that! The XML guys told us that we wouldn't be compatible with their XML. That we were creating our own world that wouldn't work with the rest of the XML universe. Oh man, what a bloody mess. So to hear you say that the WebDAV ignore rule (as we called it) is a strength of XML is pretty damn ironic.

As for HTTP generally appearing at a lower layer and not being available from the DOM, you may want to check out http://msdn.microsoft.com/xml/reference/scriptref/XMLHttpRequest_object.asp and http://msdn.microsoft.com/xml/reference/cvbref/IXMLHttpRequest_interface.asp. The links describe the XMLHTTP interface available through IE 5.0's DOM. The first link describes the interface for script programmers (JScript/ECMAScript, VBScript, Python, Perl, etc.) and the second for C/VB programmers. This interface allows you to send and receive generic HTTP messages directly through IE 5.0's DOM. What is really cool about it is that while it can handle arbitrary HTTP messages, it is smart: if you send or receive a message with an XML body it will automatically detect that, parse it, load it up, and pass back a DOM pointer to the contents. So HTTP access is actually as free and ready as XML access, at least in IE 5.0. BTW, massive credit goes to Alex Hopmann, who nearly single-handedly convinced the IE guys to ship this feature. Hats off to Alex! </Yaron>

HTTP request/response parsers are not generic technology like XML parsers are.

<Yaron> I'm not sure what leads you to this conclusion.
For example, as the author of the HTTP over UDP spec I found that HTTP request/response parsers were incredibly generic. This is what made HTTP over UDP so compelling. We could take existing parsers, do a little tweaking and bam, it just worked. In fact I would argue that there is no better proof of the generic nature of HTTP parsers than the existence of WebDAV. We didn't have to do any major work to support WebDAV in our HTTP request/response parsers. HTTP already understood the concept of methods and headers, so we were able to use the existing parsers to implement WebDAV. In fact the biggest problem we ran into was that some folks had implemented checks on what methods you could send as an error detection procedure! Sigh... In WinInet we had the problem that the devs, in order to optimize performance, had built a method mapping table that turned the method names into tokens and then moved the tokens around. Unfortunately the tokens were fixed, which meant that WinInet (our HTTP client stack) couldn't handle new methods. But the problem was easily fixed, so modern versions of WinInet can handle arbitrary methods. </Yaron>

* It sounds like you are arguing against the general trend of making all of your data available via XML, which is counter to the general strategies most technology companies are following now, including Microsoft and Oracle. With your proposal, you are going to render all of the burgeoning XML technology base useless, which is why the WebDAV people built around XML in the first place.

<Yaron> I was the one who introduced XML into WebDAV. Bonus points to anyone who remembers the original meeting when I proposed Web Collections (a precursor to XML) to WebDAV. This was the first time WebDAV was introduced to what would become XML.

As for Microsoft's XML strategy, that is a whole other story. Because my boss, Thomas Reardon, led the XML effort at MS for a long time, I was actually part of creating our XML strategy. When Adam Bosworth took over XML I ended up having weekly meetings with him just so that we could synch up on Microsoft's corporate XML strategy. I also worked with both Andrew Layman and Jean Paoli on XML issues. I still remember the tongue-lashing I gave Andrew over fully qualified end tags. It wasn't Andrew's fault, he hated it too, but he was the messenger and I was really angry. Basically the SGML heads forced XML to adopt fully qualified end tags. In the original proposal XML markup looked like <tag>stuff</>. But the SGML heads argued that this didn't make XML readable, so we should make the already mind-blowingly bloated XML format even more bloated by adding fully qualified end tags, i.e. <tag>stuff</tag>. Yet another fight that they won.

As such I can safely assure you that I would not do anything to "render all of the burgeoning XML technology base useless". Having helped birth XML I would certainly be reticent about killing it. After all, infanticide is a messy business. I will admit, however, and Jim Whitehead will back me up, that there were plenty of times when I would have liked to have given XML a severe spanking followed by a good amount of time in the corner.

I used to joke about the "XML Jedi Mind Trick." You just wave your hand in front of someone and say "You will use XML." Their eyes lose focus and they repeat in a monotone voice "I will use XML." This helped a lot in early WebDAV adoption.
If I couldn't convince someone to adopt WebDAV because of its own merits I would just pull the XML Jedi Mind Trick and they would adopt it because it used XML. After all, if it uses XML it must be cool! There is apparently something indescribably sexy about angle brackets. </Yaron>

Is anyone thinking about a thin-client WebDAV implementation (HTML only) running in a browser? I should be able to do stuff like changing RSRs, manipulating directories, etc. with a web app.

<Yaron> Thinking? We have already written it! Using XMLHTTP, Exchange has written an entire e-mail client for Exchange 2000 that speaks WebDAV to their server. The whole thing is written 100% in nothing but script!!!! What is really amazing is that it uses DHTML, so the UI looks almost 100% like Outlook, even down to the multiple Outlook views and the Outlook bar. It is seriously cool. I have also written, on my own, a generic WebDAV client library in JScript. Unfortunately I never finished it. Too many other things to do. (A bare-bones sketch of that style of script-only client appears at the end of this message.) </Yaron>

--Eric

Yaron

Yaron Goland wrote:

I wouldn't accept a blanket statement that HTTP has lousy performance. I have seen it repeatedly beat the pants off all competing systems. For example, we did a speed comparison of a super-optimized SMB implementation and DAV in W2K. DAV won hands down. The reason is that SMB, like most protocols of its ilk, is unbelievably chatty. They are basically RPCs rather than protocols. DAV, on the other hand, is extremely optimized. So even though the SMB implementation could process some ridiculous number of messages per second, DAV still had better performance because it sent a hell of a lot fewer messages.

As for the use of properties, I assert that there is no difference between executing a GETLOCKEDNAME method and getting the lockedname property in terms of how you write your back end. However, the first is a hell of a lot easier to deal with in terms of specifying and extending the protocol than the second. So the issue isn't one of back-end implementation, it is one of front-end convenience.

In other words, just say NO to live properties.

Yaron

-----Original Message-----
From: Eric Sedlar [mailto:esedlar@us.oracle.com]
Sent: Monday, January 03, 2000 1:09 PM
To: Yaron Goland
Cc: w3c-dist-auth@w3.org
Subject: Re: Your proposal

One problem with your qualms about properties is that we are trying to map WebDAV data to object representation systems that do not have functional semantics, like XML. We should define an interface that doesn't rely on the distinction between functional interfaces and properties, for maximum implementability on various servers. (This distinction is something many programmers have trouble with--does everyone always bother to create accessor methods for everything? ...) The benefit of using live properties as a representation is that object properties are more "portable" to the other types of systems that may want to access the same data, presumably through another means than the HTTP protocol (which isn't particularly efficient). (Which brings me to another unrelated issue--should there be a functional interface to WebDAV methods for programs living in the same server as the data repository, given the performance costs of HTTP within a single process? More on that later.)
Yes, you need a set of clear rules for how live properties are used, and unless their use is rigorously controlled you will have compatibility problems of the type you cite, but this is a problem with any loosely written standard. I think of properties in the JavaBeans sense--in an OO language binding they would actually be functional interfaces to set and retrieve them, but they could be overridden to provide customized behaviour. Any JavaBeans user has no idea whether or not this piece of data is live, and this model works well.

--Eric

----- Original Message -----
From: Yaron Goland <yarong@Exchange.Microsoft.com>
To: 'Eric Sedlar' <esedlar@us.oracle.com>
Sent: Monday, January 03, 2000 11:18 AM
Subject: Your proposal

Eric,

I read your analysis of Geoff's proposal and was really impressed by your deep grasp of both HTTP and WebDAV. I have a series of issues with your counter-proposal but I'm going to hold off on commenting until we can build up more of a common base for conversation. Please see my post on the mailing list in regards to this.

I did, however, want to point out a general design issue regarding your proposal that isn't directly related to locks. In your proposal you suggest using properties to provide various bits of protocol information, such as which names are currently locked. I would caution against using properties in this way; see http://lists.w3.org/Archives/Public/w3c-dist-auth/1998OctDec/0302.html for more details. For a history of how we ended up in this mess in the first place see http://lists.w3.org/Archives/Public/w3c-dist-auth/1998OctDec/0074.html and http://lists.w3.org/Archives/Public/w3c-dist-auth/1998OctDec/0303.html. BTW, all of these posts are collected in the WebDAV Book of Why, available at http://lists.w3.org/Archives/Public/w3c-dist-auth/1999JanMar/0129.html.

Thanks,
Yaron
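A bare-bones sketch of the script-only client style mentioned above, in the spirit of the XMLHTTP interface described earlier (shown here with the standard XMLHttpRequest that descends from it): it issues a PROPFIND against a collection and reads the parsed Multi-Status body back out of the same object as a DOM document. The collection URL and the properties requested are made up for illustration; this is a sketch of the approach, not the Exchange client.

    // Sketch only: a PROPFIND issued from script via XMLHttpRequest. The
    // collection URL and the properties requested are made up; the point is
    // that the XML Multi-Status body comes back already parsed as a DOM.
    function listCollection(url: string, done: (doc: Document | null) => void): void {
      const req = new XMLHttpRequest();
      req.open("PROPFIND", url, true);       // arbitrary HTTP methods are fine
      req.setRequestHeader("Depth", "1");    // immediate members only
      req.setRequestHeader("Content-Type", "text/xml");
      req.onreadystatechange = () => {
        if (req.readyState === 4) {
          done(req.responseXML);             // parsed 207 Multi-Status body
        }
      };
      req.send(
        '<?xml version="1.0"?>' +
        '<propfind xmlns="DAV:"><prop><displayname/><getcontentlength/></prop></propfind>'
      );
    }

    // Usage: log the display name of everything in a (hypothetical) collection.
    listCollection("/docs/", (doc) => {
      if (!doc) { return; }
      const names = doc.getElementsByTagNameNS("DAV:", "displayname");
      for (let i = 0; i < names.length; i++) {
        console.log(names[i].textContent);
      }
    });

Everything above the wire is plain script and DOM calls, which is what makes a thin, HTML-only WebDAV client practical.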
Received on Thursday, 24 February 2000 03:13:57 UTC