- From: Yaron Goland <yarong@Exchange.Microsoft.com>
- Date: Thu, 24 Feb 2000 00:13:24 -0800
- To: "'Eric Sedlar'" <esedlar@us.oracle.com>
- Cc: w3c-dist-auth@w3.org, "'yaron@goland.org'" <yaron@goland.org>
- Message-ID: <7DE119D3D0E15543874F7561EECBDBED02619DFA@BEG.platinum.corp.microsoft.com>
(Note: I found this in my drafts directory, figured I would send it out.) See below.

[WARNING: O.k. I admit it, Eric's letter sent me down memory lane. So this e-mail is seriously long. However, if you are interested in some of the history behind how XML ended up the way it is, how XML ended up in WebDAV, and some of the work I and others at MS did to make WebDAV actually happen, read on!]

-----Original Message-----
From: Eric Sedlar [mailto:esedlar@us.oracle.com]
Sent: Mon, January 03, 2000 4:33 PM
To: Yaron Goland
Cc: w3c-dist-auth@w3.org
Subject: Re: Your proposal

Now we get into our areas of disagreement.

* I mentioned HTTP as having lousy performance WITHIN A SINGLE PROCESS. I guarantee you that a function call is much cheaper than formatting the request into a stream, writing it into a loopback socket, reading the result back, parsing it, and then doing the reverse. Another example is SQL access to the properties of a resource. If I want to query the resource data efficiently, I need to be able to treat properties of the resource as "virtual data", not make a series of function calls on each item in the query set. If I want to write a servlet that takes an HTML form input and changes WebDAV properties, I definitely want a functional API (in Java).

Everyone knows that SMB sucks and is incredibly chatty. As far as HTTP performance goes, it mostly depends on the amount of functionality you can pack into one request. If I have to issue a GETLOCKEDNAME request followed by a bunch of other requests to get all of the state of a particular resource, HTTP will suck. My point was that the WebDAV model is actually bigger than HTTP--other data access systems (like SQL and XML technologies) should also be able to manipulate the data easily.

<Yaron> I certainly wouldn't want someone accessing a DB on their local machine to have to run their requests through DAV. What a waste, a full trip through the protocol stack just for a local request? Of course they should be able to work through SQL and other mechanisms.

In fact, in fulfilling my role as the lead author of the WebDAV specification and the chief WebDAV evangelist at Microsoft, I worked hand-in-hand with the ODBC/OLE DB/ADO teams at Microsoft to ensure that the model we were putting together would be easily accessible through their APIs. That is why, for example, OLE DB 2.5 has a native WebDAV provider. This is what was used to implement the WebDAV support in Web Folders.

I also worked hand-in-hand with the SMB folks at Microsoft to make sure that the WebDAV model was such that we could run WebDAV directly over the file system and at the same time access it through the Win32 file APIs and SMB without any conflicts. That is why Win 2000 is able to ship a native WebDAV implementation that works directly off the file system, allowing for simultaneous access through FTP/SMB/WebDAV/Win32.

At the same time I also coordinated with the Exchange group at Microsoft to ensure that the WebDAV model was consistent with POP/IMAP/MAPI. That is why Exchange 2000 (the upcoming release of Exchange) has a full WebDAV implementation that enables you to access your e-mail box simultaneously through POP/IMAP/MAPI/WebDAV. I also coordinated with our Active Directory team to ensure that the WebDAV model was consistent with LDAP/ADSI. And I coordinated with the SQL team to ensure that WebDAV could be easily layered on top of SQL, so that one could do direct SQL queries as well as normal work.
There are a number of other groups that I also coordinated with, but unfortunately I can't release any data about that. Look forward to some interesting uses of WebDAV coming down the line. My favorite current example is pointing Office 2000 at Exchange 2000. Office sees Exchange as just another file store, while Exchange sees Office as just another WebDAV client. The result is that you can save your files in your e-mail directories, thus having your files automatically replicated and backed up for you. Very cool use of DAV. I think we will see even cooler uses of DAV when DASL starts to get deployed. With DAV/DASL you can control/search your e-mail store, file store, directory or database with the same protocol using the same data model. I have been working for over 3 years to bring this vision to fruition. So obviously I am deeply interested in the issues of enabling access to stores that support WebDAV through many APIs and protocols.

That having been said, I disagree with your statement that having to make multiple requests to describe a resource will destroy performance. A properly implemented HTTP server will support pipelining which, as repeated experiments have demonstrated, gives performance equivalent to having one big message but with much cleaner protocol semantics. So long as the connection is kept open (the default in HTTP/1.1) and the process is not spun down after each request, there is no difference between processing one huge message and processing a string of messages that are pipelined in. In fact, in the WebDAV Book of Why I discuss in detail how Depth came about and why we used Depth instead of just pipelining. It is an interesting story; you will see that while performance was involved, it had nothing to do with the cost of parsing messages. You might want to read the section entitled "This is another fine protocol you've gotten me into!", available at http://lists.w3.org/Archives/Public/w3c-dist-auth/1998OctDec/0303.html.

Keep in mind, btw, that the server does need to be a bit smart. For example, it should wait a little bit before processing a message in order to see if the next message can be bundled into a single API call to the underlying system. There are lots of tricks, but these tricks are all well known. So it isn't a big deal. </Yaron>

* Another reason for making all information about a resource available as properties is using XSL to format an HTML page giving access to all of this data. There is a lot of support for not requiring a lot of programming (a la JSPs or ASPs) to format data within a page.

<Yaron> No one I'm aware of is arguing against making the information available through XML. The argument is simply this: do we get the information through a PROPFIND, or do we get it through multiple requests, each of which is free to return XML? My response is that we must do the latter because of the need to support content negotiation. In order to keep the WebDAV property model as simple as possible it does not support any form of content negotiation. This means that I can't say "I want the name property in French" or "I want the picture you have stored in this property returned as JPEG." The only way to get this sort of negotiation is through individual method requests. If we only provide these properties through PROPFIND then we dumb DAV down to the level of a static database table and prevent people from developing more intelligent systems. (A rough sketch of what a couple of these individual, pipelined requests look like on the wire follows.)
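Here is a minimal sketch of that idea, assuming a hypothetical server at dav.example.com: GETLOCKEDNAME is the example extension method mentioned later in this thread, GETTHUMBNAIL is made up here for illustration, and both requests are written back-to-back on one HTTP/1.1 connection so the server is free to process them as a batch.

    // Sketch only: two individually negotiated requests pipelined on one
    // HTTP/1.1 connection. dav.example.com and GETTHUMBNAIL are hypothetical;
    // GETLOCKEDNAME is the example method discussed later in this thread.
    import * as net from "net";

    const socket = net.connect(80, "dav.example.com", () => {
      // Write both requests back-to-back without waiting for the first response.
      socket.write(
        "GETLOCKEDNAME /docs/report.doc HTTP/1.1\r\n" +
          "Host: dav.example.com\r\n" +
          "Accept-Language: fr\r\n" +      // "give me the name in French"
          "\r\n" +
          "GETTHUMBNAIL /docs/report.doc HTTP/1.1\r\n" +
          "Host: dav.example.com\r\n" +
          "Accept: image/jpeg\r\n" +       // "return the stored picture as JPEG"
          "Connection: close\r\n" +
          "\r\n"
      );
    });

    let raw = "";
    socket.on("data", (chunk) => { raw += chunk.toString("latin1"); });
    socket.on("end", () => {
      // Both responses come back in order on the same connection; a real
      // client would split and parse them rather than dumping the raw stream.
      console.log(raw);
    });

As long as the connection stays open and the server waits a beat so that adjacent requests can be bundled into a single call to the underlying store, the per-message overhead is essentially the same as for one big request.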
The only real counter-argument I see against using multiple methods is the possible performance ramifications, but as I argued above, the introduction of pipelining into HTTP/1.1 makes this largely a non-issue. </Yaron>

* I don't understand why you think it will be easier to handle protocol extensions on the client side by using additional HTTP methods rather than resources. XML applications are used to ignoring elements that they don't care about--extensibility is one of the key wins of XML vs. traditional OO languages. It allows DOM objects to be passed around multiple areas of the system better. For example, let's say that I have some client library which implements the client side of the HTTP protocol, and allows plugins to that client to access the XML DOM for each resource. I can install a new client plugin to handle new properties that appear in that DOM, but I won't be able to extend the HTTP protocol. HTTP protocol implementation is typically at a lower level of the client or server than properties.

<Yaron> The main reason I believe that adding new methods is easy is that I have implemented a number of XML/DAV systems and it has repeatedly proven simpler to support new methods than new elements. This is mostly because the HTTP model is flat, so I can access things like the method name or a header directly, whereas with XML I have to navigate through a tree. This is the same logic that leads people to use attributes in XML, which, as I argue in the WebDAV Book of Why, is a really stupid thing to do. But I digress.

As for ignoring elements, it is funny to hear someone say that XML is used to ignoring elements. I first became involved with XML before it was called XML. I had helped write a proposal called Web Collections that would later be put together with a bunch of other proposals in order to become what we now know as XML. In fact, my boss at the time was Thomas Reardon, who was MS's representative to the W3C and the leading proponent, both at MS and the W3C, for XML. Nowadays only the old timers remember the work he did in making XML real, but in my estimation without him XML never would have taken off. As such I have been involved in helping with the XML design since day one.

One of the problems we ran into with XML when it was first introduced was the question of handling unrecognized elements. At the time the SGML heads had total control over XML, much to the anger and dismay of those of us born after the turn of the last century. The SGML heads were arguing hard and fast against having a definition of well-formedness. What they wanted to do was to require that a DTD be available for all XML documents. These were the same geniuses who argued against the introduction of namespaces. In fact it was their sheer cussedness that prevented us from being able to just name an XML element as a URL. They refused to allow in the forward slash and other characters that would have been necessary to allow URLs in the names of elements. That is where the namespace hack came from. Anyway, I'm digressing.

So the SGML heads were arguing that you always had to have a DTD; therefore, naturally, if you had an "unrecognized" element it was an error and you should fail the entire XML document! Those of us from the post-Diluvium age were aghast. Unfortunately Tim Berners-Lee (TBL) was convinced that we had to keep the support of the SGML community.
I remember getting lectured by Andrew Layman (who hated what the SGML guys were doing as well) about the "millions" of SGML documents out there and the need to work with them. Of course now nobody knows or cares about those ***BLEEP*** SGML documents, but a lot of the really awful crap you see in XML is the result of the deals that TBL made with the SGML guys to keep their support. TBL touches on this topic a little bit in his book.

Anyway, the SGML guys were finally argued into accepting the idea of "well-formedness" and thus we were freed of needing a DTD. However, this left open the question of what to do when you hit an element you didn't recognize. Oh man, were there flame fests on that topic. People in the protocol business tend to think of XML as just a tree, so for us it is natural to just prune the XML tree at an unrecognized node. But the SGML guys don't see a tree when they see XML, they see a markup language. So, for example, imagine I have the following entry in my XML document: "<p>He was a <bold>grand</bold> man!</p>". Now let us imagine that your XML parser supports the <p> tag but doesn't support <bold>. If you just pruned the tree you would end up with "<p>He was a man!</p>". That is clearly not what you want.

The result was just a real mess. Some people demanded that you always prune. Some demanded that you only remove the tags but not the text. Some argued that you should have an attribute you would put on each element that would specify what you should do. There never was much of a resolution to the issue. Eventually I just got really pissed off and added what is now the first paragraph of section 14 of RFC 2518 to WebDAV, declaring that we would prune any sub-tree whose root we didn't recognize. Man, did we ever catch hell for that! The XML guys told us that we wouldn't be compatible with their XML. That we were creating our own world that wouldn't work with the rest of the XML universe. Oh man, what a bloody mess. So to hear you say that the WebDAV ignore rule (as we called it) is a strength of XML is pretty damn ironic.

As for HTTP generally appearing at a lower layer and not being available from the DOM, you may want to check out http://msdn.microsoft.com/xml/reference/scriptref/XMLHttpRequest_object.asp and http://msdn.microsoft.com/xml/reference/cvbref/IXMLHttpRequest_interface.asp. The links describe the XMLHTTP interface available through IE 5.0's DOM. The first link describes the interface for script programmers (JScript/ECMAScript, VBScript, Python, Perl, etc.) and the second for C/VB programmers. This interface allows you to send and receive generic HTTP messages directly through IE 5.0's DOM. What is really cool about it is that while it can handle arbitrary HTTP messages, it is smart: if you send or receive a message with an XML body it will automatically detect that, parse it, load it up, and pass back a DOM pointer to the contents. So HTTP access is actually as free and ready as XML access, at least in IE 5.0. BTW, massive credit goes to Alex Hopmann, who nearly single-handedly convinced the IE guys to ship this feature. Hats off to Alex! </Yaron>

HTTP request/response parsers are not generic technology like XML parsers are.

<Yaron> I'm not sure what leads you to this conclusion.
For example, as the author of the HTTP over UDP spec I found that HTTP request/response parsers were incredibly generic. This is what made HTTP over UDP so compelling. We could take existing parsers, do a little tweaking and bam, it just worked. In fact I would argue that there is no better proof of the generic nature of HTTP parsers than the existence of WebDAV. We didn't have to do any major work to support WebDAV in our HTTP request/response parsers. HTTP already understood the concept of methods and headers, so we were able to use the existing parsers to implement WebDAV. In fact the biggest problem we ran into was that some folks had implemented checks on what methods you could send as an error detection procedure! Sigh... In WinInet we had the problem that the devs, in order to optimize performance, had built a method mapping table that turned the method names into tokens and then moved the tokens around. Unfortunately the tokens were fixed, which meant that WinInet (our HTTP client stack) couldn't handle new methods. But the problem was easily fixed, so modern versions of WinInet can handle arbitrary methods. </Yaron>

* It sounds like you are arguing against the general trend of making all of your data available via XML, which is counter to the general strategies most technology companies are following now, including Microsoft and Oracle. With your proposal, you are going to render all of the burgeoning XML technology base useless, which is why the WebDAV people built around XML in the first place.

<Yaron> I was the one who introduced XML into WebDAV. Bonus points to anyone who remembers the original meeting when I proposed Web Collections (a precursor to XML) to WebDAV. This was the first time WebDAV was introduced to what would become XML.

As for Microsoft's XML strategy, that is a whole other story. Because my boss, Thomas Reardon, led the XML effort at MS for a long time, I was actually part of creating our XML strategy. When Adam Bosworth took over XML I ended up having weekly meetings with him just so that we could synch up on Microsoft's corporate XML strategy. I also worked with both Andrew Layman and Jean Paoli on XML issues. I still remember the tongue-lashing I gave Andrew over fully qualified end tags. It wasn't Andrew's fault, he hated it too, but he was the messenger and I was really angry. Basically the SGML heads forced XML to adopt fully qualified end tags. In the original proposal XML markup looked like <tag>stuff</>. But the SGML heads argued that this didn't make XML readable, so we should make the already mind-blowingly bloated XML format even more bloated by adding fully qualified end tags, i.e. <tag>stuff</tag>. Yet another fight that they won.

As such I can safely assure you that I would not do anything to "render all of the burgeoning XML technology base useless". Having helped birth XML I would certainly be reticent about killing it. After all, infanticide is a messy business. I will admit, however, and Jim Whitehead will back me up, that there were plenty of times when I would have liked to have given XML a severe spanking followed by a good amount of time in the corner.

I used to joke about the "XML Jedi Mind Trick." You just wave your hand in front of someone and say "You will use XML." Their eyes lose focus and they repeat in a monotone voice "I will use XML." This helped a lot in early WebDAV adoption.
If I couldn't convince someone to adopt WebDAV because of its own merits I would just pull the XML Jedi Mind Trick and they would adopt it because it used XML. After all, if it uses XML it must be cool! There is apparently something indescribably sexy about angle brackets. </Yaron>

Is anyone thinking about a thin-client WebDAV implementation (HTML only) running in a browser? I should be able to do stuff like changing RSRs, manipulating directories, etc. with a web app.

<Yaron> Thinking? We have already written it! Using XMLHTTP, Exchange has written an entire e-mail client for Exchange 2000 that speaks WebDAV to their server. The whole thing is written 100% in nothing but script!!!! What is really amazing is that it uses DHTML, so the UI looks almost 100% like Outlook, even down to the multiple Outlook views and the Outlook bar. It is seriously cool. I have also written, on my own, a generic WebDAV client library in JScript. Unfortunately I never finished it. Too many other things to do. (A bare-bones sketch of that style of script-only client appears at the end of this message.) </Yaron>

--Eric

Yaron

Yaron Goland wrote:

I wouldn't accept a blanket statement that HTTP has lousy performance. I have seen it repeatedly beat the pants off all competing systems. For example, we did a speed comparison of a super-optimized SMB implementation and DAV in W2K. DAV won hands down. The reason is that SMB, like most protocols of its ilk, is unbelievably chatty. They are basically RPCs rather than protocols. DAV, on the other hand, is extremely optimized. So even though the SMB implementation could process some ridiculous number of messages per second, DAV still had better performance because it sent a hell of a lot fewer messages.

As for the use of properties, I assert that there is no difference between executing a GETLOCKEDNAME method and getting the lockedname property in terms of how you write your back end. However, the first is a hell of a lot easier to deal with in terms of specifying and extending the protocol than the second. So the issue isn't one of back-end implementation, it is one of front-end convenience.

In other words, just say NO to live properties.

Yaron

-----Original Message-----
From: Eric Sedlar [mailto:esedlar@us.oracle.com]
Sent: Monday, January 03, 2000 1:09 PM
To: Yaron Goland
Cc: w3c-dist-auth@w3.org
Subject: Re: Your proposal

One problem with your qualms about properties is that we are trying to map WebDAV data to object representation systems that do not have functional semantics, like XML. We should define an interface that doesn't rely on the distinction between functional interfaces and properties, for maximum implementability on various servers. (This distinction is something many programmers have trouble with--does everyone always bother to create accessor methods for everything? ...) The benefit of using live properties as a representation is that object properties are more "portable" to the other types of systems that may want to access the same data, presumably through another means than the HTTP protocol (which isn't particularly efficient). (Which brings me to another unrelated issue--should there be a functional interface to WebDAV methods for programs living in the same server as the data repository, given the performance costs of HTTP within a single process? More on that later.)
Yes, you need a set of clear rules for how live properties are used, and unless their use is rigorously controlled you will have compatibility problems of the type you cite, but this is a problem with any loosely written standard. I think of properties in the JavaBeans sense--in an OO language binding they would actually be functional interfaces to set and retrieve them, but they could be overridden to provide customized behaviour. Any JavaBeans user has no idea whether or not this piece of data is live, and this model works well.

--Eric

----- Original Message -----
From: Yaron Goland <yarong@Exchange.Microsoft.com>
To: 'Eric Sedlar' <esedlar@us.oracle.com>
Sent: Monday, January 03, 2000 11:18 AM
Subject: Your proposal

Eric,

I read your analysis of Geoff's proposal and was really impressed by your deep grasp of both HTTP and WebDAV. I have a series of issues with your counter-proposal but I'm going to hold off on commenting until we can build up more of a common base for conversation. Please see my post on the mailing list in regards to this.

I did, however, want to point out a general design issue regarding your proposal that isn't directly related to locks. In your proposal you suggest using properties to provide various bits of protocol information, such as which names are currently locked. I would caution against using properties in this way; see http://lists.w3.org/Archives/Public/w3c-dist-auth/1998OctDec/0302.html for more details. For a history of how we ended up in this mess in the first place see http://lists.w3.org/Archives/Public/w3c-dist-auth/1998OctDec/0074.html and http://lists.w3.org/Archives/Public/w3c-dist-auth/1998OctDec/0303.html. BTW, all of these posts are collected in the WebDAV Book of Why, available at http://lists.w3.org/Archives/Public/w3c-dist-auth/1999JanMar/0129.html.

Thanks,
Yaron
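A bare-bones sketch of the script-only client style mentioned above, in the spirit of the XMLHTTP interface described earlier (shown here with the standard XMLHttpRequest that descends from it): it issues a PROPFIND against a collection and reads the parsed Multi-Status body back out of the same object as a DOM document. The collection URL and the properties requested are made up for illustration; this is a sketch of the approach, not the Exchange client.

    // Sketch only: a PROPFIND issued from script via XMLHttpRequest. The
    // collection URL and the properties requested are made up; the point is
    // that the XML Multi-Status body comes back already parsed as a DOM.
    function listCollection(url: string, done: (doc: Document | null) => void): void {
      const req = new XMLHttpRequest();
      req.open("PROPFIND", url, true);       // arbitrary HTTP methods are fine
      req.setRequestHeader("Depth", "1");    // immediate members only
      req.setRequestHeader("Content-Type", "text/xml");
      req.onreadystatechange = () => {
        if (req.readyState === 4) {
          done(req.responseXML);             // parsed 207 Multi-Status body
        }
      };
      req.send(
        '<?xml version="1.0"?>' +
        '<propfind xmlns="DAV:"><prop><displayname/><getcontentlength/></prop></propfind>'
      );
    }

    // Usage: log the display name of everything in a (hypothetical) collection.
    listCollection("/docs/", (doc) => {
      if (!doc) { return; }
      const names = doc.getElementsByTagNameNS("DAV:", "displayname");
      for (let i = 0; i < names.length; i++) {
        console.log(names[i].textContent);
      }
    });

Everything above the wire is plain script and DOM calls, which is what makes a thin, HTML-only WebDAV client practical.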
Received on Thursday, 24 February 2000 03:13:57 UTC