RE: [optimization] HTTP properties from David Orchard on 2004-05-10 (www-ws-desc@w3.org from May 2004)

From: David Orchard <dorchard@bea.com>
Date: Mon, 10 May 2004 09:40:45 -0700
To: www-ws-desc@w3.org
Message-ID: <32D5845A745BFB429CBDBADA57CD41AF07546EF8@ussjex01.amer.bea.com>
A very useful analysis, and I'll incorporate it into a follow-on note on the HTTP properties.  However, I wonder about the content/transfer codings being "negotiated" and thus ignorable.

I'm thinking of the case where a service supports gzipped and non-gzipped input messages.  If the client has no way of knowing which the service supports, it must pick one.  It seems that 99.999% of Web service clients would pick the non-gzipped coded message.  This will work fine, but be inefficient.  The server may respond with an "i support gzip" header but that means the first message will still be inefficient.  I guess a client could pick gzip first, and then either be told this is ok or be told that gzipped isn't supported.  But this by no means seems the common case.
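
To make the two client strategies concrete, here is a rough Python sketch.  Everything in it is an assumption for illustration: `fake_service` is a hypothetical in-process stand-in for an endpoint (no real network), the `accepts_gzip` flag models what the client cannot know in advance, and the use of status 415 to reject an unsupported coding is one plausible server behaviour, not something a given service is guaranteed to do.

```python
import gzip

def fake_service(headers, body, accepts_gzip):
    """Hypothetical stand-in for an HTTP endpoint; returns (status, payload)."""
    if headers.get("Content-Encoding") == "gzip":
        if not accepts_gzip:
            return 415, b""  # assumed rejection of an unsupported coding
        body = gzip.decompress(body)
    return 200, body

def send_conservative(message, accepts_gzip):
    """Always send uncoded: works everywhere, never benefits from gzip."""
    status, payload = fake_service({}, message, accepts_gzip)
    return status, payload, len(message), 1

def send_optimistic(message, accepts_gzip):
    """Try gzip first and fall back to an uncoded retry on rejection.

    Returns (status, payload, bytes_sent, attempts).
    """
    coded = gzip.compress(message)
    status, payload = fake_service({"Content-Encoding": "gzip"}, coded, accepts_gzip)
    bytes_sent, attempts = len(coded), 1
    if status == 415:
        # The failed attempt is pure overhead: the message is sent twice.
        status, payload = fake_service({}, message, accepts_gzip)
        bytes_sent += len(message)
        attempts += 1
    return status, payload, bytes_sent, attempts

# Repetitive XML-like payload, chosen only because it compresses well.
message = b"<order><item>widget</item></order>" * 200
```

Against a service that accepts gzip, the optimistic client sends far fewer bytes in one attempt; against one that does not, it pays for two attempts.  A client that knew the answer up front would never pay either penalty.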

If a service could declare in the WSDL that it accepts gzip-coded messages from clients, then this performance problem on the first interaction could be solved.

It also seems to me that in cases where the service supports both gzipped and non-gzipped messages, there will typically be some communication at the service description level to say this.  A client would then be instructed to try "gzipped" first.  If this metadata is going to be communicated out-of-band anyway, then it seems entirely useful to bring it in-band by putting it in the WSDL.

There is also the expected state transition model.  In the case where a Web service operation accepts gzip and the client sends multiple messages of the same type, the performance hit may not be that great.  In the case where a Web service state transition table moves through many states with no operation performed more than once, gzip effectively can't or won't be used.
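
A small Python sketch of that arithmetic, under an assumed model: the client only learns that an operation accepts gzip from the response to its first (uncoded) message to that operation.  The operation names are made up for the example.

```python
def gzip_eligible(operation_sequence):
    """Count messages that could be gzip-coded when support is learned
    per operation from the first response (an assumed negotiation model)."""
    learned = set()
    eligible = 0
    for op in operation_sequence:
        if op in learned:
            eligible += 1      # support already discovered for this operation
        else:
            learned.add(op)    # first message goes uncoded; the reply teaches us
    return eligible

# Hypothetical call patterns.
repeated = ["submitOrder"] * 10
workflow = ["login", "createCart", "addItem", "checkout", "confirm"]

print(gzip_eligible(repeated), "of", len(repeated))
print(gzip_eligible(workflow), "of", len(workflow))
```

Under this model the repeated pattern can gzip 9 of its 10 messages, while the five-step workflow can gzip none of them.  With the support described up front, every message in both patterns would be eligible.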

FWIW, there is some interesting analysis of the cons of agent-driven and server-driven negotiation in RFC 2616, sections 12.1 and 12.2.

Finally, the issue of coding may be even more interesting to Web services than to Web browsers because of the potential for description that spans operations.  In a common Web service, there are many different operations that all may use gzip on input and output messages.  The content coding negotiation in HTTP is for a single resource, not for multiple resources.  In the common case of multiple operations, the negotiation must be done on each HTTP operation.

In the same way my tooling might say "all operations are secured using ws-security username/password and reliably delivered using ws-r?", I might want to say "all operations accept/respond with gzip encoding".  It seems easier to interchange this information and gain the performance.
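
As a sketch of what tooling could do with such a statement, here is a Python fragment with a service-wide default and a per-operation override.  The `description` structure, the `input_codings` key, and the operation names are all hypothetical; WSDL 2.0 defines no such property today, which is the point under discussion.

```python
# Hypothetical description data; not an actual WSDL construct.
description = {
    "defaults": {"input_codings": ["gzip", "identity"]},
    "operations": {
        "submitOrder": {},
        "getStatus": {},
        # Per-operation override: this one only takes uncoded input.
        "uploadImage": {"input_codings": ["identity"]},
    },
}

def preferred_input_coding(description, operation):
    """Return the first listed coding for an operation, falling back
    to the service-wide default when the operation declares none."""
    op = description["operations"][operation]
    codings = op.get("input_codings", description["defaults"]["input_codings"])
    return codings[0]

for name in description["operations"]:
    print(name, "->", preferred_input_coding(description, name))
```

One declaration covers every operation, so the client can gzip its very first message to any of them without a per-operation round of negotiation.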

In summary, I think that although theoretically not describing gzip encoding in WSDL doesn't preclude its use, in practical terms WSDL needs to describe gzip encoding in order for gzip to be used by Web services.

Cheers,
Dave

> -----Original Message-----
> From: www-ws-desc-request@w3.org [mailto:www-ws-desc-request@w3.org]On
> Behalf Of Jonathan Marsh
> Sent: Thursday, May 06, 2004 1:29 PM
> To: www-ws-desc@w3.org
> Subject: FW: [optimization] HTTP properties
> 
> 
> 
> At today's telcon, we organized the "probably useful" properties into
> two buckets.  The first bucket represents description of HTTP features
> that can be negotiated at run-time.  Such description provides hints
> (implication: ignorable) about the capabilities of the server.  This
> allows a client to optimize communication.  The items in this bucket
> are:
> 
>     - HTTP Version
>     - Content coding
>     - Transfer Codings (Chunked encoding)
>     - Caching (Vary, etc.)
>     - Content Negotiation ?
> 
> There is no consensus yet on whether we should provide description of
> such features in the WSDL spec.  I encourage more discussion on this
> topic.
> 
> -----Original Message-----
> From: www-ws-desc-request@w3.org 
> [mailto:www-ws-desc-request@w3.org] On
> Behalf Of David Orchard
> Sent: Thursday, April 29, 2004 2:56 AM
> To: www-ws-desc@w3.org
> Subject: HTTP properties
> 
> 
> This message contains a list of HTTP properties that may be of interest
> to WSDL description.  I have provided some starter comments on the
> relative utility of each being described by WSDL.  The categories that
> I used are: described by WSDL 2.0, probably useful, probably not
> useful, not useful or applicable (n/a).  In summary, I found roughly: 4
> properties that are described by WSDL 2.0, 9 probably useful, 2
> probably not useful, and 8 not useful/applicable.
> 
> The properties typically relate to messages.  Some may also be related
> to the sender or receiver role and these have generally been noted, ie
> httpversion is a property of the message sender, and ssl cyphersuite is
> a property of a receiver.  These sender/receiver related properties
> will then be realized in messages in a property-dependent manner.
> 
> In doing this work, I had some insights.  Firstly, there is a
> relationship between a property in a wsdl definition and a property in
> the message for a given feature, ie request-URI.  I'm not sure if WSDL
> would need to describe both in the context of HTTP.  Secondly, HTTP is
> designed for run-time negotiation.  In other words, it is designed as
> if WSDL, or something similar, is not available for a web site and
> related resources.  As a result, various properties that a receiver may
> wish to advertise are covered by run-time negotiation, ie Content-Type
> and Allow.
> 
> List of HTTP properties:
> 
> Request-Method - described by wsdl 2.0 binding operation webmethod
> property 
> 
> Request-URI - described by a portion of the wsdl 2.0 binding location
> property
> 
> HTTP Version - probably useful to describe in an httpversion property.
> HTTP's version field indicates the version of the message and the
> capabilities of the message sender.  From rfc 2616 "The protocol
> versioning policy is intended to allow the sender to indicate the
> format of a message and its capacity for understanding further HTTP
> communication, rather than the features obtained via that
> communication".  Given that HTTP 1.1 is backwards compatible with HTTP
> 1.0, the interesting case is where a receiver wants to advertise that
> it only supports 1.0 features.  See also rfc 2145.
> 
> Response Status - probably useful to describe only for redirection
> support.  In general, it seems more useful to bind the status to the
> SOAP Fault infoset than to describe in WSDL 2.0.  The case that may be
> interesting from a WSDL perspective is redirection, see below.
>  
> Content coding (Accept-Encoding and Content-Encoding) - probably useful
> to describe as the encoding of the content before and after transfer is
> significant to applications.
> 
> Transfer Codings (Chunked encoding) - This is probably useful to
> describe for sending of compressed messages.  WSDL could allow
> definition of codings of input and output messages.  For example, a
> Purchase order submission service that allowed gzip coded messages in
> an HTTP PUT could advertise that it accepts either normal or gzip coded
> messages.  Without this feature, a sender that understands gzip would
> have to gzip the PUT and send it, and if that results in an error then
> send it again coded normally.  The ability to describe alternate
> encodings allows avoidance of this classic "failure followed by retry
> using a downgraded message" pattern that has known network performance
> problems.  Also, it may be useful to describe support for chunked
> request bodies so that a sender can have a more efficient
> implementation.
> 
> Persistent connections - This is default but not required behaviour of
> HTTP 1.1.  It is probably not useful to describe in WSDL.  For example,
> what should a client do differently for services that support
> persistent connections versus those that don't that is helped by being
> described by WSDL?  Though this could be used to advertise "policy" of
> connections, ie the receiver will process N requests and then drop the
> connection, for helping clients optimize.
> 
> Redirection - probably useful to describe.  A sender is probably
> interested in design-time knowledge on whether redirection is supported
> by a service.
> 
> Authentication - probably useful to describe, particularly the various
> parameters of authentication.  For example, Basic vs. Digest vs.
> extension mechanism, various parameters such as realm, etc.  There
> seems to be a possible overlap with WS-Security.
> 
> SSL - probably useful to describe, but it is probably sufficiently
> described by using URIs with https.  It may be useful if WSDL described
> the properties of SSL, such as cyphersuites and certificate
> authorities.
> 
> From - probably not useful to describe.  WSDL could describe that a
> From field is required in a request, though this doesn't seem to be
> used very often.  The HTTP spec specifies that the From header SHOULD
> be a mailto: address of the human operator, but it may be interesting
> to generalize this for Web services.
> 
> Caching (Vary, etc.) - probably useful to describe.  This seems
> potentially useful for caching the results of safe retrieval, ie GET
> operations.  WSDL could provide description of the caching properties
> of a service.  The knowledge that the operation results are cacheable
> could be used by a client application for client-side caching.
> However, this may not be very useful.  If a Web service provides a
> cacheable operation, then the client will know from the resulting http
> headers.  This seems like metadata to help a client implementation and
> the service deployment, but is not required for interoperability.
> 
> Content Negotiation (Content-Type and Accept) - probably useful to
> describe, particularly useful for indicating use of the MTOM mechanism.
> An HTTP Web service may want to describe the types that it will return
> for a given URI and then a client can set the Accept header.  This
> would provide design time information to the client.  However, this
> doesn't seem too useful in general from a Web services perspective.
> Content negotiation allows a client and server to negotiate a hypertext
> format for exchange without knowing ahead of time what formats are
> available.  In Web services, a service that wants to return format X
> versus format Y for a given request will typically create multiple
> methods, ie getX and getY.  It might be possible for WSDL to allow the
> "normal" description of operations, ie getX and getY, and then a
> binding that binds the two operations to the same binding operation,
> setting the Accept header to either X or Y.  In the case of HTTP GET,
> the only parameter of the operation would be the type requested, and
> this would be sent as the Accept header.  Perhaps the binding operation
> would have an "accept" property?  There is also a mismatch between
> HTTP's use of MIME types and WSDL's use of XML QNames for types.  The
> Content-Type is potentially tricky as it would be the differentiator on
> the return, so the receiver would have to use it to determine which
> operation was invoked.  This is somewhat similar to function return
> overloading.
> 
> Host - described by a subset of the wsdl location property.  
> 
> Content-Range - not useful to describe. 
> 
> If-* - not useful to describe.  Meant for cache validation of existing
> content, so can't be specified in WSDL.
>  
> Max-Forwards - not useful to describe.
> 
> Expect - not useful to describe.  The same functionality is provided in
> SOAP.  But would it make sense for someone in the SOAP infoset to want
> to put in an Expect that alters HTTP behavior?  How much of the
> transport do we want to leak through into the SOAP infoset?
> 
> Upgrade - not useful to describe.  This duplicates WSDL functionality
> of bindings where a client can request a different protocol for
> subsequent interactions.
> 
> Via - not useful to describe.  This is a run-time informational report
> of what intermediaries a message has passed through.
> 
> Warning - not useful to describe.  
> 
> Allow - probably not useful to describe.  This seems to duplicate some
> of WSDL functionality as this allows a server to indicate which
> operations are allowed on a given resource.  A service could generate
> Allow headers for resources that WSDL has described as there will be
> sufficient information in the WSDL binding to populate this header.
> This implies the client does not have the WSDL and so this seems of
> limited utility.
> 
> Content-Language, Content-Location, Content-MD5, Date, ETag, Expires,
> Last-Modified, Referer, Retry-After - not useful to describe.
> 
> Server, User-Agent - probably not useful to describe.  It might be
> useful to have the software used to implement the Web service
> identified in the messages.
> 
> Partial Content - not useful to describe.  
> 
> Content-disposition - probably not useful to describe.
> 
> PICS, P3P - not useful to describe.  These are straight Policy.
> http://lists.w3.org/Archives/Public/public-p3p-spec/2004Apr/0016.html
> 
> SoapAction - described in WSDL SOAP binding.  WSDL SOAP Binding
> SOAPAction property
> 
> Cookies - probably useful to describe.  Description could be used to
> indicate support or requirements for the use of HTTP stateful cookies.
> 
> WebDAV - probably not useful to describe.   Not used with Web services
> currently, though perhaps it should be.
> 
> Delta Encoding - ?
> 
>
Received on Monday, 10 May 2004 12:41:15 UTC