RE: AFTF requirements, pre-2003/01/31 telcon

From: Martin Gudgin <mgudgin@microsoft.com>
Date: Fri, 31 Jan 2003 08:50:52 -0800
Message-ID: <92456F6B84D1324C943905BEEAE0278E02D30BF2@RED-MSG-10.redmond.corp.microsoft.com>
To: <jones@research.att.com>, <xml-dist-app@w3.org>

We would like to add another DR for discussion. This is essentially a
rewording of my earlier infoset-related requirement in concrete form. I
will still be submitting a comment on the abstract feature spec.

DRXX - A message with all its parts, however separated physically, must
be representable as a single infoset and describable as a single XML
element in an XML schema.

Gudge

 


> -----Original Message-----
> From: Mark Jones [mailto:jones@research.att.com] 
> Sent: 30 January 2003 21:42
> To: xml-dist-app@w3.org
> Subject: AFTF requirements, pre-2003/01/31 telcon 
> 
> 
> 
> AFTFers,
> 
> This version of the requirements folds in Marc's compression 
> requirement, and inlines Jeff's proposed requirements (DR18, DR19, and
> DR20) and BEA comments/requirements from David Orchard in 
> preparation for the Friday, 2003/01/31 AFTF meeting.
> 
> --mark
> 
> Mark A. Jones
> AT&T Labs -- Strategic Standards Division
> Shannon Laboratory
> Room 2A02
> 180 Park Ave.
> Florham Park, NJ  07932-0971
> 
> email: jones@research.att.com
> phone: (973) 360-8326
>   fax: (973) 236-6453
> 
> 
> ________________________________________________________________
> 
> 
> Concrete Attachment Feature Requirements
> ----------------------------------------
> 
> <davidO href="http://lists.w3.org/Archives/Public/xml-dist-app/2003Jan/0044.html">
> * in the intro, define 'attachments' as 'a technology that allows for
>   the encapsulation of and reference to arbitrary data, including that
>   which is not legally serialized into XML 1.0 (e.g., binary)'
> 
> * define 'parts' as 'units of arbitrary data'
> </davidO>
> 
> 
> Considerations
> --------------
> 
> * If existing packaging schemes (e.g., Multipart-MIME, DIME, ZIP, tar,
>   jar, etc.) meet the requirements, or represent sensible tradeoffs,
>   then the specification SHOULD use such existing schemes.
> 
> * The specification should, where reasonably practical, be designed to
>   facilitate debugging, tracing, and other diagnostic activities.
> 
> <davidO href="http://lists.w3.org/Archives/Public/xml-dist-app/2003Jan/0044.html">
> * The specification should aid message construction and parsing with
>   simple tools.
> </davidO>
> 
> 
> General Requirements
> --------------------
> 
> R8. The specification must describe its relationship to the
>      properties defined in Table 1 (att:SOAPMessage and
>      att:SecondaryPartBag) in the SOAP 1.2 Attachment Feature
>      specification.
> 
> R9. The specification must describe its points of extensibility.
> 
> R15. The specification should not unnecessarily preclude convenient
>      description by languages such as WSDL.
>      [WSDL should have enough extensibility to handle reasonable
>      new attachment specifications, including ours.  Our spec should
>      be reasonably describable by languages such as WSDL.]
> 
> <davidO href="http://lists.w3.org/Archives/Public/xml-dist-app/2003Jan/0044.html">
> R15. The specification should be conveniently describable by languages
>      such as WSDL.
> </davidO>
> 
> <davidO href="http://lists.w3.org/Archives/Public/xml-dist-app/2003Jan/0044.html">
> DR24. The specification should include sample changes to WSDL 1.2
>       and/or extensions to WSDL.
> </davidO>
> 
> R17. The specification must work with the SOAP 1.2 HTTP binding and
>      shouldn't unnecessarily preclude working with other bindings.
> 
> 
> 
> Representation
> --------------
> 
> R1. The specification must define a means to carry multiple data
>     parts.
> 
> R2. The specification must define a means for parts to carry
>     arbitrary data, including non-XML data (e.g., binary data and XML
>     fragments).
> 
> R3. The specification should support efficient implementation of:
>     a) parsing the physical representation to separate and identify
>        its constituent parts.
>     b) programming systems which efficiently resolve a URI to retrieve
>        the data (and metadata) comprising the corresponding part.
> 
> R4.  The specification should use a reasonably space-efficient
>      representation.
> 
> DR5. The representation must efficiently support the addition and
>      deletion of parts.
> 
> <chris href="http://lists.w3.org/Archives/Public/xml-dist-app/2003Jan/0025.html">
> Hmmm... While it is clear that an implementation of the specification
> would likely carry this requirement, it is less than clear that the
> requirement is applicable to the specification itself. Further, one
> would imagine that this statement is intended to cover the insertion
> or in-line deletion of parts, or did you have only appending and
> truncation in mind?
> 
> Again, it isn't clear that this requirement, as written, is either
> testable of a specification or relevant for a specification that is
> not intended to be implementation-specific.
> </chris>
> 
> <markJ href="http://lists.w3.org/Archives/Public/xml-dist-app/2003Jan/0028.html">
> The point here was to make the spec relatively friendly to
> intermediaries that might need to modify the attachment bundle in
> straightforward ways (roughly resonant with the fact that insertions
> and deletions of headers in a SOAP envelope are pretty straightforward
> syntactically, for example).
> </markJ>
>  
> <noah href="http://lists.w3.org/Archives/Public/xml-dist-app/2003Jan/0037.html">
> If that's the goal, then I think we need to specifically say:
> 
> (alternate) DR5. The representation SHOULD efficiently support the
> addition and deletion of parts by intermediaries.
> 
> Otherwise, I agree completely with Chris' concern.  Indeed, I am
> somewhat nervous that even at the intermediary the issues will be hard
> to pin down, and may relate to higher level constructs that we can't
> control.  After all, if you write an application that has to inspect
> the whole message before deciding what to insert or delete, then you
> almost surely have to buffer the whole thing at the intermediary.  Once
> you've done that, then Chris is right on even at the intermediary.  How
> can you tell what is or isn't efficient for me at such a buffering
> intermediary?  I've very probably stored the parts in ways you wouldn't
> easily guess (e.g. some relational DB fields.)
> </noah>
> 
> 
> DR13. The specification must provide support for large parts.
> 
> <chris href="http://lists.w3.org/Archives/Public/xml-dist-app/2003Jan/0025.html">
> And small ones as well one would imagine. How large? Arbitrarily
> large? Just "pretty big", "really, really large" or "incomprehensibly
> large"? :)
> 
> What about parts whose size is not known at the time that
> the serialization is begun?
> </chris>
> 
> <markJ href="http://lists.w3.org/Archives/Public/xml-dist-app/2003Jan/0028.html">
> These points have been discussed briefly.  This one needs more work.
> </markJ>
> 
> <barton href="http://lists.w3.org/Archives/Public/xml-dist-app/2003Jan/0030.html">
> The reason for this kind of requirement is the dominant impact of I/O
> and memory allocation on performance.  For small messages, all
> attachment schemes will be equal since CPUs are infinitely fast.
> "Large" of course changes over time as hardware resources improve.
> Design for messages between 1MB and 1GB.  5 years from now, when this
> standard is in use, allocators can bite off 1MB but 1GB will likely
> still call for disk.  You can shift these numbers around, but they will
> factor into the design: might as well discuss them explicitly.
> 
> In my opinion, parts whose size is not known should not be "attached"
> to SOAP messages.  Rather one should use messages to set up an out of
> band stream mechanism.
> </barton>
> 
> <noah href="http://lists.w3.org/Archives/Public/xml-dist-app/2003Jan/0037.html">
> I think the question with small is, do you care about relative
> overhead?  Is it OK to add 200 bytes of overhead to a 5 byte
> attachment?  In some situations the answer is:  yes, the whole message
> is still only a few hundred bytes and as John says, it's hard on modern
> processors to get in trouble processing a single small message.  On the
> other hand, if you have thousands of parts per message, or thousands of
> messages per second, the overhead can indeed really add up.  So, I
> don't think it's obviously a non-issue.
> </noah>
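
To put rough numbers on the relative-overhead question, here is a minimal Python sketch. It assumes a fixed packaging cost per part (the 200-byte figure from the text); the function names are illustrative and not taken from any proposal in this thread.

```python
def overhead_fraction(payload_bytes: int, overhead_bytes: int) -> float:
    """Fraction of the bytes on the wire that are packaging overhead."""
    return overhead_bytes / (payload_bytes + overhead_bytes)


def total_overhead(part_count: int, overhead_bytes: int) -> int:
    """Total packaging overhead, assuming the same fixed cost for every part."""
    return part_count * overhead_bytes


small = overhead_fraction(5, 200)          # the 5-byte attachment from the text
large = overhead_fraction(1_000_000, 200)  # a 1 MB part with the same overhead

print(f"5-byte part: {small:.1%} of the bytes are overhead")
print(f"1 MB part:   {large:.3%} of the bytes are overhead")
print(f"1000 small parts carry {total_overhead(1000, 200)} bytes of overhead")
```

Under these assumptions the overhead is negligible for a large part but dominates a small one, and it scales linearly with the number of parts.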
> 
> 
> DR21.  The specification should provide convenient means for extending
> the metadata carried with a message.  Such mechanisms should
> specifically allow for extensions to the set of metadata associated
> with individual parts.
> 
> 
> DR22.  The specification should provide a means by which any or all
> parts MAY be labeled with associated MIME types.  (I.e. applications
> sending a message are not obligated to label parts with MIME types, but
> the specification must provide for carrying the MIME type if provided.)
> 
> 
> <davidO href="http://lists.w3.org/Archives/Public/xml-dist-app/2003Jan/0044.html">
> DR25. The specification must provide specification of media types for
> parts.
> </davidO>
> 
> 
> DR23.  The specification must be sufficiently flexible/extensible to
> allow for and describe transformations
> (encoding/compression/encryption/...) of parts.
> 
> <marcH>
> I was thinking along the lines of HTTP where you have a media type plus
> a transfer encoding. The same thing might be useful in the package:
> this part is text/plain but is compressed using ... or this part is
> text/plain but is encrypted using ...
> </marcH>
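
One way to picture "media type plus transformation" is a part record that keeps the media type fixed and lists the transformations applied to the carried bytes. The sketch below is only an illustration: the `Part` class, its field names, and the use of gzip as the sole transformation are assumptions, not part of any proposal here.

```python
import gzip
from dataclasses import dataclass, field


@dataclass
class Part:
    media_type: str   # what the data is once every transformation is undone
    payload: bytes    # the bytes as carried in the package
    encodings: list = field(default_factory=list)  # transformations, in order applied


def compress(part: Part) -> Part:
    """Return a new part whose payload is gzip-compressed; media type is unchanged."""
    return Part(part.media_type, gzip.compress(part.payload), part.encodings + ["gzip"])


def decode(part: Part) -> bytes:
    """Undo the listed transformations in reverse order and return the original data."""
    data = part.payload
    for encoding in reversed(part.encodings):
        if encoding == "gzip":
            data = gzip.decompress(data)
        else:
            raise ValueError(f"unknown transformation: {encoding}")
    return data


original = Part("text/plain", b"hello attachment " * 50)
packed = compress(original)
print(packed.media_type, packed.encodings, len(original.payload), len(packed.payload))
```

Keeping the transformation list separate from the media type means a receiver can still tell that the part is text/plain even before it has undone the compression.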
> 
> 
> <jeff>
> DR18. The specification must define a means to format 
> messages for down-level receivers that do not understand the 
> specification. </jeff>
> 
> <sanjiva href="http://lists.w3.org/Archives/Public/xml-dist-app/2003Jan/0034.html">
> How can any spec say something about those who don't 
> understand the spec? I'm confused. </sanjiva>
> 
> <barton href="http://lists.w3.org/Archives/Public/xml-dist-app/2003Jan/0033.html">
> Maybe you can clarify this one Jeff...the way I read it, it 
> sounds impossible. </barton>
> 
> <noah href="http://lists.w3.org/Archives/Public/xml-dist-app/2003Jan/0037.html">
> I'm confused too.
> </noah> 
> 
> 
> <jeff>
> DR19. The specification must enable efficient allocation of 
> buffers by receivers. </jeff>
> 
> <sanjiva href="http://lists.w3.org/Archives/Public/xml-dist-app/2003Jan/0034.html">
> I'm again confused; while a statement like "this spec must be 
> implementable as efficiently as possible" is reasonable (and 
> motherhood-and-apple-pie IMO), speaking specifically about 
> buffer allocation seems rather pointed. 
> </sanjiva>
> 
> <barton href="http://lists.w3.org/Archives/Public/xml-dist-app/2003Jan/0033.html">
> This one motivates some of the other requirements but it implies that
> the sender understands the receiver's memory allocation capabilities.
> On one extreme the requirement could amount to "give the content length
> of attachments up front", but at the other extreme it could require the
> interleaving of parts to achieve a serialization optimal for receiver
> processing.
> 
> As an example of the latter, the UPNP Printing folks worried about how
> an extremely long XHTML doc with many inline images could be printed
> with one page buffer.  While that may seem like an example far from the
> one most SOAP folks consider, once you get to pipelined processing of
> composed SOAP services the differences begin to fade.  These are cases
> you want to be able to handle and they are cases that non-XML systems
> deal with.
> 
> Of course the serialization of XHTML is well-defined.  Serialization
> for arbitrary receiver processing isn't.  That makes this requirement
> difficult to spell out absent information on the receiver buffer
> capability.  Consequently one might go for a requirement that asks the
> spec. to allow attachments to be placed in the stream physically near
> their first point of XML reference rather than getting into buffers.
> That would pick up the critical use case without getting mired in an
> open-ended problem.
> </barton>
> 
> <noah href="http://lists.w3.org/Archives/Public/xml-dist-app/2003Jan/0037.html">
> I think we can say: "Attention should be given to likely implementation
> optimizations."  (I agree with Sanjiva, going much beyond that is too
> specific.)
> </noah>
> 
> <barton>
> Sanjiva, the key words here are "by receivers".  The serialization
> mechanism can have serious impacts on resource-constrained or heavily
> loaded receivers.  Emitting a SOAP message in an HTTP-style MIME-like
> format without content-length headers leaves the receiver with no
> recourse but multiple buffering layers and repeated dynamic memory
> allocations as more content arrives.  For resource-constrained
> receivers, the result is late and annoying buffer overflow; for heavily
> loaded receivers, the result is poor performance.
> 
> This is, unfortunately, not apple-pie since typically a
> receiver-friendly protocol requires resources to be spent on the
> sender, e.g., to count the bytes as the package is assembled.  The
> specification will shift real costs.
> 
> Hope this helps clarify this issue.
> </barton>
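
The trade-off described above can be sketched with a length-prefixed framing. This is a minimal illustration, assuming a hypothetical 8-byte big-endian length header per part (not a format proposed in this thread): the sender pays to know the byte count up front, and the receiver can allocate exactly one buffer of the right size.

```python
import io
import struct

# Hypothetical framing: each part is preceded by an 8-byte big-endian length.
HEADER = struct.Struct("!Q")


def write_part(stream, data: bytes) -> None:
    """Sender side: the byte count must be known before the part is emitted."""
    stream.write(HEADER.pack(len(data)))
    stream.write(data)


def read_part(stream) -> bytearray:
    """Receiver side: read the length, allocate once, then fill the buffer."""
    header = stream.read(HEADER.size)
    if len(header) != HEADER.size:
        raise EOFError("missing or truncated length header")
    (length,) = HEADER.unpack(header)
    buffer = bytearray(length)          # the only allocation for the payload
    view = memoryview(buffer)
    filled = 0
    while filled < length:
        count = stream.readinto(view[filled:])
        if not count:
            raise EOFError("truncated part")
        filled += count
    return buffer


wire = io.BytesIO()
write_part(wire, b"<soap:Envelope/>")
write_part(wire, b"\x00\x01\x02" * 1000)
wire.seek(0)
print(len(read_part(wire)), len(read_part(wire)))
```

Without the length header, a receiver would have to grow its buffer repeatedly as data arrives, which is the cost the message above describes.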
> 
> <jeff>
> DR20. The specification must allow messages to be secured 
> using the mechanisms defined in WS-Security. </jeff>
> 
> <sanjiva href="http://lists.w3.org/Archives/Public/xml-dist-app/2003Jan/0034.html">
> WS-Security only applies to SOAP envelopes. This requirement 
> would hence have the effect of precluding MIME/DIME style 
> packaging .. </sanjiva>
> 
> <noah href="http://lists.w3.org/Archives/Public/xml-dist-app/2003Jan/0037.html">
> +1
> </noah>
> 
> <davidO href="http://lists.w3.org/Archives/Public/xml-dist-app/2003Jan/0044.html">
> DR27. The specification should support securing of messages and message
> parts, such as use of encryption and signatures, in a simple manner.
> 
> This is different than the proposed "support ws-security" requirement,
> in that it covers application of encryption and signature without
> necessarily meaning use of ws-security.
> </davidO>
> 
> 
> 
> Reference to Parts
> ------------------
> 
> DR6. The specification must permit parts to be identified by URIs.
> 
> <chris href="http://lists.w3.org/Archives/Public/xml-dist-app/2003Jan/0025.html">
> Hmmm... I think that the specification should require that parts be
> identified by URI, but that they may be identified using other means
> as well. Of course, they could be identified by relative URI, not just
> absolute URI.
> </chris>
> 
> <noah href="http://lists.w3.org/Archives/Public/xml-dist-app/2003Jan/0037.html">
> +1 except for the references to relative URI.  I think we want:  The
> specification must provide that each part be identified by an (at least
> one) absolute URI.
> 
> I think issues of relative should be above our level.  If some system
> (e.g. SOAP itself) wants to provide base URI and resolve relatives to
> absolute, that's fine, but we don't worry about that I think.  I would
> not want a part to be known at the deepest level as "../p".
> </noah>
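
The concern about relative references can be illustrated with Python's standard `urllib.parse.urljoin`. The base URIs below are made up for the example; the point is only that the same relative reference names a different resource depending on which base the surrounding layer supplies.

```python
from urllib.parse import urljoin

relative_ref = "../p"

# Hypothetical base URIs that two different enclosing layers might supply.
base_a = "http://example.org/msg/42/envelope"
base_b = "http://example.org/archive/2003/01/envelope"

resolved_a = urljoin(base_a, relative_ref)
resolved_b = urljoin(base_b, relative_ref)

print(resolved_a)
print(resolved_b)
print("same part?", resolved_a == resolved_b)
```

An absolute identifier at the packaging level avoids this dependence on context; resolution of relative references can then be left to whatever layer defines the base.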
> 
> <markJ href="http://lists.w3.org/Archives/Public/xml-dist-app/2003Jan/0028.html">
> We can consider your wording instead.
> </markJ> 
> 
> <davidO href="http://lists.w3.org/Archives/Public/xml-dist-app/2003Jan/0044.html">
> (alternate) DR6. The specification must permit parts to be 
> identified by URIs or URI References.
> 
> This is similar to ChrisF's comment.
> </davidO>
> 
> 
> DR7. The URI identification scheme must be robust under the addition
>      and deletion of parts -- i.e., it must not require that URIs to
>      other parts be altered, it must be relatively easy to avoid URI
>      conflicts, etc.
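
As a rough illustration of what DR7 asks for, the sketch below identifies each part by a freshly generated `cid:`-style URI rather than by its position, so adding or deleting one part never changes the URI of another. The `Package` class, the use of random UUIDs, and the `example.org` suffix are all assumptions made for the example.

```python
import uuid


class Package:
    """Toy part container keyed by URI; insertion order carries no meaning."""

    def __init__(self):
        self._parts = {}

    def add(self, data: bytes) -> str:
        # A random UUID makes collisions with existing part URIs very unlikely.
        uri = f"cid:{uuid.uuid4()}@example.org"
        self._parts[uri] = data
        return uri

    def remove(self, uri: str) -> None:
        del self._parts[uri]

    def get(self, uri: str) -> bytes:
        return self._parts[uri]


package = Package()
first = package.add(b"first part")
second = package.add(b"second part")
third = package.add(b"third part")

package.remove(second)            # an intermediary deletes a part
fourth = package.add(b"fourth")   # and adds another

# References to the untouched parts still resolve without being rewritten.
print(package.get(first), package.get(third))
```

By contrast, a scheme that named parts by ordinal position would force every later reference to be rewritten after a deletion.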
> 
> 
> DR11. (a) The specification should permit an initial human readable
>           part.
>       (b) The specification should not specify a particular ordering
>           of parts.
>       [still noodling on which version to prefer]
> 
> <chris href="http://lists.w3.org/Archives/Public/xml-dist-app/2003Jan/0025.html">
> Not sure I follow this... 
> </chris>
> 
> <markJ href="http://lists.w3.org/Archives/Public/xml-dist-app/2003Jan/0028.html">
> There was some sentiment for flexibility in part ordering -- for
> example, having a text part preceding even the SOAP message.
> </markJ>
> 
> <noah href="http://lists.w3.org/Archives/Public/xml-dist-app/2003Jan/0037.html">
> Right.  I also think the notion of "initial" is fuzzy.  Is it within
> the first 100 bytes?  Is it no binary data between the start of message
> and this initial part (so you can use text tools to get that far)?
> Does it preclude interleaving?  I think this is too specific and we
> should drop it.
> </noah>
>  
> <davidO href="http://lists.w3.org/Archives/Public/xml-dist-app/2003Jan/0044.html">
> preferred wording is (b)
> </davidO>
> 
> 
> DR12. The SOAP message part should be readily locatable/identifiable.
> 
> <chris href="http://lists.w3.org/Archives/Public/xml-dist-app/2003Jan/0025.html">
> Should it not be the case that ALL parts be identified, identifiable?
> What would make the SOAP part unique in this regard?
> </chris>
> 
> <markJ href="http://lists.w3.org/Archives/Public/xml-dist-app/2003Jan/0028.html">
> We wanted to make sure if there were multiple SOAP message parts that
> we could identify which one was the primary part and which were
> attachments.  This may be an issue if order were arbitrary, for
> example.
> </markJ>
>  
> <noah href="http://lists.w3.org/Archives/Public/xml-dist-app/2003Jan/0037.html">
> +1 but suggests
> 
> (alternate) DR12.  The primary (SOAP) message part should be readily
> locatable/identifiable.
> 
> I think this correctly layers the packaging abstraction (part) from its
> use by SOAP.
> </noah>
> 
> <davidO href="http://lists.w3.org/Archives/Public/xml-dist-app/2003Jan/0044.html">
> (alternate) DR12. Any message parts should be readily
> locatable/identifiable.
> </davidO>
> 
> 
> DR16. The part identifier scheme to be determined by sending
>       application.
> 
> <chris href="http://lists.w3.org/Archives/Public/xml-dist-app/2003Jan/0025.html">
> "scheme" seems to imply "URI", but my guess is that it does not.  
> Again, I would strongly recommend that parts be identified by URI  
> (relative or absolute).  
> </chris>
> 
> <markJ href="http://lists.w3.org/Archives/Public/xml-dist-app/2003Jan/0028.html">
> URI is what I have in mind.
> </markJ>
> 
> <noah href="http://lists.w3.org/Archives/Public/xml-dist-app/2003Jan/0037.html">
> No.  I think that URI schemes should be used according to their
> definition.  This should not be a round-about way of enabling the
> caching scenario (if that's what's intended.)  Caching can be enabled
> with a SOAP feature (mapping an HTTP: URI to a CID:, for example).  The
> part in the message is unlikely to be correctly id'd directly with an
> HTTP URI (unless we're doing lazy pull through an http network.)
> </noah>
> 
> 
> <davidO href="http://lists.w3.org/Archives/Public/xml-dist-app/2003Jan/0044.html">
> DR26. The specification should support streaming of parts, i.e.,
> chunked encoding.  A sample scenario of this should also be provided.
> </davidO>
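
For readers unfamiliar with chunked encoding, here is a minimal sketch modeled on the HTTP/1.1 chunked transfer coding (hex size line, data, CRLF, terminated by a zero-size chunk). It ignores chunk extensions and trailers, and it is an illustration of the idea rather than a format this thread has agreed on; the function names are made up.

```python
import io


def encode_chunked(chunks):
    """Yield wire bytes for each chunk; the total size need not be known up front."""
    for chunk in chunks:
        if chunk:  # a zero-length chunk would be mistaken for the terminator
            yield b"%x\r\n" % len(chunk) + chunk + b"\r\n"
    yield b"0\r\n\r\n"  # last-chunk marker followed by an empty trailer


def decode_chunked(stream):
    """Yield chunk payloads from a binary stream until the terminating chunk."""
    while True:
        size = int(stream.readline().strip(), 16)
        if size == 0:
            stream.readline()  # consume the blank line after the last chunk
            return
        yield stream.read(size)
        stream.readline()      # consume the CRLF that follows the chunk data


wire = b"".join(encode_chunked([b"hello", b"", b"world!"]))
print(wire)
print(b"".join(decode_chunked(io.BytesIO(wire))))
```

This is the complement of sending a length up front: the sender can start emitting a part before it knows the final size, at the cost of the receiver not being able to pre-allocate a single buffer.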
> 
> <davidO href="http://lists.w3.org/Archives/Public/xml-dist-app/2003Jan/0044.html">
> DR28. The specification may provide manifest functionality.
> </davidO>
> 
> 
> 
Received on Friday, 31 January 2003 11:51:24 GMT
