AFTF requirements, pre-2003/02/03 telcon from Mark Jones on 2003-02-03 (xml-dist-app@w3.org from February 2003)

From: Mark Jones <jones@research.att.com>
Date: Mon, 3 Feb 2003 09:48:37 -0500 (EST)
To: xml-dist-app@w3.org
Message-Id: <200302031448.JAA22347@bual.research.att.com>
AFTFers,

This version of the requirements reflects discussion in the
Friday, 2003/01/31 AFTF meeting, and email received over the
weekend.

Our current scorecard is:
  16 requirements agreed (R8, R9, R15, R24, R17, R1,
         R2, R3, R4, R5, R13, R30, R21, R31, R22, R18)
  11 requirements not yet discussed (DR19, DR20, DR27,
         DR29, DR6, DR7, DR11, DR12, DR16, DR26, DR28)

--mark

Mark A. Jones
AT&T Labs -- Strategic Standards Division
Shannon Laboratory
Room 2A02
180 Park Ave.
Florham Park, NJ  07932-0971

email: jones@research.att.com
phone: (973) 360-8326
  fax: (973) 236-6453


________________________________________________________________


Concrete Attachment Feature Requirements
----------------------------------------

The terminology used in this document is intended to be consistent
with that found in the SOAP 1.2 Attachment Feature specification
[http://www.w3.org/TR/2002/WD-soap12-af-20020814/].


Considerations
--------------

* If existing packaging schemes (e.g., Multipart-MIME, DIME, ZIP, tar,
  jar, etc.) meet the requirements, or represent sensible tradeoffs,
  then the specification SHOULD use such existing schemes.

* The specification should, where reasonably practical, be designed to
  facilitate message construction, parsing, debugging, tracing, and
  other diagnostic activities.


General Requirements
--------------------

R8. The specification must describe its relationship to the
     properties defined in Table 1 (att:SOAPMessage and
     att:SecondaryPartBag) in the SOAP 1.2 Attachment Feature
     specification.

R9. The specification must describe its points of extensibility.

R15. The specification should be conveniently describable by languages such
     as WSDL.
     [WSDL should have enough extensibility to handle reasonable
     new attachment specifications including ours.  Our spec should
     be reasonably describable by languages such as WSDL.]

R24. The specification should include sample changes to WSDL 1.2 and/or
     extensions to WSDL.  [Should this be decided by the WSCG?]

R17. The specification must work with the SOAP 1.2 HTTP binding and
     shouldn't unnecessarily preclude working with other bindings.



Representation
--------------

R1. The specification must define a means to carry multiple data
    parts.

R2. The specification must define a means for parts to carry
    arbitrary data, including non-XML data (e.g., binary data and XML
    fragments).

R3.  The specification should support efficient implementation of:
     a) parsing the physical representation to separate and identify its 
        constituent parts.
     b) programming systems which efficiently resolve a URI to retrieve the 
        data (and metadata) comprising the corresponding part.

R4.  The specification should use a reasonably space-efficient
     representation.

R5. The representation should efficiently support the addition and
     deletion of parts by intermediaries.

R13. The specification must provide support for arbitrarily large
      parts.

R18. The specification must define a mapping between the attachment
     representation and a standalone SOAP message.  For example, this may aid
     down-level receivers that do not understand this specification.


<jeff href="http://lists.w3.org/Archives/Public/xml-dist-app/2003Jan/0032.html">
DR19. The specification must enable efficient allocation of buffers by
      receivers.
</jeff>

<sanjiva href="http://lists.w3.org/Archives/Public/xml-dist-app/2003Jan/0034.html">
I'm again confused; while a statement like "this spec must be
implementable as efficiently as possible" is reasonable (and
motherhood-and-apple-pie IMO), speaking specifically about 
buffer allocation seems rather pointed. 
</sanjiva>

<barton href="http://lists.w3.org/Archives/Public/xml-dist-app/2003Jan/0033.html">
This one motivates some of the other requirements but it implies that
the sender understand the receiver's memory allocation capabilities.
On one extreme the requirement could amount to "give the content
length of attachments up front", but at the other extreme it
could require the interleaving of parts to achieve a serialization
optimal for receiver processing.

As an example of the latter, the UPNP Printing folks worried about how
an extremely long XHTML doc with many inline images could be printed
with one page buffer.  While that may seem like an example far from
the one most SOAP folks consider, once you get to pipelined processing
of composed SOAP services the differences begin to fade.  These are
cases you want to be able to handle and they are cases that non-XML
systems deal with.

Of course the serialization of XHTML is well-defined.  Serialization
for arbitrary receiver processing isn't.  That makes this requirement
difficult to spell out absent information on the receiver buffer
capability.  Consequently one might go for a requirement that asks the
spec. to allow attachments to be placed in the stream physically near
their first point of XML reference rather than getting into buffers.
That would pick up the critical use case without getting mired in an
open-ended problem.
</barton>

<noah href="http://lists.w3.org/Archives/Public/xml-dist-app/2003Jan/0037.html">
I think we can say: "Attention should be given to likely implementation
optimizations."  (I agree with Sanjiva, going much beyond that is too
specific.)
</noah>

<jeff href="http://lists.w3.org/Archives/Public/xml-dist-app/2003Jan/0055.html">
I can live with this, but as John Barton points out, there is a
specific efficiency concern associated with unbounded buffering
required by the receiver.
</jeff>

<barton>
Sanjiva, the key words here are "by receivers".  The serialization
mechanism can have serious impacts on resource constrained or
heavily loaded receivers.  Emitting a SOAP message in an
HTTP-style MIME-like format without content-length headers leaves
the receiver with no recourse but multiple buffering layers and repeated
dynamic memory allocations as more content arrives.  For resource
constrained receivers, the result is late and annoying buffer overflow;
for heavily loaded receivers, the result is poor performance.

This is, unfortunately, not apple-pie since typically a receiver-friendly
protocol requires resources to be spent on the sender, e.g., to count
the bytes as the package is assembled.  The specification will
shift real costs.
</barton>

<jeff href="http://lists.w3.org/Archives/Public/xml-dist-app/2003Jan/0052.html">
John, well put. I hope the AFTF agrees. --Jeff
</jeff>
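
[Illustration of the DR19 discussion above.  This is a minimal sketch,
not part of any proposal: the function name read_part and its
parameters are hypothetical.  It contrasts a receiver that knows the
part size up front (one allocation) with one that must grow its
buffer as content arrives.]

```python
import io


def read_part(stream, declared_size=None, chunk_size=4096):
    """Read one part from a binary stream.

    Returns (data, growths), where growths counts how many times the
    receive buffer had to be extended. Hypothetical helper, for
    illustration only.
    """
    if declared_size is not None:
        # Size known in advance: allocate exactly once and fill in place.
        buf = bytearray(declared_size)
        view = memoryview(buf)
        filled = 0
        while filled < declared_size:
            n = stream.readinto(view[filled:])
            if not n:
                raise EOFError("part shorter than declared size")
            filled += n
        return bytes(buf), 0

    # Size unknown: keep extending the buffer until the stream ends.
    buf = bytearray()
    growths = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        buf.extend(chunk)  # may reallocate and copy
        growths += 1
    return bytes(buf), growths


payload = b"x" * 10000
print(read_part(io.BytesIO(payload), declared_size=len(payload))[1])
print(read_part(io.BytesIO(payload))[1])
```

The sender pays for the first path by counting bytes before emitting
the part, which is the cost shift John describes.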



<jeff href="http://lists.w3.org/Archives/Public/xml-dist-app/2003Jan/0032.html">
DR20. The specification must allow messages to be secured using the
mechanisms defined in WS-Security.
</jeff>

<sanjiva href="http://lists.w3.org/Archives/Public/xml-dist-app/2003Jan/0034.html">
WS-Security only applies to SOAP envelopes. This requirement would
hence have the effect of precluding MIME/DIME style packaging ..
</sanjiva>

<noah href="http://lists.w3.org/Archives/Public/xml-dist-app/2003Jan/0037.html">
+1
</noah>

<jeff href="http://lists.w3.org/Archives/Public/xml-dist-app/2003Jan/0055.html">
It is not at all clear that using WS-Security precludes MIME or DIME
style packaging. WS-Security applies to an Infoset, and MIME and DIME
may (or may not) end up being a serialization of the Infoset.

As David has pointed out, we must define how to secure messages; it
would seem unnatural for us to not reference the emerging Web Services
security technology.
</jeff>



<davidO href="http://lists.w3.org/Archives/Public/xml-dist-app/2003Jan/0044.html">
DR27. The specification should support securing of messages and message
parts, such as use of encryption and signatures, in a simple manner.

This is different than the proposed "support ws-security requirement", in that it
covers application of encryption and signature without necessarily meaning
use of ws-security.
</davidO>



<gudge href="http://lists.w3.org/Archives/Public/xml-dist-app/2003Jan/0051.html">
DR29.  A message with all its parts, however separated physically, must
be representable as a single infoset and describable as a single XML
element in an XML schema.
</gudge>

<sanjiva href="http://lists.w3.org/Archives/Public/xml-dist-app/2003Jan/0056.html">
Is this more a WSDL level requirement or a packaging requirement? If
it's the latter, isn't it basically saying the packaging must be a
single XML element?

Even if the serialization of each of the parts is in XML, why do you
want to preclude the following model:
    <soap:envelope>
      <soap:body>
        <the main thing goes here/>
        <"attachment" 1 goes here/>
        <"attachment" 2 goes here/>
        ...
      </soap:body>
    </soap:envelope>

Or is this kind of packaging supported in your requirement? (I can't
tell.) Does it preclude a MIME (e.g., SwA) packaging?
</sanjiva>

<gudge href="http://lists.w3.org/Archives/Public/xml-dist-app/2003Jan/0057.html">
> Is this more a WSDL level requirement or a packaging 
> requirement? 

I think you could argue that the second clause of the sentence is a WSDL
requirement.

> If it's the latter, isn't it basically saying the 
> packaging must be a single XML element?

I do not see 'representable as a single infoset' as meaning 'packaging
must be a single XML element'.

> 
> Even if the serialization of each of the parts is in XML, why 
> do you want to preclude the following model:
>     <soap:envelope>
>       <soap:body>
>         <the main thing goes here/>
>         <"attachment" 1 goes here/>
>         <"attachment" 2 goes here/>
>         ...
>       </soap:body>
>     </soap:envelope>
> 
> Or is this kind of packaging supported in your requirement? (I can't
> tell.) 

I believe the requirement allows the above ( the single XML element
would in this case be either soap:Body or soap:Envelope ).

> Does it preclude a MIME (e.g., SwA) packaging?

I do not believe that this requirement precludes any particular
packaging scheme, per se.
</gudge>

<noah href="http://lists.w3.org/Archives/Public/xml-dist-app/2003Jan/0059.html">
I have a concern with this proposed requirement.  First of all, I think it 
is really proposing a change to the SOAP 1.2 Attachment Feature WD [1], 
and not directly to the implementations for which we are gathering 
requirements.   [1] is fairly clear that attachments are to be named with 
URIs and accessed using the normal mechanisms of the web (though the 
actual resolution of the URIs, such as CID:, is presumably provided by the 
concrete embodiment of the feature in DIME, S+A or whatever.)

Furthermore, I prefer the status quo in [1].  I think the sorts of 
information we are trying to carry are best typed with MIME types;  I 
believe that the URIs that refer to those attachments fit comfortably into 
the Envelope infoset (as xsd:anyURI elements and/or attributes), but that 
the resource representations themselves do not.  I want to be able to
say:  "this part is of type image/gif".

...

I think we have a pretty good data model for attachments, and it's the Web 
model not an XML infoset.  The XML Infoset is the SOAP envelope.  It can, 
using the usual mechanisms of the Web, make references to resources using 
URIs.  Some of those resources (or representations of them) will be 
physically packaged with the message, and those we call attachments.  In 
the cases of interest (as opposed to mailto: URIs), the resources should 
be capable of providing MIME-typed representations of themselves using the 
normal mechanisms of the Web.  So, when the URIs are http: URIs, the 
resources are (probably) not thought of as attachments and are retrieved 
using the normal mechanisms of http:.  Each particular packaging scheme, 
as described in [1], defines the means by which it uses some particular 
set of URIs for retrieval of representations of attachments.

That's it.  I think it's a reasonable model.  I think WSDL can model it. 
Indeed, I think WSDL needs sooner or later to support this model for 
non-attachment data.  Applying it to attachments is just more metadata, I 
think (this URI will refer to a resource that travels with the message, 
this one won't, and this third one could be either way.)

I really haven't seen either a motivation or an architecturally strong 
design for including image/gif data in an XML Infoset.  Actually, let me 
soften that.  I think the data model given 2 paras above is the right one 
for users.  If someone wants to do a second packaging that uses the XML 
Schema hexBinary or base64Binary and that puts the parts in SOAP headers, 
expanded to character form, I think that might be worth considering.  It's 
not a solution, IMO, to the requirement that we carry binary as binary, 
which is what we're supposed to do here. 

I am very much opposed to any proposal that directly or indirectly creates 
a binary data model in the Infoset at this time.  I think it's a very 
subtle thing to get right, it needs to be very carefully lined up with at 
least the query data model, it breaks a lot of the things we hold dear 
about XML as a text standard, and I certainly don't think it's something 
we should back into in the course of doing attachments.
</noah>

<chris href="http://lists.w3.org/Archives/Public/xml-dist-app/2003Jan/0060.html">
+1
</chris>

<davidO href="http://lists.w3.org/Archives/Public/xml-dist-app/2003Jan/0061.html">
> That's it.  I think it's a reasonable model.  I think WSDL
> can model it.
> Indeed, I think WSDL needs sooner or later to support this model for
> non-attachment data.  Applying it to attachments is just more
> metadata, I
> think (this URI will refer to a resource that travels with
> the message,
> this one won't, and this third one could be either way.)
>

This is part of the problem, imo.  What the WSDL modelling is should be
known rather than supposed.  If it turns out that WSD modelling is quite
onerous, then that doesn't meet the simplicity requirement.

I fully expect that any solution will also address WSD modelling.

I also expect that part of the trade-off on solution selection will include
how the WSDL modeling differs, with preference for simpler.  To me, a key
requirement is "simple WSDL modelling".

Each and any solution is a trade-off on requirements - including the web
btw - and should take into account relevant requirements.  WSD modelling is
a relevant requirement.
</davidO>

<richSalz href="http://lists.w3.org/Archives/Public/xml-dist-app/2003Jan/0062.html">
What's the Infoset description of an external XML message with its own DTD?

What's the rationale for wanting to impose the Infoset model on
anything someone might want to reference from a SOAP message?  What kinds
of things do you think would be gained and lost from this approach?

Without knowing more details, I don't feel comfortable saying more than
this doesn't seem like the right thing to do.  (Well, okay, it makes me
want to hurl, to be more accurate.)  But I would like to know the answers
to my questions above.

In another message, Rich also gives +1 to not creating a binary data
model for the Infoset.
</richSalz>

<chris href="http://lists.w3.org/Archives/Public/xml-dist-app/2003Feb/0000.html">
As I understand it, this requirement would seem to preclude the ability to 
carry an XML document in a message. 

Quoting from the XML Infoset spec:

"There is exactly one document information item in the information
set, and all other information items are accessible from the
properties of the document information item, either directly or
indirectly through the properties of other information items."

Suppose I want to offer a Web service that performed spell-checking of
documents.  This requirement would preclude this sort of service so it
would seem. In fact, it would seem to preclude any service that
operated upon a document.
</chris>



Metadata
--------

R21.  The specification should provide convenient means for extending the 
      metadata carried with a message.

R31.  The specification should provide convenient means for extending the 
      metadata associated with individual parts.

R22.  The specification should provide a means by which any or all parts 
      MAY be labeled with associated MIME types.  (I.e. applications sending a 
      message are not obligated to label parts with MIME types, but the 
      specification must provide for carrying the MIME type if provided.)

R30. The specification must provide an optional facility for specifying
     part size in advance.
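
[Illustration of R22 and R31.  A minimal sketch assuming a
Multipart-MIME style packaging, which is only one of the candidate
schemes; build_package, the example.org Content-IDs, and the fallback
to application/octet-stream for unlabeled parts are assumptions of
this sketch, not agreed behavior.  It uses Python's standard email
package.]

```python
from email import encoders, message_from_bytes
from email.mime.base import MIMEBase
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def build_package(envelope_xml, attachments):
    """Build a multipart/related package.

    attachments is a list of (content_id, data, mime_type_or_None).
    Hypothetical helper, for illustration only.
    """
    package = MIMEMultipart("related")

    # Primary part: the SOAP envelope.
    root = MIMEText(envelope_xml, "xml", "utf-8")
    root.add_header("Content-ID", "<envelope@example.org>")
    package.attach(root)

    for content_id, data, mime_type in attachments:
        # The sender is not obliged to label a part; this sketch falls
        # back to a generic type when no label is supplied.
        maintype, subtype = (mime_type or "application/octet-stream").split("/")
        part = MIMEBase(maintype, subtype)
        part.set_payload(data)
        encoders.encode_base64(part)
        part.add_header("Content-ID", f"<{content_id}>")
        package.attach(part)
    return package


pkg = build_package(
    "<env/>",
    [("part1@example.org", b"GIF89a", "image/gif"),
     ("part2@example.org", b"\x00\x01", None)],
)
for part in message_from_bytes(pkg.as_bytes()).get_payload():
    print(part["Content-ID"], part.get_content_type())
```

Further per-part metadata (R31) would be carried as additional
headers on each part in this style of packaging.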


Reference to Parts
------------------

DR6. The specification must permit parts to be identified by URIs.

<chris href="http://lists.w3.org/Archives/Public/xml-dist-app/2003Jan/0025.html">
Hmmm... I think that the specification should require that parts be  
identified by URI, but that they may be identified using other means  
as well. Of course, they could be identified by relative URI, not just 
absolute URI. 
</chris>

<noah href="http://lists.w3.org/Archives/Public/xml-dist-app/2003Jan/0037.html">
+1 except for the references to relative URI.  I think we want:  The 
specification must provide that each part be identified by an (at least 
one) absolute URI.

I think issues of relative should be above our level.  If some system 
(e.g. SOAP itself) wants to provide base URI and resolve relatives to 
absolute, that's fine, but we don't worry about that I think.  I would not 
want a part to be known at the deepest level as "../p".
</noah> 

<markJ href="http://lists.w3.org/Archives/Public/xml-dist-app/2003Jan/0028.html">
We can consider your wording instead.
</markJ> 

<davidO href="http://lists.w3.org/Archives/Public/xml-dist-app/2003Jan/0044.html">
(alternate) DR6. The specification must permit parts to be identified by URIs or URI
References.

This is similar to ChrisF's comment.
</davidO>

<noah href="http://lists.w3.org/Archives/Public/xml-dist-app/2003Jan/0045.html">
I am a little surprised.  I would have thought that what we want is:

* The identity of each part is a URI (I.e. an absolute URI)

* References to parts are in the form of URI references (which are 
resolved through the usual mechanisms to yield the absolute URI).

David:  are you really saying that you want to allow "../a" as the 
identity of a part?  Thanks.
</noah>

<davidO href="http://lists.w3.org/Archives/Public/xml-dist-app/2003Jan/0047.html">
../a has nothing to do with URI References vs URIs.  ../a is allowed by URIs
and by URI references.  You might be thinking of absolute URIs however :-)

URI References are URIs that may have fragments.  Oh darn, we don't have a
term for a URI that has an absolutized portion that may have fragments.
</davidO>

<noah href="http://lists.w3.org/Archives/Public/xml-dist-app/2003Jan/0048.html">
I stand corrected.  You're right of course.  Still, I would think that we 
want to follow web architecture.  As far as I know, that means that the 
resource which is a part should be identified by an absolute URI (not 
relative, NO fragment ID.)  References to the part as a whole should allow 
relative and absolute forms.  References within parts that have known 
media type should allow URI References, including fragment ID.

Bottom line:  a part is named by an absolute URI.  References are in the 
form of URI references, but Fragid is a reference within the part. 
Specifically, two references that differ only in their fragid must resolve 
to the same part.

Also:  on the phone call I suggested a requirement that the attachment 
implementation be capable of carrying a media type for each part.

David:  does this sound right?
</noah>

<davidO href="http://lists.w3.org/Archives/Public/xml-dist-app/2003Jan/0049.html">
Web architecture doesn't stipulate absolute URIs.  I would like to allow
frag ids, specifically so that parts could actually be fragments within an
xml document.  One example would be a soap with attachments package that
contains 2 xml documents, and the first refers to a part that is within the
2nd xml document.  I expect that in most cases, people would use absolute
URIs, but I can think of scenarios where they would want a fragment.  Let's
make this a bit more concrete.  I want to chunk a large xml document.  Say I
decide to split this into 2 documents. I could use an xinclude in the first
to refer to the 2nd, and I have an application that reads the first chunk,
then afterwards resolves the xinclude.  As XML requires a root node, the
XInclude has to point to a fragment in the 2nd document, specifically all
the children of the root node.

Now if a new version of XML allowed xml to not have a root node, like
external entities, this might be solved. :-)

I absolutely agree with carrying the media type.  Violently in fact.  These
documents, and parts, must be correctly self-describing.  Now that's web
architecture!
</davidO>

<noah href="http://lists.w3.org/Archives/Public/xml-dist-app/2003Jan/0050.html">
>>  I would like to allow frag ids, specifically 
>> so that parts could actually be fragments within an
>> xml document.  One example would be a soap with 
>> attachments package that contains 2 xml documents, 
>> and the first refers to a part that is within the
>> 2nd xml document. 

Hmm.  This is an interesting idea, and I can see the merits.  On the other 
hand, don't we then lose the ability for the parts themselves to have a 
MIME type and for fragments to reference within the parts?  I wonder 
whether that isn't the more important use case.  I'm nervous about trying 
to allow both at the same time.  Does the web even allow:  xxxx#a#b  to 
reference a piece of a part that is itself within an XML document? 

I think the design point for parts is only secondarily XML within XML, I 
think it's primarily non-XML data, and I think MIME types are the obvious 
web-compatible way to handle that.   I think it's important that 
attachments are just web resource (or at least representations of web 
resources) that happen to travel with the messages.  I'm not sure your 
proposal is compatible with that view.
</noah>
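
[Illustration of Noah's "bottom line" in the DR6 thread above: a part
is named by an absolute URI, references may be relative and may carry
a fragment id, and two references that differ only in their fragid
resolve to the same part.  A minimal sketch; resolve_part_reference
and the example.org base URI are hypothetical, and it does not model
David's alternative in which a part may itself be a fragment.]

```python
from urllib.parse import urldefrag, urljoin


def resolve_part_reference(base_uri, reference):
    """Resolve a URI reference to (absolute part URI, fragment id).

    The fragment, if any, is a reference *within* the part and takes
    no part in identifying it. Hypothetical helper, for illustration.
    """
    absolute = urljoin(base_uri, reference)
    part_uri, fragment = urldefrag(absolute)
    return part_uri, fragment


base = "http://example.org/msg/envelope"
print(resolve_part_reference(base, "parts/photo#top"))
print(resolve_part_reference(base, "parts/photo#bottom"))
print(resolve_part_reference(base, "../a"))
```

Under this view "../a" is acceptable as a reference but is resolved
to an absolute URI before it is compared with the identity of a part.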

DR7. The URI identification scheme must be robust under the addition
     and deletion of parts -- i.e., it must not require that URIs to
     other parts be altered, it must be relatively easy to avoid URI
     conflicts, etc.
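
[Illustration of DR7.  A minimal sketch of one way to meet it, under
the assumption that parts are keyed by sender-generated cid: URIs
rather than by position; the class name PartBag and the example.org
domain are hypothetical.  Because identifiers do not depend on
ordering, an intermediary can add or delete parts without rewriting
URIs that refer to other parts.]

```python
import uuid


class PartBag:
    """Parts keyed by URI. Hypothetical container, for illustration."""

    def __init__(self):
        self._parts = {}

    def add(self, data, uri=None):
        # A random UUID makes accidental conflicts very unlikely even
        # when several intermediaries add parts independently.
        uri = uri or f"cid:{uuid.uuid4()}@example.org"
        if uri in self._parts:
            raise ValueError(f"URI conflict: {uri}")
        self._parts[uri] = data
        return uri

    def remove(self, uri):
        del self._parts[uri]

    def get(self, uri):
        return self._parts[uri]

    def uris(self):
        return list(self._parts)


bag = PartBag()
first = bag.add(b"first")
second = bag.add(b"second")
bag.remove(first)            # deletion leaves other URIs untouched
third = bag.add(b"third")    # addition needs no renumbering
print(bag.get(second))
```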


DR11. (a) The specification should permit an initial human readable
          part.
      (b) The specification should not specify a particular ordering
          of parts.
      [still noodling on which version to prefer]

<chris href="http://lists.w3.org/Archives/Public/xml-dist-app/2003Jan/0025.html">
Not sure I follow this... 
</chris>

<markJ href="http://lists.w3.org/Archives/Public/xml-dist-app/2003Jan/0028.html">
There was some sentiment for flexibility in part ordering -- for
example, having a text part preceding even the SOAP message.
</markJ>

<noah href="http://lists.w3.org/Archives/Public/xml-dist-app/2003Jan/0037.html">
Right.  I also think the notion of "initial" is fuzzy.  Is it within the 
first 100 bytes?  Is it no binary data between the start of message and 
this initial part (so you can use text tools to get that far)?  Does it 
preclude interleaving?  I think this is too specific and we should drop 
it.
</noah>
 
<davidO href="http://lists.w3.org/Archives/Public/xml-dist-app/2003Jan/0044.html">
preferred wording is (b)
</davidO>


DR12. The SOAP message part should be readily locatable/identifiable.

<chris href="http://lists.w3.org/Archives/Public/xml-dist-app/2003Jan/0025.html">
Should it not be the case that ALL parts be identified, identifiable?  
What would make the SOAP part unique in this regard? 
</chris>

<markJ href="http://lists.w3.org/Archives/Public/xml-dist-app/2003Jan/0028.html">
We wanted to make sure if there were multiple SOAP message
parts that we could identify which one was the primary part and which
were attachments.  This may be an issue if order were arbitrary, for
example.
</markJ>
 
<noah href="http://lists.w3.org/Archives/Public/xml-dist-app/2003Jan/0037.html">
+1 but suggests

(alternate) DR12.  The primary (SOAP) message part should be readily 
locatable/identifiable.

I think this correctly layers the packaging abstraction (part) from its 
use by SOAP.
</noah>

<davidO href="http://lists.w3.org/Archives/Public/xml-dist-app/2003Jan/0044.html">
(alternate) DR12. Any message parts should be readily locatable/identifiable.
</davidO>


DR16. The part identifier scheme to be determined by sending
      application.

<chris href="http://lists.w3.org/Archives/Public/xml-dist-app/2003Jan/0025.html">
"scheme" seems to imply "URI", but my guess is that it does not.  
Again, I would strongly recommend that parts be identified by URI  
(relative or absolute).  
</chris>

<markJ href="http://lists.w3.org/Archives/Public/xml-dist-app/2003Jan/0028.html">
URI is what I have in mind.
</markJ>

<noah href="http://lists.w3.org/Archives/Public/xml-dist-app/2003Jan/0037.html">
No.  I think that URI schemes should be used according to their 
definition.  This should not be a round-about way of enabling the caching 
scenario (if that's what's intended.)  Caching can be enabled with a SOAP 
feature (mapping an HTTP: URI to a CID:, for example).  The part in the 
message is unlikely to be correctly id'd directly with an HTTP URI (unless 
we're doing lazy pull through an http network.)
</noah>


<davidO href="http://lists.w3.org/Archives/Public/xml-dist-app/2003Jan/0044.html">
DR26. The specification should support streaming of parts, i.e., chunked
encoding.  A sample scenario of this should also be provided.
</davidO>

<marcH href="http://lists.w3.org/Archives/Public/xml-dist-app/2003Jan/0053.html">
Isn't chunking a solution to streaming rather than a requirement?
</marcH>

<noah href="http://lists.w3.org/Archives/Public/xml-dist-app/2003Jan/0058.html">
Agreed.  Actually, I think it may be viewed as a solution to interleaved 
streams, in which more than one stream makes progress at a time, perhaps 
in a manner that's correlated at the application level (e.g. television 
frames with metadata about each interleaved.)  I've always been a bit 
nervous that SOAP isn't well engineered to facilitate this.  I think it 
basically didn't make the 80/20 cut.  I'm very much on the fence whether 
it's a good requirement to adopt now, as I suspect that doing it only at 
the attachment level begs a lot of questions about the higher level 
abstraction supported (which is really your point, I think.) Thanks.
</noah>
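
[Illustration of the DR26 discussion above.  As Marc notes, chunking
is one possible solution to streaming, not the requirement itself.
This minimal sketch follows the general shape of HTTP/1.1 chunked
transfer coding (hex length, CRLF, data, CRLF, terminated by a
zero-length chunk); encode_chunked and decode_chunked are
hypothetical names, and chunk extensions and trailers are not
handled.]

```python
import io


def encode_chunked(chunks):
    """Yield wire-format chunks for an iterable of byte strings."""
    for chunk in chunks:
        if chunk:  # a zero-length chunk would signal end of stream
            yield b"%x\r\n" % len(chunk) + chunk + b"\r\n"
    yield b"0\r\n\r\n"


def decode_chunked(stream):
    """Yield each chunk's data as soon as it has been read."""
    while True:
        size = int(stream.readline().strip().decode("ascii"), 16)
        if size == 0:
            stream.readline()  # consume the final CRLF
            return
        data = stream.read(size)
        stream.read(2)  # consume the CRLF that follows the data
        yield data


wire = b"".join(encode_chunked([b"hello ", b"world"]))
print(wire)
print(list(decode_chunked(io.BytesIO(wire))))
```

The sender never needs the total part size, which is the tension with
an optional up-front size (R30) and with the buffer concerns in DR19.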


<davidO href="http://lists.w3.org/Archives/Public/xml-dist-app/2003Jan/0044.html">
DR28. The specification may provide manifest functionality.
</davidO>
Received on Monday, 3 February 2003 09:49:13 UTC