- From: David Orchard <david.orchard@bea.com>
- Date: Wed, 13 Feb 2002 15:33:46 -0800
- To: "'Takeshi Imamura'" <IMAMU@jp.ibm.com>, <reagle@w3.org>
- Cc: "'xenc'" <xml-encryption@w3.org>, <www-xenc-xmlp-tf@w3.org>, <xml-dist-app@w3.org>, "'Hiroshi Maruyama'" <MARUYAMA@jp.ibm.com>

Takeshi, Joseph, et al,

In this note I will focus on one particular issue: context information about an XML document and its contents.

I'd like to point out that the TAG is currently examining the issue of dispatch based upon content-type, namespace names, and root element names [1]. Further, the TAG has an outline of an architecture document that defines a generic dispatch model based upon content-type and namespaces. There is a proposed algorithm contained therein [2]. The TAG has not reached consensus yet on these issues though. There are a number of aspects of the TAG discussions that may be relevant, particularly the registering of content-types and dispatch mechanisms. I had hoped that the TAG would be able to publish a finding on dispatch based upon our F2F meeting on Tuesday Feb 12th, but we need some more tweaking on the wording. I do feel comfortable saying that registration of a content type for an XML vocabulary that may be embedded within another and is intended to be dispatched on - like xenc and xslt - is likely to be an outcome.

My principal concern on the namespace name issue is to ensure that an XML document with its content information is accurate and self-describing.

The net is I have 2 proposals for XMLE and would like to entertain discussion on 2 others:

1. XML Encryption registers a content-type, perhaps application/xmlencrypted.
2. XML Encryption says, in wording to this effect, that a document containing a vocabulary that also contains encrypted content, where the decryption is required to make the instance conform with the vocabulary namespace name, MUST provide some metadata to indicate decryption is required. Changed namespace names, content-type (this is my step #7 in the algorithm following), and SOAP mustUnderstand intermediaries (step #6 in the algorithm following) are valid examples of this.
3. The XMLE group entertain the definition of algorithms for dispatch based upon content-type, namespace names, and/or other metadata.
   It may be that this is an XMLE/XMLP liaison issue or a web services architecture wg issue though.
4. The XMLE/XMLP liaison group discuss the definition of a metadata solution to the problem described in proposal #2, or ask the wsawg to suggest/allocate resources or wgs to define such a solution. Given the lack of traffic on xenc-xmlp, we may wish to raise this to the wsawg. But maybe this note will spawn some traffic ;-)

I find your option #4 to be interesting and worthwhile pursuing and very appropriate for SOAP, though I suggest a different approach that allows decryption to occur without the use of a decrypting actor. The solution of creating metadata is exactly the right path to go down. We could try to merge the content-type for messages concept with the content-type for header blocks. The use of a content-type for the whole message solves this problem for messages outside the SOAP context, so I think we could apply this to headers.

What I think we might need is a "content-type" for the block content. For now, my thought is that a new SOAP attribute of content-type would be sufficient. This would contain a content-type value as defined by MIME content-type. BTW, this may have nice side effects, as charset information could be encoded in the content-type. This also should work well with existing infrastructure to do content-type dispatch.

Taking a look at the header more closely, my proposal is that the header is labeled with a content-type of application/xmlencryption, i.e.

   <header><po:po xmlns:po="..." soap:content-type="application/xmlencryption"><xenc:encrypteddata/></po:po></header>

In this fragment, the po namespace name knows nothing about the encrypted contents. Decryption of the xenc block is required to reconstruct the po. This is very similar to the case where a message has content type application/xmlencryption and contains a PO that also contains encrypted data.

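
To make the header labelling above concrete, here is a minimal sketch of how a receiving node might pick out the header blocks that carry the proposed content-type attribute. It is an illustration only, not part of any spec: the content-type attribute is just the proposal in this note, the `urn:example:...` namespace names are placeholders standing in for the elided "..." values, and the helper names `media_type` and `blocks_needing_decryption` are made up for this example.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace names (assumptions): the note elides the real values.
SOAP_NS = "urn:example:soap"
PO_NS = "urn:example:po"
XENC_NS = "http://www.w3.org/2001/04/xmlenc#"  # XML Encryption namespace

# The proposed (hypothetical) SOAP attribute, in ElementTree's {ns}local form.
CONTENT_TYPE_ATTR = f"{{{SOAP_NS}}}content-type"
ENCRYPTED_TYPE = "application/xmlencryption"

HEADER = f"""<header xmlns:soap="{SOAP_NS}" xmlns:po="{PO_NS}" xmlns:xenc="{XENC_NS}">
  <po:po soap:content-type="{ENCRYPTED_TYPE}"><xenc:EncryptedData/></po:po>
  <po:note>sent in the clear</po:note>
</header>"""


def media_type(value):
    """Strip MIME parameters such as charset and normalise case."""
    return value.split(";")[0].strip().lower()


def blocks_needing_decryption(header):
    """Return the header blocks (direct children) labelled as encrypted."""
    return [
        block
        for block in header
        if media_type(block.get(CONTENT_TYPE_ATTR, "")) == ENCRYPTED_TYPE
    ]


header = ET.fromstring(HEADER)
flagged = blocks_needing_decryption(header)
for block in flagged:
    # A real node would hand the block to its decrypting service here.
    print("needs decryption:", block.tag)
```

Note that the node never looks inside the po block for xenc elements; the label on the block alone drives the dispatch, which is the point of the proposal.
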
Arguably the content-type concept is not a SOAP concept, but more of an XML concept brought on by mixing namespaces. But I believe it belongs in SOAP because SOAP has an explicit message path model that may require decryption/encryption at various points in the message path AND we wish to keep SOAP as content-type application/soap.

There is a 5th option that I propose to add to your list of options:

5. The namespace name is left as is, and a content-type of application/xmlencryption is used for messages and/or blocks. A receiver then dispatches to a decrypting service for decryption. Presumably this would know which encrypted elements to decrypt, as well as how to dispatch to the next node.

I see a rough algorithm of dispatch relating to XMLE:

1. If the entire message is encrypted and sent, then the root element will be changed to the encryption element. The first receiving node can do dispatch based upon the root element or the encryption namespace name.
2. If a portion of a message is encrypted, and the message cannot be interpreted without decryption first, then a content-type of application/xmlencryption must be used. The namespace name can be left as the unencrypted namespace name. The receiving node dispatches based upon the content-type. Your second example fits here. This technique has been used for compression as well, with content type application/gzip used.
3. If the encryption is baked into the application context, then the application's namespace name accurately reflects the fact that the application "understands" encryption. Dispatch based upon namespace name or root element is possible.
4. In the SOAP context, decryption of the content of a header may be required before the node can process it. In this case, complete encryption of the header contents is done. The header handler dispatches to the decryptor based upon the header block's "root element", which happens to be an xmle element. This is similar to my algorithm element #1.
   This does not use an explicit actor. Hiroshi's example of SOAP-SEC body does NOT follow this model [3] as it uses an explicit actor. <header><xenc:encrypteddata/></header> is a sample.
5. In the SOAP context, the encryption may be baked into the header context. Dispatch is based upon the namespace name or root element. This is similar to my algorithm element #3. <header><encryptionawareheaderns:foo/></header> is a sample.
6. SOAP dispatch is done as per the actor attribute. The intermediary specified by the actor will know what to do with any encryption/decryption required, including other portions. Hiroshi's example of SOAP-SEC header follows this model [3]. <header><SOAP-SEC:Encryption actor="some-URI" mustUnderstand="1"/></header> is a sample.
7. In the SOAP context, there is support for a descendant of a header block to be encrypted, with decryption required at the header node before handling is possible, IFF meta-data is supplied. A header without some meta-data attribute(s) to indicate decryption processing for cases of a descendant needing decryption is NOT allowed. Thus

   <header><po:po xmlns:po="..."><xenc:encrypteddata/></po:po></header>

   is NOT allowed where the encrypteddata is not understood by the po vocabulary. I've suggested that a content-type field would be useful here, so

   <header><po:po xmlns:po="..." soap-env:content-type="application/xmlencrypted"><xenc:encrypteddata/></po:po></header>

   would be sufficient to do dispatch. This scenario allows intermediaries to be built that do not require an explicit actor to do decryption, the way step #6 and the soap-sec example do. Targeting decryption actors worries me as we start adding many different types of headers, some of which may be encrypted and some not. This eerily feels like targeting a schema validator.

Step #7 is the key example that I have been concerned with wrt namespace names. It is almost exactly like the xslt example where xhtml is the containing element, yet an xhtml processor cannot understand the xslt content.
Dispatch to an XSLT processor is required, but this can't be done by the namespace name nor the root element. Hence the discussion about content-types. In example #7 dispatch to the decrypting software is required, but the software can't determine this from the po header; it has to *magically* know based upon the presence of an xenc namespace name inside!

Another algorithm would be to define something in the XMLE namespace along the lines of "secured", but that would percolate xmle into the header block.

Cheers,
Dave

[1] http://lists.w3.org/Archives/Public/www-tag/2002Jan/0177.html
[2] http://www.w3.org/2001/tag/doc/toc
[3] http://lists.w3.org/Archives/Public/www-xenc-xmlp-tf/2001Dec/0001.html

> -----Original Message-----
> From: Takeshi Imamura [mailto:IMAMU@jp.ibm.com]
> Sent: Tuesday, February 12, 2002 8:20 AM
> To: David Orchard; reagle@w3.org
> Cc: 'xenc'; www-xenc-xmlp-tf@w3.org; xml-dist-app@w3.org; Hiroshi Maruyama
> Subject: Re: XMLP/XMLE Use cases and processing models
>
> >> Questions
> >> ----------
> >> 1. I have anticipated that the intermediary has to know about the
> >> receiver schema and must expose the "merged" schema or a schema without
> >> the unencrypted credit card info to senders. There is tight coupling
> >> there, but how else would the intermediary know which messages and which
> >> portions to decrypt? Is this valid or does XMLE expect that a SOAP
> >> encrypting/decrypting intermediary would not know about the receiver
> >> schemas nor would it expose a "merged" schema? This assumes the first or
> >> second solution in the processing requirements (3.2.2) [1].
> >
> >I'd be interested in some of the xenc implementors thoughts on this since
> >they best know how they want to process the data. However, given my
> >ignorance I could see a few options:
> >
> >1. The namespace is changed to indicate the change of the instance. This
> >means you will have to have a namespace for every encrypted variant.
> >This is probably fine for some applications (e.g., they know they only care
> >about encrypting the credit-card data and nothing else so creating a new
> >namespace and schema isn't very difficult.) Others may care less about
> >validation and more about flexibility.
> >2. One "pre-scan's" the document looking for xenc:* elements. For example,
> >during parsing (e.g., XNI [1]) one could flag such instances that trigger
> >subsequent action.
> >3. The encryption is "baked" in to the application context. For instance,
> >if you know that you'll be sending credit card data over an open network
> >there needn't be choice between the credit card data in the plain and the
> >encrypted form. The credit card is *always* sent encrypted and the
> >recipient is always decrypted at the other end.
> >4. Meta-data is used to indicate the some of the data has been encrypted.
> >For instance, to make option 3 a little more flexible, one could create a
> >SOAP confidentiality header that indicates a decryptor actor with
> >mustUnderstand="1".
> >
> >In these options there are two issues: (1) how to know if parts of the
> >document have been encrypted, (2) how to know which agent is supposed to do
> >the decryption.
>
> These issues are not only ones of XML Encryption but can be ones of others.
> So we should address them as generally and extensiblly as possible. From
> such a point of view, I think option 4 (e.g., [1]) looks the most
> reasonable.
>
> [1] http://lists.w3.org/Archives/Public/www-xenc-xmlp-tf/2001Dec/0001.html
>
> Thanks,
> Takeshi IMAMURA
> Tokyo Research Laboratory
> IBM Research
> imamu@jp.ibm.com

Received on Wednesday, 13 February 2002 18:38:25 UTC