- From: Ned Freed <NED@innosoft.com>
- Date: Tue, 21 Nov 1995 12:03:24 -0800 (PST)
- To: asg@severn.wash.inmet.com
- Cc: elevinso@Accurate.COM, ietf-types@cs.utk.edu, uri@bunyip.com
> I need another MIME lesson.

No problem ;-)

> As I look at the draft proposed schemes for mid and cid URLs, a couple
> of thoughts come up:
> 1. By construction, these two nominal schemes are one scheme and we
> should only use one name for them. MID or MIDCID are possibles.

While it's certainly possible to do this, I don't see why you'd want to.
Message-IDs and Content-IDs are distinct entities. A given part of a
message can have neither, one, or both of them.

There is also the question of scope. I see support of message-ids as a
cross-message sort of thing, preferably implemented as an index
encompassing the entire mailbox. (Preference would be given to whatever
message is "current", of course.) Content-ids, on the other hand, are
largely intended to be used within a single message. It therefore seems
logical to give some indication of scope in the scheme identifier.

I guess what I'm asking is what advantage you see in collapsing the
schemes into one. If there is a big one I guess I wouldn't mind making
such a change.

> 2. A more URL-traditional syntax would be something like
>   "mid:" //host-net-path/message-unique
> where the RFC Message-ID object was <message-unique@host-net-path>.

Ed is the expert on this one...

> 3. The structure of a MIME multipart defines a well-ordered hierarchical
> space where at each level there is a linear sequence of parts. We
> could index into this part tree with part numbers. MIME message/partial
> usage establishes the precedent that parts number from 1 and the
> generic Internet URL syntax sets the precedent that paths punctuate with
> '/'.
> If we simply extrapolate from these two boundary conditions, we
> get a part URL more or less like this:
>   midcidurl ::= "mid:" //host-net-path/message-unique part-number
>   part-number ::= *( / decimal-integer )
> where the interpretation of the decimal integers is as follows:
>   For each level of hierarchy (defined separator)
>   0 refers to the unPart before the first opening-separator
>   1 refers to the Part after the first opening-separator
>   ... ; there are N parts and N opening-separators
>   n > N refers to the unPart after the closing-separator
>   opening-separator ::= "--" separator-key
>   closing-separator ::= "--" separator-key "--"
> 4. To retrieve an object by its Content-ID, the usage
>   cidurl ::= "mid:" //host-net-path/message-unique?part-designation
>   part-designation ::= part-unique [@host-path-if-different]
> where
>   Content-ID == <part-unique@(host-net-path | host-path-if-different)>
>   ; and I have not addressed encoding problems
> is more consistent with general Internet URL usage than
> introducing the Content-ID with '#'. In particular, for the
> general pattern of URLs, the #fragment clause makes no difference
> in the object that is served, only its state as presented.
> Since the retrieval of a part by Content-ID only needs to get the
> content of the part, the ? syntax which supports searching and can
> affect the scope of the object retrieved is more consistent.
> This usage would establish, as a rule under the mid: scheme, that
> searching defaults to matching part-designation as constructed
> above.

The problem with all this is that this basic assumption is flawed:

    We could index into this part tree with part numbers.

This assumes that from the time the message is composed to the time it
is received and put in a message store nothing happens that could
perturb the part structure. Examples abound that contradict this.
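
To make the objection concrete, here is a minimal sketch using Python's
standard email package. It is an illustration only, not part of either
proposal: the helper names part_by_path and part_by_content_id, the
example.invalid Content-ID, and the use of an unsigned multipart/signed
container as a stand-in for a security wrapper are all assumptions made
for the example.

```python
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def part_by_path(msg, path):
    """Follow 1-based part numbers down the multipart tree (hypothetical helper)."""
    node = msg
    for number in path:
        children = node.get_payload()
        # A leaf part has a string payload, so it cannot be indexed further.
        if not isinstance(children, list) or not 1 <= number <= len(children):
            return None
        node = children[number - 1]
    return node


def part_by_content_id(msg, content_id):
    """Scan the whole tree for a part labelled with this Content-ID (hypothetical helper)."""
    for part in msg.walk():
        if part.get("Content-ID") == content_id:
            return part
    return None


NOTES_CID = "<notes.1@example.invalid>"  # made-up label for the example

# The message as composed: multipart/mixed with two text parts.
original = MIMEMultipart("mixed")
intro = MIMEText("See the attached notes.")
notes = MIMEText("Meeting notes go here.")
notes["Content-ID"] = NOTES_CID
original.attach(intro)
original.attach(notes)

# A later hop adds an encapsulation layer. No real signature is computed;
# the container only mimics the shape a security service might add.
wrapped = MIMEMultipart("signed")
wrapped.attach(original)
wrapped.attach(MIMEText("placeholder signature"))

# Part number 2 pointed at the notes before wrapping, but not after.
print(part_by_path(original, [2]) is notes)
print(part_by_path(wrapped, [2]) is notes)
# The label still finds the same part in both trees.
print(part_by_content_id(original, NOTES_CID) is notes)
print(part_by_content_id(wrapped, NOTES_CID) is notes)
```

In this sketch the composer would have had to emit the path 1/2 rather
than 2 to survive the wrapper, which means knowing in advance what a
later hop will add. The label lookup needs no such knowledge.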

Some examples:

(1) Security services may operate after the message is composed (in fact
they have to) and may add additional encapsulation layers. Working
around this leads to a hideous interaction where the agent constructing
the core message has to know in advance what structure the security
service is going to add. In addition, it presupposes that the agent
constructing the message knows whether or not the security wrapper will
last throughout the life of the message. There is no way it can know
this, since a receiver might elect to strip wrappers before storing or
might elect not to, and might change its mind from time to time.

(2) Transit through non-MIME systems may not preserve message structure.
The most blatant example of this is most of the LAN email systems, which
can only handle a series of unadulterated parts. Now, one can argue that
content-ids won't be preserved either. In many cases this is true, but
examples do exist of systems that are capable of preserving content-id
information without preserving part structure.

(3) MIME-MIME conversion gateways are becoming increasingly common. Such
facilities preserve structure as a whole, but may elect to add parts,
delete parts, and turn what used to be a single part into a multipart
structure (usually multipart/alternative). We provide one of these in
our products, as a matter of fact. The current version preserves
structure unconditionally, and would that it could have stayed that way.
Unfortunately the advent of various systems that produce weird
structures or bogus parts has forced us to implement structural
manipulation primitives, and I'm sure other vendors will find themselves
forced to do similar things.

(4) An unfortunate reality of some present-day X.400 and MIME agents is
an inability to handle nested messages. (At least two popular MIME
agents and several popular X.400 systems have this problem.
How the X.400 systems managed to pass the conformance tests they claim
to have passed and not support this is beyond me, but that's another
story for another day.) This leads to situations where agents have no
choice but to flatten out unnecessary message levels. This is one I'm
especially aware of because not only do we deal with the broken agent,
we also handle another agent that sends absolutely everything using
nested message structures. Go figure. This can be accommodated by using
nested-message-relative numbering (which you want anyway in order to
allow forwarding without scanning the entire message and mucking with
its content), but it's still a problem.

(5) Forwarding of messages by user agents in some cases disrupts the
part structure. This can be intentional, when for example a user deletes
a part from a forwarded message, or it can be unintentional, where
agents simply don't handle forwarding properly.

I can dig up more examples, but I think the point is clear -- messages
are malleable things, and numbering schemes simply don't work very well
with them. Labels, on the other hand, do.

BTW, this discussion is pretty similar to the arguments for using line
or byte counts instead of boundary markers for multipart structures. We
elected to use a labelling approach there because of the malleability of
messages, and the same logic applies here as well.

				Ned

Received on Tuesday, 21 November 1995 15:32:03 UTC