Re: prefer-language tag

Mark Crispin (MRC@CAC.Washington.EDU)
Wed, 18 Feb 1998 14:16:51 -0800 (PST)


Date: Wed, 18 Feb 1998 14:16:51 -0800 (PST)
From: Mark Crispin <MRC@CAC.Washington.EDU>
Subject: re: prefer-language tag
In-reply-to: <3.0.3.32.19980218170410.007899bc@mail.viagenie.qc.ca>
To: Marc Blanchet <Marc.Blanchet@viagenie.qc.ca>
Cc: ietf-languages@apps.ietf.org, ietf-charsets@INNOSOFT.COM
Message-id: <MailManager.887840211.21492.mrc@Tomobiki-Cho.CAC.Washington.EDU>

On Wed, 18 Feb 1998 17:04:10 -0500, Marc Blanchet wrote:
> Well, I was thinking of registering this new header as per registry
> proposal in the drums wg: draft-ietf-drums-MHRegistry-03.txt .  So where is
> the violation in this context?

The problem is a "layering violation".

RFC 822 headers are data for the end user's MUA.  They are not for the use of
MTA protocols (such as SMTP or NNTP).

> The intent is to get this information (user preferred language) up to the
> end point, because in each transit (for example mail relays in the SMTP
> context), one of them can send back a message about the delivery of the
> message, or other administrativia,...   (This discussion is valid for
> email, but can be different for other protocols).  We need to forward that
> information up to the end point.  Going through SMTP negotiation means that
> if one server is not able to understand the new lang command, then this
> information is lost up to the end.

Yes, I figured that out.

You are quite correct in your observation of the problems with doing it at the
SMTP negotiation level.  Your proposal would be a very elegant solution; and a
solution is badly needed.  Everybody recognizes this.

Unfortunately, the layering violation is a show-stopper.  It would get vetoed
by the email protocol guys; and if not by them by the IESG.  This is not the
first time that this has happened to an elegant solution to a problem which
urgently needs solving.  So don't feel bad!

This is ground that many of us have already covered.  Doing it in the RFC822
header constitutes an unacceptable layering violation; doing it as an SMTP
command may cause the data to get lost along the way.  What we (this includes
you) need to do is see if there are any other alternatives besides these two.

You figured out the first level of the problem on your own, and I've updated
you on how far we've gotten on the second level.  Now let's see if we can come
up with any new insights for the third level.

> The intent was to make a standard tag, and then apply it to the various
> protocols, where the way it is done can be different. So, this draft was
> not only for SMTP.

That's a good concept and it is worth salvaging, even if the exact mechanisms
are discarded.

You may not have realized it, but you're creating a new concept.  It is not
like anything that currently exists, so it doesn't fit in with anything.

The new concept here is a "tag" that does not label its corresponding data;
nor does it provide other information for the recipient of the data.  Instead,
it provides control information for any new data that is to be returned to the
sender of the original data.

There isn't anything like this.  It's done by protocol-specific commands right
now.  Perhaps the time has come to use this type of tagging.

If you want to advance this concept, you need to build a framework for it in
the various protocols.  You can't use the existing frameworks, because your
concept doesn't fit with them.

> Well, I don't think this is an issue in my draft: yes this problem is
> difficult (technically speaking), but it has been discussed in RFC 1766,
> which my draft is refering too. I think this dialect issue is more related
> to RFC 1766 than my draft.

Unfortunately, it has to be considered anew with your concept since new issues
are raised.  RFC 1766 simply labels data; it does not apply user preferences.
So RFC 1766 had no need to worry about how the dialects interact; your
proposal does.

> Yeap you are right.  I had some ideas on this but I prefer to submit the
> basic idea before going into details.  There is always place for newer
> drafts and contributors! Would you contribute to this work?

Well, at the very least I can help with "reality checking" (as in telling you
what is likely to meet acceptance and what isn't).  I'm not yet convinced that
there is enough of an application for the concept, but time will tell; and it
might turn out to be the right thing.

The message I'm trying to convey is "you have the correct basic ideas; and the
concept of using a tagging architecture is interesting but as presented it's a
non-starter."  So, get rid of the presentation (using an RFC 822 header to
convey SMTP instructions) and see if it can advance in a different form.

I think that you should consider the question of being able to convey multiple
tag/value sets within the same token.  It is, for example, extremely common to
want to establish language, locale, and possibly also culture at the same
time.  A tagging architecture can be a benefit over commands if it can do
multiple tasks at a time.
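
As a rough illustration of several tag/value sets in one token, here is a
minimal Python sketch.  The syntax is entirely hypothetical (name=value pairs
separated by ";", with "," separating values in preference order); nothing
like it is defined in any draft, and the function name is made up for the
example.

```python
def parse_operation_tags(token):
    """Parse a hypothetical 'name=v1,v2; name2=v3' token into a dict.

    The syntax is an assumption made for illustration only.
    """
    tags = {}
    for part in token.split(";"):
        part = part.strip()
        if not part:
            continue  # tolerate empty segments such as a trailing ';'
        name, sep, value = part.partition("=")
        if not sep:
            raise ValueError(f"missing '=' in {part!r}")
        # Values keep their order, so the first one is the most preferred.
        tags[name.strip().lower()] = [
            v.strip() for v in value.split(",") if v.strip()
        ]
    return tags


if __name__ == "__main__":
    print(parse_operation_tags("language=FR-CA,FR,EN; locale=fr_CA"))
```

One token of this kind would establish language and locale together, which
is the advantage over a separate command for each.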

You need to expand on the idea of a tagging architecture for protocol
operation control information, because you'll be building a new framework.
There are a number of people who've done this sort of thing before who can
help you.  It'll help a lot to develop a *very* thick skin, because "the
pioneers always end up with arrows in their backs."

In the case of this particular tag, you absolutely must detail how it
interacts with dialects.  As an implementor, I insist upon it.  Without a
precise specification to fall back on, implementors are left to guess, and
that leads to user confusion and anger.

Here's what I think the behavior should be:
1) If the user requests a "generic" form of the language, it will match either
   a server's "generic" form or a dialect of the server's choosing.
2) If the user requests a specific dialect of the language, it will match
   either that dialect on the server or a generic form offered by the server,
   but *NOT* any other dialect.

Expressed as a table, we have the following (including a couple of surprises):

		    What appears in the tag:
		FR-CA,FR,EN	FR-CA,EN	FR-FR		FR
Server has:	-----------	--------	-----		-----
FR-CA,FR-FR,EN	FR-CA		FR-CA		FR-FR		FR-CA
FR-CA,FR,EN	FR-CA		FR-CA		FR		FR
FR-FR,FR,EN	FR		FR		FR-FR		FR
FR-CA,EN	FR-CA		FR-CA		EN (!)		FR-CA
FR-FR,EN	FR-FR		EN (!)		FR-FR		FR-FR
FR,EN		FR		FR		FR		FR
EN		EN		EN		i-default	i-default

The surprises, marked by "(!)", came about because the user requested a
dialect that the server did not have, and the server did not offer a generic
form.
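
Rules 1 and 2 can be sketched in Python roughly as follows.  This is only an
illustration under stated assumptions: the function names are made up, "a
dialect of the server's choosing" is taken to be the first matching dialect
the server lists, and the fallback when nothing in the tag matches is assumed
to be i-default.

```python
def split_tag(tag):
    """Return (language, dialect); dialect is None for a generic tag."""
    primary, _, sub = tag.upper().partition("-")
    return primary, (sub or None)


def choose_language(requested, offered, fallback="i-default"):
    """Pick the language a server would use for a preference list.

    requested: tags from the client, most preferred first.
    offered:   tags the server has, in the server's own order.
    fallback:  assumed result when nothing matches.
    """
    offered = [t.upper() for t in offered]
    for want in requested:
        want = want.upper()
        lang, dialect = split_tag(want)
        if want in offered:
            return want  # exact match, generic or dialect
        if dialect is None:
            # Rule 1: a generic request may match any dialect the
            # server chooses (assumed here: the first one it lists).
            for have in offered:
                if split_tag(have)[0] == lang:
                    return have
        elif lang in offered:
            # Rule 2: a dialect request may match the server's generic
            # form, but never a different dialect.
            return lang
    return fallback


if __name__ == "__main__":
    print(choose_language(["FR-CA", "EN"], ["FR-FR", "EN"]))  # a "(!)" case
```

Note that for a tag of FR-FR alone against a server with FR-CA,EN, this
sketch returns the fallback rather than EN, because EN does not appear in
the tag; what the server does when nothing matches is exactly the kind of
detail that has to be specified.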

But, although this is technically the most reasonable and flexible answer, it
is not immediately obvious to anyone.  In fact, the behavior appears wrong at
first glance.

That's why it has to be specified.  Or there will be user confusion and anger.

It also leads to the conclusion that servers SHOULD offer a generic form for
all the languages they offer.  That would eliminate the two rows in which there
are surprises.  Clients can also avoid the surprise by always requesting the
generic form as well.

> Can you point me on the drafts or proposals?

First level required reading is RFC 2130.

A typical example in the IMAP protocol is draft-gahrns-imap-language-00.txt.
This particular proposal has *NOT* been adopted in the IMAP world (nor has it
been rejected); it's a good proposal, but it doesn't deal with the dialect
interactions and it doesn't deal with the inevitable question of locale.

The IMAP protocol should not specify the dialect interactions; it should defer
to a higher standard and concentrate only on the IMAP syntax.  Unfortunately,
that higher standard does not exist.

Similarly, if there's a tagging type architecture for language and locale,
IMAP should probably do something like have an "OPERATIONTAGS" command that can
carry arbitrary tags rather than a LANGUAGE command.  That'll address the
locale question.

> I know this is not an easy solution, and yes my draft is not intended to be
> the solution for all.  It is a proposal. I would be willing to work on this
> issue with any other proposal.

Well, I hope that I've given you some new things to think about.

There are several documents that can come out of this discussion:

1) A tagged architecture for conveying protocol operation preferences (not
   just languages).
2) [What your document started out as] An expansion of RFC 1766 language tags
   to convey the concept of "preferred language", and how the preferences are
   prioritized (e.g. my dialect issue).
3) - n) How this framework is to be implemented in a particular protocol.

