- From: Martin J. Duerst <mduerst@ifi.unizh.ch>
- Date: Thu, 16 Jan 1997 19:13:00 +0100 (MET)
- To: "M.T. Carrasco Benitez" <carrasco@innet.lu>
- cc: www-international@www10.w3.org
On Tue, 14 Jan 1997, M.T. Carrasco Benitez wrote:

> 1) Defining a nomenclature that allows for translation cost little to
> HTTP and could be very useful in translation. Example:
>
> it-ht (Italian, human translation)
> it-mt (Italian, machine translation)

This may be nice in some cases. But it should not be mandatory.

> 2) The response to a translation request by machine or human would not be
> instantaneous. Further work would be needed for longer transactions,
> probably applicable to other fields.

Does HTTP have something like a delay? Can a server/proxy send back
"not available, but possibly available in 2 weeks"?

> 3) "q" should be the "quality of the linguistic version" and not the
> "user's preference for the language" (HTTP/1.1). Example
>
> q=1 Translated by a human Master Translator
> q=0.5 Translated by a human Novice Translator
> q=0.49 Translated by a machine Master Translator

The q for the documents is the quality of the documents. The q on
Accept-Language is the preference of the user.

> 4) A standard nomenclature for "q" is need. For example, less than 0.5
> is machined translations.

This would be highly counterproductive. The only restrictions we have
come from the fact that we work with multiplication. First, 0 is
absolutely zero: while there may be some need to specify a preference
with q=0.0, a document with q=0 does not make any sense. Second, a
server has to scale the q values used for the calculation in a single
query so that they fit into the 0..1 range. (Note that scaling over
the whole server is not needed; q values are relative to queries.)

As an example, consider (with server-wide scaled q values) a server
that only serves weather reports. Original weather reports will have
q=1.0, translated ones maybe q=0.9, because machine translation of
weather reports is quite reliable.

On the other hand, consider a server for literary works. It will rate
the literary works (novels, poems, ...) themselves as 1.0 (maybe not
all of them :-), the general pages, lower in quality, maybe as 0.7,
translations of general pages as 0.3, and machine translations of
literary works maybe even as 0.1.

> 5) The Accept-Language should be a ordered "preference list". There is no
> need to quantify the preference of the user.

Quite the contrary. If, for example, you know English, German, and
Japanese, how do you express whether you know Japanese almost as well
as the others, or just a little bit? Depending on that, the documents
you would prefer to be returned can differ greatly.

The problem with q on Accept-Language is privacy. One part of this
problem is identification with some language minority, which can
happen independently of q factors. The other is click tracing. For
this, in certain cases even just the set of languages provides enough
information.

To alleviate the problem of click tracing and privacy, in addition to
the provisions in the HTTP specs, it might be a good idea to agree to
restrict the q values set by browsers to a limited set (e.g. 1.0, 0.8,
0.6, 0.4, 0.2, 0.0). This will allow a wide expression of relative
preferences, while it will avoid click tracing on something like "the
guy that has Japanese at 0.4586794".

Regards, Martin.
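
The reply to point 4 rests on multiplying the q of each document
variant by the user's Accept-Language q, and the last paragraph
suggests restricting browser q values to a small set. A minimal
sketch of both ideas follows. The function names, the user
preference values, and the tie-breaking rule (first variant wins) are
assumptions made for illustration; only the document q values for the
weather and literary servers come from the message above.

```python
def choose_variant(variants, accept_language):
    """Pick the language with the highest (document q * user q) product.

    variants:        {language: document quality q in 0..1}
    accept_language: {language: user preference q in 0..1}
    Returns None when every product is zero (nothing acceptable).
    """
    best_lang, best_score = None, 0.0
    for lang, doc_q in variants.items():
        # A language the user did not list is treated as q=0 here
        # (an assumption of this sketch).
        score = doc_q * accept_language.get(lang, 0.0)
        if score > best_score:
            best_lang, best_score = lang, score
    return best_lang


def quantize_q(q, step=0.2):
    """Round a user q to the nearest multiple of `step`, clamped to 0..1."""
    levels = round(1 / step)
    return max(0, min(levels, round(q * levels))) / levels


# Hypothetical user: prefers German, reads English reasonably well.
user = {"de": 1.0, "en": 0.6}

# Weather server: translated reports are nearly as good as the original.
weather = {"en": 1.0, "de": 0.9}
# Literary server: machine translation of a novel is rated very low.
literature = {"en": 1.0, "de": 0.1}

print(choose_variant(weather, user))     # German translation wins
print(choose_variant(literature, user))  # English original wins
print(quantize_q(0.4586794))             # snapped to the 0.2 grid
```

The same user preferences produce different answers on the two
servers, which is why a fixed meaning such as "below 0.5 is machine
translation" would get in the way: only the products within a single
query matter.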
Received on Thursday, 16 January 1997 13:13:23 UTC