- From: Martin J. Duerst <mduerst@ifi.unizh.ch>
- Date: Sat, 8 Mar 1997 15:33:07 +0100 (MET)
- To: "M.T. Carrasco Benitez" <carrasco@innet.lu>
- cc: lee@sq.com, unicode@unicode.org, www-international@w3.org
On Sat, 8 Mar 1997, M.T. Carrasco Benitez wrote:

> > As of the definition in RFC 2070, the exact meaning of <HTML LANG=xxx>
> > is that everything not marked to be in any other language is xxx.
> > This can range from the whole document being in xxx to documents
> > that contain not a single word in xxx. The latter case does not
> > make much sense in practical terms, but is perfectly legal
> > according to RFC 2070.
>
> Yes. But does it make sense to give some more "semantics" to this
> syntax?

It looks like you are giving more semantics, but you are actually
changing semantics. Currently, if I write <HTML LANG=en>, this does
not mean that the document is monolingual, or that the document is
more than 50% English, or whatever. Changing semantics is much more
of a problem than adding semantics.

> > A general comment:
> >
> > As we have seen in this discussion up to now, there are many
> > different needs for language information about documents.
> >
> > Proposals for one specific interpretation of one already
> > well-defined way to indicate language in a HTML document,
> > to satisfy one specific information need that appeared at
> > one place are not a long-lasting approach to solving the
> > information needs we have.
> >
> > I would suggest to attack the problem in a wider frame,
> > e.g. to look at Metadata (DC or other) and see how this
> > can be used to satisfy the various needs already expressed
> > and the many more that will appear in the future.
>
> Does the approach in the present draft, "Natural language
> marking in HTML", make sense, or should we approach it from another angle?
>
> I am aware that the proposal is very limited: just a clarification of the
> existing syntax and some additional "semantics", and even so one can see
> the hard work for consensus. I am concerned that a more "revolutionary"
> approach would not work.
The problem with consensus comes mainly from the fact that the
proposal is very limited, and that many people don't see much of
a benefit in it.

Basically, as far as I understand you, you want some mechanism by
which monolingual documents identify themselves as monolingual, in
a self-contained way, in a single form, and in a form that can
easily be accessed by servers and used by browsers. Some
technicalities of how that could be done in a uniform way have
been discussed. But it is not clear to me why monolingual
documents in particular would need some specific identification
(I can imagine there are applications where this could be used),
or why this should be so much more important than other
identification needs that we have to treat it with the special
attention it has received recently.

Also, a more "revolutionary" approach has the advantage that you
don't have to interfere with existing semantics, and so it's
easier to find a solution that is widely acceptable.

Regards,
Martin.
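
To make the RFC 2070 reading discussed above concrete, here is a minimal sketch in Python of LANG acting only as a default for content not marked otherwise. The names LangResolver and effective_languages are hypothetical, chosen for illustration, and the resolver is a simplification (it ignores HTML's optional end tags and language sources other than the LANG attribute); it is not taken from RFC 2070 or from any draft.

```python
from html.parser import HTMLParser

# Elements that never have an end tag, so they must not be pushed on the stack.
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}


class LangResolver(HTMLParser):
    """Collect (language, text) pairs, inheriting LANG from the nearest ancestor."""

    def __init__(self):
        super().__init__()
        self.stack = []  # (tag, effective language) for each open element
        self.runs = []   # (effective language, text) for each text run

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        inherited = self.stack[-1][1] if self.stack else None
        # An explicit LANG overrides the inherited value; otherwise inherit.
        lang = dict(attrs).get("lang") or inherited
        self.stack.append((tag, lang))

    def handle_endtag(self, tag):
        # Pop back to the matching open element; ignore stray end tags.
        for i in range(len(self.stack) - 1, -1, -1):
            if self.stack[i][0] == tag:
                del self.stack[i:]
                break

    def handle_data(self, data):
        text = data.strip()
        if text:
            lang = self.stack[-1][1] if self.stack else None
            self.runs.append((lang, text))


def effective_languages(html):
    """Return the effective language of every non-empty text run."""
    parser = LangResolver()
    parser.feed(html)
    parser.close()
    return parser.runs


# A document declared LANG=en that contains not a single English word.
doc = """<HTML LANG=en><BODY>
<P LANG=fr>Bonjour le monde.</P>
<P LANG=de>Guten Tag.</P>
</BODY></HTML>"""

for lang, text in effective_languages(doc):
    print(lang, text)
```

Under these assumptions the loop prints "fr Bonjour le monde." and "de Guten Tag.": the top-level LANG=en never applies to any text, which is the "perfectly legal" edge case mentioned in the quoted passage, and it shows why <HTML LANG=en> alone says nothing about whether a document is monolingual.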
Received on Saturday, 8 March 1997 09:33:55 UTC