- From: John Udall <jsu1@cornell.edu>
- Date: Mon, 14 Apr 1997 09:18:09 -0400
- To: www-html@w3.org
At 04:54 PM 4/13/97 +0300, Jukka Korpela <jkorpela@cc.hut.fi> wrote:
>On Fri, 11 Apr 1997, Terje Norderhaug wrote:
>
>> A better idea would be a shared dictionary of words and how they are
>> splitted residing on the network.
>
>Hyphenation is a strongly language-dependent issue, so what we basically
>need is support to different languages (including the HTTP level features
>and the proposed LANG attribute at the HTML level). For example, in
>English documents hyphenation is usually not desirable, whereas in
>Finnish documents it is often crucial for good-quality presentation
>since words are often very long; and in Finnish hyphenation can mostly
>be done on algorithmic basis (without dictionaries), and if high-quality
>hyphenation is desired, one really needs program which performs
>morphological analysis (in addition to using a dictionary). Most languages
>are probably somewhere between, but one should _not_ assume that
>dictionaries or explicit hyphenation by authors are the universally
>correct approach.
>

Jukka makes a very important point here. Hyphenation is very strongly language-dependent. Finnish can be hyphenated on an algorithmic basis. Some other languages can be hyphenated on the basis of rules, as Abigail mentioned. Others must use a dictionary, because there just aren't any rules that cover all the cases.

At 06:12 PM 4/11/97 -0400, Paul Prescod <papresco@calum.csclub.uwaterloo.ca> wrote:
>How often do you read documents in more than 2 or 3 languages? How many
>do you speak? Why not just download and cache those dictionaries?

Hyphenation should be tied to the LANG attribute rather than to an entire document. Many people, students for example, work with documents containing 2, 3 or even 4 languages. I speak 4 languages: French, German, Czech and English. This past year, when I was studying Czech, our textbook had examples in both Czech and Russian with explanations in English. I don't think that this situation is at all uncommon.
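
To make the per-language idea concrete, here is a rough sketch in Python of how a client might pick a hyphenation strategy from the language tag of each span. Everything in it is hypothetical: the function names, the two-entry English table, and the Finnish rule, which is a deliberately crude simplification (break before a consonant that is followed by a vowel) and not a real hyphenation algorithm. It ignores compound words and vowel sequences entirely.

```python
from typing import Callable, Dict, List

FINNISH_VOWELS = set("aeiouyäö")

# Toy English table: word -> break offsets (illustrative entries only).
EN_BREAKS: Dict[str, List[int]] = {
    "hyphenation": [2, 6],   # hy-phen-ation
    "dictionary": [3, 6],    # dic-tio-nary
}


def split_at(word: str, offsets: List[int]) -> List[str]:
    """Cut `word` into pieces at the given character offsets."""
    pieces, start = [], 0
    for offset in offsets:
        pieces.append(word[start:offset])
        start = offset
    pieces.append(word[start:])
    return pieces


def hyphenate_en(word: str) -> List[str]:
    """Dictionary strategy: unknown words are left unbroken."""
    return split_at(word, EN_BREAKS.get(word.lower(), []))


def hyphenate_fi(word: str) -> List[str]:
    """Rule strategy (simplified): break before consonant + vowel."""
    lower = word.lower()
    offsets: List[int] = []
    start = 0
    for i in range(1, len(lower) - 1):
        is_cv = lower[i] not in FINNISH_VOWELS and lower[i + 1] in FINNISH_VOWELS
        # Only break if the piece so far already contains a vowel.
        has_vowel = any(c in FINNISH_VOWELS for c in lower[start:i])
        if is_cv and has_vowel:
            offsets.append(i)
            start = i
    return split_at(word, offsets)


# Assumed mapping from primary language tag to strategy.
STRATEGIES: Dict[str, Callable[[str], List[str]]] = {
    "en": hyphenate_en,
    "fi": hyphenate_fi,
}


def hyphenate(word: str, lang: str) -> List[str]:
    """Choose a strategy by language tag; no strategy means no breaks."""
    strategy = STRATEGIES.get(lang.lower().split("-")[0])
    return strategy(word) if strategy else [word]
```

A browser-like client could then join the pieces with a soft hyphen (`"\u00ad".join(hyphenate(word, lang))`) and let the line breaker decide which break points to use at the current window width. A span tagged with a language that has no registered strategy is simply left alone.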

The point is that hyphenation is something that, for the most part, should be carried out on the client side and should be automated. This is especially true for users of windowing browsers, who can resize the window to a different width on the fly. That requires re-hyphenating words, preferably without forcing a reconnection to the server and a reload of the page across the network.

Users need to be able to force hyphenation, using the &shy; entity for example. They also need to be able to turn off hyphenation for certain spans of text, such as URLs or poetry.

Better support for entities, and for internationalization in general, from the browser manufacturers would be nice. It is the World Wide Web, after all.

-John

>Yucca, http://www.hut.fi/~jkorpela/
>
>

John Udall, Programmer/Sys. Admin.
Extension Electronic Technologies Group (EETG),
Cornell Cooperative Extension,
40 Warren Hall
Cornell University, Ithaca, New York 14853
(607) 255-8127
jsu1@cornell.edu
Received on Monday, 14 April 1997 09:18:54 UTC