- From: Jukka Korpela <jkorpela@cc.hut.fi>
- Date: Mon, 12 May 1997 08:42:50 +0300 (EET DST)
- To: "Martin J. Duerst" <mduerst@ifi.unizh.ch>
- cc: Jukka Korpela <jkorpela@cc.hut.fi>, www-html@w3.org, unicode@unicode.org
On Sun, 11 May 1997, Martin J. Duerst wrote:

> - - There seems indeed to be a misunderstanding. It could
> have been resolved if you had contacted us (the authors of RFC 2070)
> directly instead of just writing lengthy web pages. - -

As you may have noticed, I took a look at the published material,
RFC 2070, as well as some other material which disagrees with it.
I did not personally contact any of the people with different views.
This issue seems to require public discussion. If it were my
misunderstanding, it certainly wouldn't be mine only.

I didn't know this was such a hot potato. My starting point was an
easily observable disagreement and confusion: the existence of
mutually incompatible claims about soft hyphen, from usually
well-informed sources.

> Your understanding of the character U+00AD as a code that is always
> visible is based on one sentence in section 6.3.3 of ISO 8859-1:
>
> A graphic character that is imaged by a graphic symbol identical
> with, or similar to, that representing HYPHEN, for use when
> a line break has been established within a word.

That "one sentence" is the one and only sentence in the standard
which describes the appearance and purpose of soft hyphen.

Well, there _is_ another sentence, which is remotely related. It is
the second sentence in section 7 and says: 'None of these characters
is "non-spacing"'. Can you possibly interpret it in a manner which
contradicts the "one sentence" above? Or could you interpret the
definition of "graphic character" (in 5.5) so that we should ignore
the words "has a visual representation normally handwritten, printed
or displayed"?

> Now it is rather strange that one should have two hyphens (HYPHEN-MINUS
> and SOFT HYPHEN) that are always visible. - -

I am not aware of (still less responsible for) the motivation behind
selecting the character repertoire. I could also say that it is
rather strange that one should have two blanks (SPACE and NO-BREAK
SPACE). Anyway, speculative questions shouldn't make us ignore the
written specification, which is quite clear.

> Your main problem seems to be the meaning of the words "has been
> established". If at some source of text encoding (or on the server
> side in modern web technology), somebody establishes that there
> may be a line break - -

My English is far from perfect, but I think I do know the meaning of
"has been established". By your use of "may be", you seem to imply
the presence of a word like "possible" before the words "line break"
in the specification. Line breaks and possible line breaks within
words are quite different things (points where hyphenation has
actually taken place versus allowable hyphenation points).

> That the SHY is indeed only displayed if it turns up to lie at the
> end of a line of rendered text is further supported by the fact
> that ISO 10646 as well as the ISO/ECMA registrations and probably
> even the ISO-8859-1 original write "SHY" and not "-" in the
> appropriate location in the code charts.

Are you suggesting that a notational feature should be interpreted
so that it cancels a definite verbal statement in the prescriptive
part of the standard?

(I see little reason to wonder at the way soft hyphen is presented
in the chart. The specification says that the graphic symbol used to
image soft hyphen is identical with or similar to hyphen. To me, the
existence of two possible presentations well justifies the use of
symbolic notation in the chart.)

Yucca, http://www.hut.fi/%7ejkorpela/
Received on Monday, 12 May 1997 01:43:18 UTC