- From: Shelby Moore <shelby@coolpage.com>
- Date: Mon, 16 Dec 2002 23:46:07 -0600
- To: John Lewis <lewi0371@mrs.umn.edu>
- Cc: www-style@w3.org
>> The "correct" solution would be a CSS style. Whether it can be
>> implemented is another matter[...]
>
>I agree. If you don't use a markup language that determines what
>sentences are, you need something like a :sentence selector (or
>similar solution). But none of that matters until someone manages to
>create an algorithm to decide what a sentence is--that actually
>works--for most languages. Until that happens, it can't go in CSS.
>(*If* it happens--I'm not sure if it's even possible in English,
>especially if you include nonstandard English.)

My intuitive bet is that 90+% accurate algorithms probably already exist, even if we aren't aware of them. Natural language processing has apparently advanced a great deal since we were in an academic setting and keeping up with research in many diverse disciplines. My bet is that Microsoft has proprietary research (and plans to use it commercially), given the billions they spend on research and Gates's vision for tablets and for expanding the universe of what computers can do. For example, I know that when I was doing research on image quality metrics, Microsoft people were involved in some of the latest research.

I don't have time to go off on another tangent into natural language processing, but I do question the assumption here that computers cannot read grammar. Maybe they still cannot. More likely, it is an issue of resources, patents, etc.

Also, even with simplistic algorithms, even a 20% error rate might be an improvement over the current results, given that using a style is optional, and given that the worst case is a space that is slightly narrower or wider after some of the rarer constructs in English grammar that a simplistic parser mistakes for the end of a sentence. I can't speak for other languages, but I would assume Latin languages have similar grammar complexity.

On the one hand, I think a specification in CSS is needed to open the door to implementations (algorithms that might be out there or just around the corner).
On the other hand, I yield to the experts here on this list. I have only been here 2 days and I just wanted to make a simple suggestion. I am happy that so many are now aware of this issue, and felt strongly enough to think about it and comment. I think that in itself is a significant accomplishment.

>Today, you have a few solutions, none of which are very good.

Thanks. That is what I was trying to say.

> The only
>one that doesn't include content to force a presentation is by
>manually wrapping every sentence in span. And then you're adding tons
>of markup for a tiny stylistic issue.

I did not even consider that option, probably for fear of the impact it would have on search engine rankings.

>For these reasons, I think an end of sentence character is the best
>solution (and the only solution that could actually be put into
>practice without a huge amount of work).

I agree it is another option that might be useful.

> The big problem is that
>people won't use it--but I don't think it's wrong to deny people the
>ability to because the average person is wrong. Another problem is
>that it's extra work for the author--but as it shouldn't be required,
>that's not a big deal.

In the case of Cool Page, when a user enters a double space after a period, we know with reasonable accuracy that it is intended to mark the end of a sentence. So we could probably convert these to a single space with an EOS character. However, without the parser, we could do no better for users who type single spaces (most users). This is why I said I did not think it was the best idea. But as another option, maybe it is worthwhile.

>PS: Every typography book I've read has insisted (and my own
>experience leads me to believe) that two or more spaces after a
>sentence make text harder to read (primarily because it creates large
>white gaps in running text, and in some cases causes diagonal or
>vertical lines of white space inside running text).

My understanding is that Jakob Nielsen argues that on the web, usability takes precedence over other factors that are normal priorities in other forms of presentation. And one of his first claims (way back when the focus was on eliminating bad design) was that users _SCAN_ web pages. That is, users could not be expected to read something from start to finish unless they had first skimmed it to determine its relevance. How can you scan (speed read) if you can't quickly find the beginning of sentences? Yet he quotes typography experts on the issue of single or double space. IMO, that is an example of self-contradiction. What do typographers know about the web? Typography has a long history in metal layout. It only relatively recently moved to high resolution presentation (printing) devices.

> I don't think this
>is any less true on a low resolution device (like a cellphone or PDA).
>If it is, and redesigning the typeface can't help, then neither can
>CSS (as it cannot increase the resolution of a display device).

Then IMO you don't understand all typography and aliasing issues well. Notice from my signature that I was (minor) co-author and publisher of FONTZ!, one of the first ever WYSIWYG font editors in the world (on the GEM OS, before MS Windows). When designing a font for a low resolution device, there is a trade-off between tight kerning that looks reasonable with word spaces, and spaces between sentences that can be distinguished rapidly, especially when the period becomes a single pixel. The problem is mitigated by using fonts well designed for the web (Verdana, etc.), but many users prefer to use the plethora of poorly designed fonts available for free download on the web. Font use sells a lot of Cool Page, even if it is a crock, because we cannot (yet) guarantee the visitor has the font.

Also, there are many other issues that come into play, which I have mentioned already.
One issue that people on this list have conveniently ignored so far is accessibility for people who, like me, are blind in one eye or have another visual handicap.

Then there is just pure user preference. Why do we give the user the ability to set a yellow font in 3 point size on a white background? Because it isn't our job to decide what a user wants.

BTW, thanks for your well reasoned reply. I am flattered by the interest that was taken in this issue. I never expected it. I thought it would be ignored :)

-Shelby Moore
Received on Tuesday, 17 December 2002 00:45:32 UTC