- From: Dan Sugalski <sugalsd@lbcc.cc.or.us>
- Date: Fri, 05 Sep 1997 11:57:54 -0700
- To: Albertfine@aol.com, www-html@w3.org
- Cc: sugalsd@lbcc.cc.or.us
At 01:42 PM 9/5/97 -0400, Albertfine@aol.com wrote:
>sugalsd@lbcc.cc.or.us (Dan Sugalski) wrote:
>
>>If you're going to maintain any sort of backward compatibility with current
>>HTML, I think you'll find that any error-free description of the data on a
>>page won't be much, if any, smaller than the actual data itself. Also,
>
>A text description could be compressed. It would probably be a series of
>numbers. If for example, a pattern repeats such as a row of three ones you
>can represent it with a single number. There are other ways to compress the
>description. Remember, this is being done by the HTML editor.

You're not going to get significant enough compression to make it
worthwhile. Adding even 30% overhead to a web page to make it render
faster is not worth it, and I think you'll find it difficult to get much
smaller than that and still stay within the limits of a
less-than-seven-bit character set. (Honestly, since you're looking for
accurate rendering, you're going to have to have an intimate knowledge of
the font metrics used to render the page, and you just can't have that at
page-generation time. A truly accurate mockup of the page is an
impossibility, alas.)

I think the assumption that this is going to be done exclusively by HTML
editors is something that's going to have to get jettisoned. If you work
on that assumption then you might as well pack it in now, since you'll
never get a significant enough user-base to get any of the significant
browser makers to bother with it. Also, from what I've seen, you're
counting on tools and info that aren't easily available to the majority of
the engines generating web pages, i.e. CGI scripts and database-driven
pages.

>>since you're going to need to use font metrics to accurately make those
>>descriptions, you're going to end up with an inaccurate description for a
>>significant segment of the clients viewing your page.
>
>Font degradability is a big problem. It is almost a printing problem.
>This is really not my field. I think panose is the best solution. A base
>font could solve other problems. A database with a basic description of
>fonts would also help. These solutions would not be complete but not as
>inaccurate as you say.

Actually, it *is* a printing problem, and it would be as inaccurate as I
say. Worse, really. When printing, you can make some very valid page size
assumptions (8.5x11 or A4 paper) which you can't make for browsers. A web
page in 12 point Times is going to have a significantly different layout
in a 400x400 window and a 1200x900 one.

That pretty much leaves you describing individual words and the rectangle
they take. English has an average word size of less than six characters.
Do you really think you can describe each word in an average of fewer
than 2 characters? (Don't forget you'll need both width and height,
because you can't assume that font metrics won't change from word to
word, or even character to character.) And you did specify you were
shooting for lossless, and you just can't have that.

Now, if you shoot for 'mostly accurate' and take a good guess at standard
Times metrics, your assumption will probably be valid, or close enough,
for several varieties of Times on different platforms, and so for most
(60-80%) of the people viewing the page. OTOH, 20-40% of the people
*won't* be able to use your assumptions, thus incurring the overhead of
downloading a data description they can't use.

>Of course, this only describes regular text. What about weird things like
>HTML math? I am working on a global description of a text. The main problem
>seems to be that it breaks at a variety of points. It really becomes a
>description of a series of squares and rectangles. The description is not
>meant to only describe text. So I am considering other ways of describing a
>variety of data. I am sorry if it seems incomplete but I am working on it
>right now.

By all means, work it out, it's a good exercise.
I think, unfortunately, you'll find that the increase in size your
additions make will end up slowing down the ultimate display of the page
enough to make it counterproductive.

					Dan

----------------------------------------"it's like this"-------------------
Dan Sugalski                   (541) 917-4364   even samurai
Programmer/SysAdmin                             have teddy bears
Linn-Benton Community College                   and even the teddy bears
sugalsd@lbcc.cc.or.us                           get drunk
Received on Friday, 5 September 1997 14:53:55 UTC
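
The size argument in the message above can be checked with a little arithmetic. What follows is only a rough sketch under stated assumptions, not anything proposed in the thread: it reads the "less than six characters" and "30%" figures as a per-word budget, and it invents a naive "WIDTHxHEIGHT" decimal encoding with a fixed 7-pixel character width and 14-pixel line height purely for illustration. The function names are hypothetical.

```python
def per_word_budget(avg_word_chars: float, max_overhead: float) -> float:
    """Characters available to describe one word's rectangle.

    avg_word_chars: average word length (the message says under six).
    max_overhead:   tolerated size increase as a fraction (0.30 = 30%).
    """
    return avg_word_chars * max_overhead


def naive_overhead(text: str, char_width: int = 7, line_height: int = 14) -> float:
    """Size of a naive per-word rectangle description relative to the text.

    Each word becomes "WIDTHxHEIGHT" in decimal pixels, joined by ";".
    The fixed char_width and line_height are illustrative assumptions;
    real font metrics vary by font, platform, and even character.
    """
    words = text.split()
    description = ";".join(
        f"{len(word) * char_width}x{line_height}" for word in words
    )
    return len(description) / len(text)


if __name__ == "__main__":
    budget = per_word_budget(6.0, 0.30)
    print(f"budget per word: {budget:.1f} characters")

    sample = "a web page in twelve point times"
    print(f"naive description overhead: {naive_overhead(sample):.0%}")
```

Under these assumptions the budget comes out just below two characters per word, while even a single "WxH" pair needs at least three characters before any separator, which is the gap the message points at. A cleverer encoding would shrink the description, but it would still have to carry both dimensions for every word.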