Re: HTML Streaming

sugalsd@lbcc.cc.or.us (Dan Sugalski) wrote:

>You're not going to get significant enough compression to make it
>worthwhile. Adding even 30% overhead to a web page to make it render faster
>is not worth it, and I think you'll find it difficult to get much smaller
>than that and still stay within the limits of a less-than-seven-bit

Right now, I am not concerned with the degree of compression. I agree that
it seems inefficient, but the global description is not even complete.
There are more things to consider than text. The description may be very
different from what it is now and may compress better. Even if it is not an
error-free description, it can still be of use. The W3, unlike the members
of this list, thinks that things should at least be tried to see whether
there are any benefits. Take the width attribute of the pre element. It
uses a character-only description.

   width = integer

          This attribute provides a hint to visual user agents about the
          desired width of the formatted block. The user agent can use
          this information to select an appropriate font size or to
          indent the content appropriately. The desired width is
          expressed in number of characters. This attribute is not widely
          supported currently.

Btw, you wouldn't need this attribute if events existed. If a better
description came along, such as characters and spaces, it wouldn't need to
be changed or wait for support. This is one of the benefits of events. The
call element could also help here.
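
As a rough sketch of how a user agent might act on the width hint quoted
above, the function below picks a font size so that the requested number of
characters fits the space available. The function name, the 0.6 advance
ratio (character advance as a fraction of font size, typical of monospace
fonts but an assumption here) and the size limits are hypothetical; none of
them come from the HTML specification.

```python
def font_size_for_pre(width_chars, available_px,
                      advance_ratio=0.6, min_px=6.0, max_px=16.0):
    """Pick a font size in pixels so width_chars characters fit available_px.

    Assumes a monospace font whose character advance is
    advance_ratio * font_size. All defaults are illustrative guesses.
    """
    if width_chars <= 0:
        # No usable hint: fall back to the largest allowed size.
        return max_px
    # Solve width_chars * advance_ratio * size = available_px for size.
    size = available_px / (width_chars * advance_ratio)
    # Clamp so the text never becomes unreadably small or needlessly large.
    return max(min_px, min(max_px, size))


if __name__ == "__main__":
    # An 80-character block in a 480-pixel-wide area.
    print(font_size_for_pre(80, 480))
```

The same calculation could be run before any of the text arrives, since it
needs only the hint and the window width.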

>character set. (Honestly, since you're looking for accurate rendering,
>you're going to have to have an intimate knowledge of the font metrics used
>to render the page, and you just can't have that at page-generation time. A
>truly accurate mockup of the page is an impossibility, alas)

You can calculate the font metrics from the Panose number. If it is the
base font, they can also be calculated before the text is actually
downloaded.

>I think the assumption that this is going to be done exclusively by HTML
>editors is something that's going to have to get jettisoned. If you work on
>that assumption then you might as well pack it in now, since you'll never
>get a significant enough user-base to get any of the significant browser
>makers to bother with it.

I imagine it could be a program separate from the HTML editor, similar to
an assembler. The program would take editor-written or hand-written code,
add pre-rendering attributes, organize text, etc. I think HTML editors are
a significant part of the user base. One comes with every copy of
Communicator. HTML is becoming more complicated, and using Notepad or
BBEdit is becoming very inefficient. Simply editing an HTML file is one of
the considerations behind style sheets. I think users will consider its use
for its benefit and not for where it is available.

>Also, from what I've seen you're counting on tools and info that's not
>easily available to the majority of the engines generating web pages, i.e.
>CGI scripts and database-driven pages.

It would not be impossible to do, though. Even an incomplete set of events
would still be of some help to the browser. For example, starting the Java
compiler would be of help to many modular browsers.

>Actually, it *is* a printing problem, and it would be as inaccurate as I
>say. Worse, really. When printing, you can make some very valid page size
>assumptions (8.5x11 or A4 paper) which you can't make for browsers. A web
>page in 12 point times is going to have a significantly different layout in
>a 400x400 window and a 1200x900 one.

Technically it is a display problem. Font degradability is only classically
a printing problem. I thought looking at it from various perspectives would
help. It really makes no difference whether the window is 400x400 or
1200x900. The browser would know how much room it has for display, much as
the printer knows the size of the paper. The only real difference is that
it may break at any particular point. In printing, this is not really a
concern because the information is already displayed.
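
To make the window-size point concrete, here is a minimal sketch of a
browser computing line breaks once it knows its own width. It assumes a
fixed character width of 7 pixels and simple greedy wrapping; both are
simplifications for illustration, and the function name is hypothetical.

```python
def count_lines(text, window_px, char_px=7):
    """Count the lines needed to show text in a window window_px wide.

    Assumes every character is char_px pixels wide (a simplification)
    and breaks lines greedily at spaces.
    """
    max_chars = max(1, window_px // char_px)
    lines = 0      # completed lines so far
    current = 0    # characters used on the line being filled
    for word in text.split():
        # A word joining a non-empty line needs one extra space.
        needed = len(word) if current == 0 else current + 1 + len(word)
        if current and needed > max_chars:
            lines += 1            # close the current line
            current = len(word)   # the word starts a new line
        else:
            current = needed
    # Count the final, partially filled line if there is one.
    return lines + (1 if current else 0)


if __name__ == "__main__":
    sample = "word " * 100
    for width in (400, 1200):
        print(width, count_lines(sample, width))
```

The layout differs between the two widths, but each result is computed from
information the browser already has, which is the comparison with the
printer knowing its paper size.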

>That pretty much leaves you describing individual words and the rectangle
>they take. English has an average word size less than six characters. Do
>you really think you can do it in an average less than 2? (Don't forget
>you'll need both width and height because you can't assume that font
>metrics won't change from word to word, or even character to character)

This is only one way of describing text. It fails with the addition of more
complicated HTML such as HTML math. The description that I am developing
now works from its own database and describes a larger variety of data.
Again, it may not be error free, but it would be of some benefit.
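
For reference, the arithmetic behind the quoted objection can be sketched
as follows. The figures are the ones quoted in this thread (about six bytes
per word and a 30% overhead ceiling); the function names are hypothetical,
and the per-word scheme is the one described in the quote, not the
description I am developing.

```python
def max_bytes_per_word(avg_word_bytes=6.0, overhead_budget=0.30):
    """Largest per-word description that stays within the overhead budget."""
    return avg_word_bytes * overhead_budget


def overhead_fraction(bytes_per_word, avg_word_bytes=6.0):
    """Overhead added by a per-word description, as a fraction of text size."""
    return bytes_per_word / avg_word_bytes


if __name__ == "__main__":
    # Budget per word under the quoted figures.
    print(round(max_bytes_per_word(), 2))
    # Overhead if each word carries a 2-byte description.
    print(round(overhead_fraction(2.0), 3))
```

Under these figures the budget is just under two bytes per word, which is
where the "average less than 2" in the quoted message comes from.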

>And you did specify you were shooting for lossless, and you just can't have
>that. Now, if you shoot for 'mostly accurate' and take a good guess at
>standard Times metrics, your assumption will probably be valid, or close
>enough for several varieties of times on different platforms, for most
>(60-80%) of the people viewing the page. OTOH, 20-40% of the people *won't*
>be able to use your assumptions, thus incurring the overhead of downloading
>a data description they can't use.

If you assume all the fonts are described with the Panose matching system,
this would not be a problem.

>By all means, work it out, it's a good exercise. I think, unfortunately,
>you'll find that the increase in size your additions make will
>end up slowing down the ultimate display of the page enough to make it
>counterproductive.

You take a very pessimistic view of this. It may have been the way I
introduced it: very rough and in the developmental stage. Too late :) I
think events already show numerous benefits. I just finished checking the
load times of various plugins and compilers against the download time of
the HTML file, the placement of the commands, and browser shell commands.
There are numerous benefits to the program and even the OS.

Albert Fine

Received on Monday, 8 September 1997 07:55:09 UTC