Re: HTML Streaming

Peter Flynn (pflynn@imbolc.ucc.ie)
30 Aug 1997 22:38:54 +0100 (BST)


Date: 30 Aug 1997 22:38:54 +0100 (BST)
From: Peter Flynn <pflynn@imbolc.ucc.ie>
In-reply-to: <970830134954_1120546419@emout06.mail.aol.com> from
To: Albertfine@aol.com
Cc: www-html@w3.org
Message-id: <199708302138.WAA24975@imbolc.ucc.ie>
Subject: Re: HTML Streaming

> Streaming is a "technique for transferring data." "processed as a steady and 
> continuous stream" is only a description.  "With streaming, the client 
> browser or plug-in can start displaying the data before the entire file has 
> been transmitted." This is not a result. This is an example.

Read again and try to understand what it says. Streaming is NOT the
display of data before the entire file has been transmitted: it says
**with** streaming, the client...etc. Streaming is an activity
external to the display of data: it is a manipulation of the
transmission to achieve a steady flow of data.

> Their is no 
> mention of "IP transmission", "refreshing of the data", etc.

This is why you need to study and learn these things instead of
relying on what you find on the Web. 

> I think you're a little confused about streaming. As I have said
> before, it is not the fault of the browser, editor or the coder but
> the file format. 

I'm afraid it's you who is confused: streaming has nothing to do with
the file format, unless you are using the word in a new way, in which
case you need to provide a formal definition.

> For example, a wav file is not designed to be heard
> "before the entire file has been transmitted" so you don't get a
> "steady and continuous stream." 

I think you'll find that a WAV file is in RIFF format, in which the
rLen field specifies how long the rData chunk is, and within that
chunk the dLen field specifies how long the dData waveform data is.
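
To make the point about rLen and dLen concrete, here is a minimal sketch
in Python. The function names build_wav and read_lengths are invented for
illustration, and the 8-bit mono 8000 Hz format is an assumption chosen
only to keep the header small. It shows that a reader can learn both
lengths from the first few dozen bytes, before any waveform data arrives.

```python
import io
import struct

def build_wav(samples: bytes) -> bytes:
    """Build a minimal 8-bit mono PCM WAV file (hypothetical helper)."""
    # fmt chunk: length 16, PCM (1), 1 channel, 8000 Hz, 8000 bytes/s,
    # block align 1, 8 bits per sample.
    fmt_chunk = b"fmt " + struct.pack("<IHHIIHH", 16, 1, 1, 8000, 8000, 1, 8)
    data_chunk = b"data" + struct.pack("<I", len(samples)) + samples
    pad = b"\x00" if len(samples) % 2 else b""  # RIFF chunks are word-aligned
    body = b"WAVE" + fmt_chunk + data_chunk + pad
    return b"RIFF" + struct.pack("<I", len(body)) + body

def read_lengths(stream):
    """Return (rLen, dLen) using sequential reads only, as on a network stream."""
    riff_id, r_len = struct.unpack("<4sI", stream.read(8))
    if riff_id != b"RIFF" or stream.read(4) != b"WAVE":
        raise ValueError("not a RIFF/WAVE stream")
    while True:
        header = stream.read(8)
        if len(header) < 8:
            raise ValueError("no data chunk found")
        chunk_id, chunk_len = struct.unpack("<4sI", header)
        if chunk_id == b"data":
            return r_len, chunk_len
        # Skip any other chunk, including its pad byte if the length is odd.
        stream.read(chunk_len + (chunk_len & 1))

wav = build_wav(bytes(10))
# Only the first 44 bytes are needed; no sample data has been read yet.
print(read_lengths(io.BytesIO(wav[:44])))
```

With ten bytes of sample data this prints (46, 10): the extent of the
whole file and of the waveform are both known before playback starts.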

This is equivalent to an HTML file carrying its own extent in the
header, in the same way that the TEI DTD does. The difference is that
a WAV file contains utterly homogeneous data (all waveform, once you
get there), whereas HTML data is heterogeneous. 

The two simply are not comparable, as the audio player is not required
to perform any interpretation of the data, merely to reproduce it as
sound through the speaker, whereas an HTML browser has to do a lot of 
work making sense of the data.

> Viewing a page in a "steady and
> continuous stream" is a consideration and attributes are added to
> enable streaming such as with the table or img tags. 

But think about it: how can you view or display the data in a steady
stream if it is not arriving in a steady stream? The method of making
it arrive at a uniform pace is called "streaming" and it has nothing
to do with the _content_ of the datastream.
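
As a rough illustration of pacing that is independent of content, the
sketch below simulates a receive buffer in Python. The function name pace,
the tick-based model, and the numbers are all assumptions made for the
example, not a description of any real streaming protocol. Data arrives in
bursts, and the buffer hands it on at a fixed rate once a small reserve
has built up. Note that the code only ever counts bytes; it never looks
at what they contain.

```python
def pace(arrivals, rate, prebuffer):
    """Smooth bursty arrivals into a steady output (hypothetical model).

    arrivals:  bytes arriving at each clock tick (may be bursty)
    rate:      bytes released per tick once output has started
    prebuffer: bytes that must accumulate before output starts
    Returns the number of bytes released at each tick.
    """
    buffered = 0
    started = False
    released = []
    for arrived in arrivals:
        buffered += arrived
        if not started and buffered >= prebuffer:
            started = True
        out = min(rate, buffered) if started else 0
        buffered -= out
        released.append(out)
    # After the last arrival, drain whatever is still buffered.
    while buffered > 0:
        out = min(rate, buffered)
        buffered -= out
        released.append(out)
    return released

bursty = [30, 0, 0, 30, 0, 0, 30, 0, 0]
print(pace(bursty, rate=10, prebuffer=20))
```

Here the bursts of 30 bytes every third tick come out as 10 bytes on
every tick, which is the uniform pace the paragraph above describes.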

> They are not added to all the tags and their is no overall
> description of the page. I think their should be a general
> description for size and other attributes that should then be added
> to the events tag in the head. It would be sent first, give an
> overall description of the page and attributes wouldn't need to be
> added to a new tags or existing tags.

That would certainly be another possibility. But you still haven't
done anything about the _rate_ or steadiness at which the data arrives
at the browser end.

> >I can do this today with a simple macro in any decent SGML editor 
> >(and by adding an attribute to P such as EXTENT NUMBER #IMPLIED).
> 
> The extent tag is not an attribute. 

Read again. I proposed an EXTENT _attribute_, not an element.

> A browser wouldn't recognize it.

Nor would it recognize your <EVENT> element.

> For this to work, you would need to rewrite HTML. This is what I
> am trying to avoid.  

No you're not. By inventing the <EVENT> element you are rewriting HTML.

> Maybe you are thinking of the width attribute for the pre tag?

I think not. You seem to be confused by rather a lot of this. Perhaps
it would be a good idea if you read a book on HTML and SGML. Mine was
published in 1995 [1], so it doesn't include the latest WebTV or
Netscape elements, but everything it says about HTML design remains
valid.

> >This worries me. You don't seem to realize that "a letter" and "a
> >space" are not finite quantities when dealing with regular type: you
> >have to know _which_ letter, because they're all different widths. It
> >is therefore not possible to perform such a calculation as you
> >envisage, based solely on the number of glyphs expected. 
> 
> Actually, their are all the same lengths or paragraphs wouldn't line up :)

Only in Lynx, the W3C linemode browser, and w3-mode. All the other
browsers I know use a variable-width font as their default, in which
each character has its own width (ie an "i" is narrower than a "w").
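
A small Python sketch of why a character count alone cannot give a
displayed width. The width table below is entirely made up (arbitrary
units, not taken from any real font), and rendered_width is a hypothetical
helper; only the relative sizes matter.

```python
# Hypothetical advance widths in arbitrary units; not from any real font.
WIDTHS = {"i": 3, "l": 3, "w": 9, "m": 10, " ": 4}
DEFAULT_WIDTH = 6  # assumed width for any character not listed above

def rendered_width(text):
    """Sum the per-character widths, as a proportional font would."""
    return sum(WIDTHS.get(ch, DEFAULT_WIDTH) for ch in text)

narrow, wide = "illii", "wmwmw"
print(len(narrow), len(wide))                        # same character count
print(rendered_width(narrow), rendered_width(wide))  # different widths
```

Both strings contain five characters, yet under these assumed widths
they come out at 15 and 47 units. Only in a fixed-width font would the
count determine the width.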

> Your paragraph would look like this;
> 
> I wandered lonely as a cloud.
> 
> The number of letters and spaces is actually 29. 

Please don't change my examples. The number of characters in my
example was 32, and I explained why this was so, because I
deliberately indented the second record so that it included extra
spaces, in order to test your proposition. 

> An HTML editor would not do this. 

Why not?

> If it were human written code, it would be assembled by a program 
> before it could be described as a streaming HTML file. I miss you point.

Yes, I think you do. Have you seen what the current clutch of real
SGML editors can do?  You might want to read up on some of the
background on what SGML offers before you go any further: there's a
whole lot in there that current browsers ignore, and that XML browsers
are probably going to take advantage of.

> >Would it be fair to compare this to the way in which an old terminal
> >used to stream the text across the screen because it displayed each
> >character immediately it was received? 
> 
> Which old terminal?

A VT100, Wyse, Hazeltine, whatever. There were hundreds of them which
worked that way (NOT IBM mainframe terminals: they were pagemode
devices).

At this stage I think you might need to go and study something more
about computing and networking principles, and data design, before you
go any further.  Right now I'm not certain you are well enough equipped
for this argument or to defend your position.

> The streaming protocols are meant to describe all displayed tags not just 
> text.

And please stop calling elements tags. There is a big difference, and
until you have a firmer grasp of HTML and SGML you may want to shelve
this proposal. It's very interesting, and I think the underlying idea
has a lot of merit, but without a firm grounding in the relevant
background you may run into trouble, especially from the gureaux, who
are mostly CS postgrads, and will make mincemeat of the proposal as
it stands.

> >There certainly would be. Have you actually looked at a non-trivial
> >piece of SGML markup?
> 
> Yes. The degree of error is hard to predict without knowing what the browser 
> is doing. I plan to have a copy of Cougar modified.

Go and have a look at a more complex DTD: Cougar is not intended for
production use. And study some serious markup: have a look at
http://www.ucc.ie/celt/online/G1-0005.a550-a1616.sgml

> "Steady and continuous" does not mean "even." 

Yes it does. That is precisely what it means.

> To have a "Steady and 
> continuous" stream their is a lot of compression, a lot buffering and the 
> way the data is sampled and sent is far from "even."

Those are some of the ways evenness is achieved.

> >editors to keep its value updated with the byte count of the current
> >content: but I still feel that this is not a useful value.
> 
> Why would the editor want "to keep its value updated with the byte count of
> the current content?" What does that mean anyway?

It means precisely what it says. It's what you proposed. You suggested
an editor should be able to insert the size of the following paragraph
in an EVENT element (I suggested an EXTENT attribute, but the effect
is the same).
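
For what it is worth, here is a minimal sketch of such an editor macro in
Python. Everything in it is an assumption for illustration: the function
name update_extents, the requirement that every P has an explicit end-tag
and no other attributes, and the choice of UTF-8 for counting bytes. No
browser recognises an EXTENT attribute.

```python
import re

# Matches <P> or <P EXTENT="n"> up to the nearest </P>. Assumes explicit
# end-tags and no other attributes on P (a simplification for this sketch).
P_ELEMENT = re.compile(r'<P(?:\s+EXTENT="\d+")?>(.*?)</P>',
                       re.DOTALL | re.IGNORECASE)

def update_extents(markup, encoding="utf-8"):
    """Rewrite each P so that EXTENT holds the byte count of its content."""
    def with_extent(match):
        content = match.group(1)
        size = len(content.encode(encoding))
        return f'<P EXTENT="{size}">{content}</P>'
    return P_ELEMENT.sub(with_extent, markup)

print(update_extents("<P>I wandered lonely as a cloud.</P>"))
```

Run on the single-line paragraph this inserts EXTENT="29". If the same
text is broken after "lonely" and the second record is indented by three
spaces (one hypothetical layout), the count becomes 32, so the value
depends on how the source is laid out and not on what is displayed.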

> I thought coming to this list to discuss HTML streaming, while in 
> development, would be beneficial.

Yes it is: it's made me think quite hard about it, and I've concluded
that browsers could do almost synchronous display right now with only
trivial modifications to HTML. Thanks for raising the subject.

///Peter