Re: about SGML parser
To: Rainer Klute <firstname.lastname@example.org>
Subject: Re: about SGML parser
From: email@example.com (Don Park)
Date: Tue, 23 Jan 1996 11:38:07 -0800 (PST)
From firstname.lastname@example.org Tue Jan 23 14:39:30 1996
X-Mailer: Windows Eudora Version 1.4.4
How about putting SP in a thread and having it go to sleep whenever it
starves for input? I am working on an HTML scanner that uses this technique
to support both push and pull usage models. Looking at the SP source code, I
see that some modifications will still be necessary.
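To make the thread idea concrete, here is a minimal sketch in Python rather
than C++. It is not SP's API: ThreadedPullParser, toy_parser, read and emit
are all made-up names. The pull-style parser runs on its own thread and
simply blocks (sleeps) inside read() whenever no input has been pushed yet.

```python
import queue
import threading


class ThreadedPullParser:
    """Runs a pull-style parse function on its own thread; callers push data in.

    Hypothetical illustration only; SP's real interfaces look different.
    """

    _EOF = object()  # sentinel marking the end of the input

    def __init__(self, parse_fn, on_token):
        self._chunks = queue.Queue()
        # parse_fn(read, emit) pulls input via read() and reports tokens via emit().
        self._worker = threading.Thread(
            target=parse_fn, args=(self._read, on_token), daemon=True
        )
        self._worker.start()

    def _read(self):
        """Called on the parser thread; sleeps until a chunk arrives.

        Returns "" once the input has been closed.
        """
        chunk = self._chunks.get()  # blocks while the parser is starved
        return "" if chunk is self._EOF else chunk

    def feed(self, data):
        """Push side: hand a chunk of input to the sleeping parser."""
        if data:  # empty strings are reserved for end of input
            self._chunks.put(data)

    def close(self):
        """Signal end of input and wait for the parser thread to finish."""
        self._chunks.put(self._EOF)
        self._worker.join()


def toy_parser(read, emit):
    """Stand-in for a pull parser: emits space-separated words."""
    pending = ""
    while True:
        chunk = read()
        if chunk == "":
            break
        pending += chunk
        # Everything before the last space is complete; keep the tail for later.
        *words, pending = pending.split(" ")
        for word in words:
            if word:
                emit(word)
    if pending:
        emit(pending)


tokens = []
parser = ThreadedPullParser(toy_parser, tokens.append)
parser.feed("hello wo")      # the word "world" is split across two pushes
parser.feed("rld again")
parser.close()
```

The parser code stays written in pull style, and the network side never has
to dump the document to a file first; the price is one thread per parse and
a queue between the two sides.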
BTW, my scanner is not finished and it is NOT a parser, so please don't pelt
me with handitoverson comments. It is just a tokenizer that returns a
stream of high-level tokens, which are objects representing different types
of HTMLElements. It does not even know that a start tag and an end tag are
related. Also, it is based on MFC, so it will not fit in with libwww.
However, it will support new tags at runtime, so you can decide to be
pure HTML 3, Netscape friendly, or whatever. I will describe its design in
the near future so W3C can take advantage of it in its own HTML parser design.
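As a rough illustration of what such a tokenizer could look like, here is a
small Python sketch. It is not the MFC-based design mentioned above, and
Token, HtmlScanner, register_tag and scan are names invented for this
example. It emits a flat stream of tokens, never pairs start tags with end
tags, and lets the set of recognized tags grow at runtime.

```python
import re
from collections import namedtuple

# kind is "start", "end", "text", or "unknown"; value is the tag name or the text.
Token = namedtuple("Token", "kind value")


class HtmlScanner:
    """Toy tokenizer: no nesting, no validation, just a stream of tokens."""

    # Deliberately simplistic tag pattern; attributes are skipped, not parsed.
    _TAG = re.compile(r"<(/?)([A-Za-z][A-Za-z0-9]*)[^>]*>")

    def __init__(self, tags=()):
        self._known = {t.lower() for t in tags}

    def register_tag(self, name):
        """Teach the scanner a new tag at runtime."""
        self._known.add(name.lower())

    def scan(self, text):
        """Yield tokens in document order (pull-style usage via iteration)."""
        pos = 0
        for match in self._TAG.finditer(text):
            if match.start() > pos:
                yield Token("text", text[pos:match.start()])
            name = match.group(2).lower()
            if name not in self._known:
                kind = "unknown"
            elif match.group(1):
                kind = "end"
            else:
                kind = "start"
            yield Token(kind, name)
            pos = match.end()
        if pos < len(text):
            yield Token("text", text[pos:])


scanner = HtmlScanner(["p", "b"])
before = [tuple(t) for t in scanner.scan("<p>Hi <blink>x</blink></p>")]
scanner.register_tag("blink")  # e.g. switching to a Netscape-friendly tag set
after = [tuple(t) for t in scanner.scan("<blink>x</blink>")]
```

Because the scanner never matches start tags against end tags, anything
structural (nesting, implied end tags, validation) is left to whatever
consumes the token stream.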
A comment regarding the use of Arena's HTML parser in libwww:
After looking at Arena's parser, I think using SP in libwww is the better
and safer route, although SP is quite bottom-heavy and requires C++.
I guess the question now is whether to have libwww take the full leap into
C++ by using templates, namespaces, dynamic casting, etc., or stay by the
seaside :). <-- (that is a period, not a mole)
>>I know that there are several free SGML parsers on
>>the Internet, such as SP, YSP, etc. Why don't we
>>integrate one of them into W3C Lib so that W3C
>>Lib will support SGML and we don't have to worry about
>>the parser whenever a new version of HTML appears?
>We are doing just that. However, due to the nature of SP's input
>model (it is a requesting stream, not a driven one) we have to rape
>the stream concept a bit in our first approach. The HTML input is
>completely dumped into a file. :-( SP will read from this file,
>parse it and push the tokens further down the stream stack.
> Dipl.-Inform. Rainer Klute NADS - Advertising on nets
> NADS GmbH
> Emil-Figge-Str. 80 Tel.: +49 231 9742570
>D-44227 Dortmund Fax: +49 231 9742573