Re: HTML parser question wrt support for APPLET etc. from Bob Racko on 1998-09-22 (www-lib@w3.org from July to September 1998)

From: Bob Racko <bobr@dprc.net>
Date: Tue, 22 Sep 1998 11:20:13 -0400
To: Henrik Frystyk Nielsen <frystyk@w3.org>, www-lib@w3.org
Cc: John Punin <puninj@cs.rpi.edu>
Message-Id: <3.0.3.32.19980922112013.03397850@shell14.ba.best.com>

Go ahead and make the mods, John.

I would be happy to help you if there is anything not understood
or if you want me to test what you have coded before you send it
back to Henrik and/or commit your changes.

The scanner as it exists is fast.  As the core to a verifier you
want it to remain so - particularly for the most common and
standard tags.  I have been working toward a mirroring utility
(a robot which copies whole pages, frames and all) and 
introduced the tags you see because they have a 'src=' attribute.
The one I was going to do next was SCRIPT via HText_appendObject
although all the ones you mentioned need to eventually be handled.

The distinction between appendObject and appendImage
is for those items that need further parsing (like html) and those
that must be considered as-is (pictures, binary, compiled-java, etc).
In the robot, -saveimg applies to the latter.
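
A minimal sketch of that split, assuming a small lookup table keyed on
tag name. The names RefKind, ref_rules and classify_ref are hypothetical
and are not part of the library; only the FRAME, SCRIPT and IMG choices
come from this thread.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical classification; not the library's own API. */
typedef enum {
    REF_UNKNOWN,  /* tag is not in this sketch's table */
    REF_PARSE,    /* referenced resource needs further parsing (like html) */
    REF_AS_IS     /* referenced resource is taken as-is (pictures, binary) */
} RefKind;

typedef struct {
    const char *tag;
    RefKind kind;
} RefRule;

/* FRAME and SCRIPT go the appendObject way, IMG the appendImage way. */
static const RefRule ref_rules[] = {
    { "FRAME",  REF_PARSE },
    { "SCRIPT", REF_PARSE },
    { "IMG",    REF_AS_IS },
};

/* Case-insensitive ASCII comparison of two tag names. */
static int same_tag(const char *a, const char *b)
{
    while (*a && *b) {
        if (toupper((unsigned char)*a) != toupper((unsigned char)*b))
            return 0;
        a++;
        b++;
    }
    return *a == '\0' && *b == '\0';
}

/* Return how the resource referenced by `tag` should be treated. */
RefKind classify_ref(const char *tag)
{
    size_t i;
    assert(tag != NULL);
    for (i = 0; i < sizeof ref_rules / sizeof ref_rules[0]; i++) {
        if (same_tag(tag, ref_rules[i].tag))
            return ref_rules[i].kind;
    }
    return REF_UNKNOWN;
}
```

In this sketch a robot option like -saveimg would act only on tags that
classify as REF_AS_IS.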

Here is the direction I am taking:
After doing one or two tags the way you see (changing the html scanner and pdtd),
I saw that better handling of _all_ unknown tags via a callout-registration
would put less stress on the HText interface.  Currently, if you add a
new function you must do it in all potential clients and sample apps
(robot, browser, html-to-C converter, etc) even if it is a place-holder.

Registering a callout for unknowns makes it possible to add tags
without modifying any of the existing sample apps.
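
A rough sketch of what such a callout registration could look like.
Every name here (UnknownTagCallout, register_unknown_callout, scan_tag,
count_unknown) is hypothetical and only illustrates the idea; it is not
the library's interface, and the known-tag list is a stand-in for the
real scanner tables.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical callout type: gets the unknown tag name plus client data. */
typedef void UnknownTagCallout(const char *tag, void *context);

static UnknownTagCallout *unknown_callout = NULL;
static void *unknown_context = NULL;

/* A client registers one handler for all tags the scanner does not know. */
void register_unknown_callout(UnknownTagCallout *fn, void *context)
{
    unknown_callout = fn;
    unknown_context = context;
}

/* Stand-in for the scanner's built-in tag table (exact-case match only). */
static const char *const known_tags[] = { "IMG", "FRAME" };

static int is_known(const char *tag)
{
    size_t i;
    for (i = 0; i < sizeof known_tags / sizeof known_tags[0]; i++) {
        if (strcmp(tag, known_tags[i]) == 0)
            return 1;
    }
    return 0;
}

/* Returns 1 if the tag took the built-in path, 0 if it was handed to the
   registered callout, -1 if it was unknown and no callout was registered. */
int scan_tag(const char *tag)
{
    assert(tag != NULL);
    if (is_known(tag))
        return 1;
    if (unknown_callout != NULL) {
        unknown_callout(tag, unknown_context);
        return 0;
    }
    return -1;
}

/* Example client callout: counts unknown tags in an int passed as context. */
void count_unknown(const char *tag, void *context)
{
    (void)tag;
    ++*(int *)context;
}
```

A client that does not care about unknown tags simply never registers a
callout, so no place-holder function is needed in it.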

I had previously considered making the scanner tables more dynamic
to allow the addition of tags and attribute-parsing functions
once the library was initialized.  The return-on-investment
was not good enough though so I backed out of that set of changes.

Apologies for not answering sooner. My excuse is a good one:
I was on a mesa in Arizona.
The Hopi hosted a multi-tribe harvest celebration this weekend.
Most Indian dances are rarely open to the public so this
was a special treat.  I am back in Boston now.

>>>John Punin wrote:
>>>I would like to ask you a question. I read the HTML.c file of the library
>>>and I see the FRAME tag produces a call to the function HText_appendObject.
>>>The IMG tag produces a call to the function HText_appendImage.
>>>I would like that the HTML parser handle tags like: APPLET, OBJECT, EMBED,
>>>AREA that reference other URLs. My question is: Do you think that APPLET 
>>>tag should call HText_appendObject or a new function like HText_appendApplet
>>>should be created in HTML.c?

>>Henrik Frystyk Nielsen wrote:
>>I talked to Bob Racko about this as well and we agreed that it was
>>better to use the HText_appendObject for these new tags.
>>
>>Thanks!
>>
>>Henrik

>At 01:03 9/17/98 -0400, John Punin wrote:
>Hi Henrik
>Ok. Is Bob Racko going to make the modifications to the HTML parser?
>I will make the modifications if you want me to.
>
>Please feel free to send my message and your reply message to the list.
>
>Thanks for your help.
>John

{-----}
bobr@dprc.net

Received on Tuesday, 22 September 1998 11:33:01 UTC