Re: Overview of html2thot.c? from Irene.Vatton@inrialpes.fr on 2000-01-27 (www-amaya-dev@w3.org from January 2000)

From: <Irene.Vatton@inrialpes.fr>
Date: Thu, 27 Jan 2000 16:38:43 +0100
To: Chris Cutler <chip@sccs.swarthmore.edu>
cc: www-amaya-dev@w3.org
Message-Id: <200001271538.QAA16219@tahiti.inrialpes.fr>

In-reply-to: Your message of Wed, 26 Jan 2000 22:41:57 -0500.
             <Pine.LNX.4.21.0001262234330.23738-100000@merlin.sccs.swarthmore.edu>
> Hi,
> I'm attempting to modify Amaya so that it will read files written in a
> markup language of my own devising.  It looks like (among other things) I
> need to write the equivalent of html2thot.c for my language.  I'd like to
> use html2thot.c as a guide while I do it, but I'm feeling a bit
> overwhelmed by the size of the file.  Could someone give me an overview of
> the structure of this file?  What are the important functions?  What
> precisely does this file do and how does it get called?  In the comments
> I've noticed references to a stack and to an automaton.  What are these
> and how do they work?  
> 
> Any help would be appreciated.  Thanks very much.
> 
> -Chris Cutler
> 

In html2thot.c the main procedure is called StartParser.
That function creates the initial document, opens the file and launches the
parser (HTMLparse), which scans the input one character at a time.
At the first level, the parser uses the automaton to detect entities and
start and end tags.
At the second level, it uses the tables of HTML elements and HTML attributes
to know which Thot element or attribute type must be generated.
I guess that is where you will have to build your own tables.
Many other functions are there to correct HTML errors or to add specific
processing (see EndOfStartTag, EndOfEndTag).
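
To make the two levels concrete, here is a minimal sketch in C. It is not
taken from html2thot.c: the names ParseSketch, LookupElementType, tagTable,
ParseStats and the EL_* constants are hypothetical, and the real automaton
and tables in Amaya are much larger and shaped differently. The sketch only
shows the general idea: a character-driven automaton recognises start tags,
end tags and entities, and a table maps each tag name to an element type.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical element types of the target document model. */
enum { EL_UNKNOWN = 0, EL_PARAGRAPH, EL_EMPHASIS };

/* Second level (assumed shape): a table mapping tag names to element types.
   For a new markup language, this is the part you would replace. */
typedef struct { const char *tag; int elType; } TagMapping;

static const TagMapping tagTable[] = {
    { "p",  EL_PARAGRAPH },
    { "em", EL_EMPHASIS  },
};

static int LookupElementType(const char *name)
{
    size_t i;
    for (i = 0; i < sizeof tagTable / sizeof tagTable[0]; i++)
        if (strcmp(tagTable[i].tag, name) == 0)
            return tagTable[i].elType;
    return EL_UNKNOWN;
}

/* First level: states of a very small automaton. */
typedef enum { ST_TEXT, ST_LT, ST_START_NAME, ST_END_NAME, ST_ENTITY } State;

typedef struct { int startTags, endTags, entities, unknown; } ParseStats;

/* Scan the input one character at a time and count what was recognised.
   A real parser would create elements instead of counting, and would
   also handle attributes, comments and malformed input. */
static ParseStats ParseSketch(const char *input)
{
    ParseStats stats = { 0, 0, 0, 0 };
    State state = ST_TEXT;
    char name[32];
    size_t len = 0;
    const char *p;

    assert(input != NULL);
    for (p = input; *p != '\0'; p++) {
        char c = *p;
        switch (state) {
        case ST_TEXT:
            if (c == '<') state = ST_LT;
            else if (c == '&') state = ST_ENTITY;
            break;
        case ST_LT:
            len = 0;
            if (c == '/') state = ST_END_NAME;
            else { name[len++] = c; state = ST_START_NAME; }
            break;
        case ST_START_NAME:
        case ST_END_NAME:
            if (c == '>') {
                name[len] = '\0';
                if (state == ST_START_NAME) {
                    stats.startTags++;
                    /* Second level: consult the table for this tag. */
                    if (LookupElementType(name) == EL_UNKNOWN)
                        stats.unknown++;
                } else {
                    stats.endTags++;
                }
                state = ST_TEXT;
            } else if (len < sizeof name - 1) {
                name[len++] = c;
            }
            break;
        case ST_ENTITY:
            if (c == ';') { stats.entities++; state = ST_TEXT; }
            break;
        }
    }
    return stats;
}
```

Calling ParseSketch on a string such as "<p>a &amp; <em>b</em></p>" returns
the counts of start tags, end tags and entities it recognised, plus the
number of start tags that had no entry in the table. Keeping the table apart
from the automaton is what lets you swap in the names of another language
without touching the scanning code.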

Hope that helps
  Irene.

Received on Thursday, 27 January 2000 10:40:29 UTC