- From: Russell O'Connor <roconnor@math.berkeley.edu>
- Date: Tue, 27 Feb 2001 12:32:34 -0800 (PST)
- To: Irawan Tanudirdjo <irwtan@yahoo.com>
- cc: W3C HTML <www-html@w3.org>
On Tue, 27 Feb 2001, Irawan Tanudirdjo wrote:
> Shallom,
>
> I'm an undergraduate student of Computer Science
> from Surabaya, Indonesia.
>
> Right now, I have a compiler class project to
> create an HTML interpreter and viewer. So, I would
> like to ask about HTML tokens, lexemes, regular
> expressions and grammar.
>
> Could anyone point me to documentation on the web
> that contains the above specification?
There is no such documentation on the web that I know of. I suggest
getting a copy of "The SGML Handbook" by Charles F. Goldfarb. It should
contain everything you need to know to write an SGML parser, and hence
an HTML 4.0 parser.
I should point out that there is a good chance that HTML 4.0 is not
defined by a context-free grammar.
For XHTML, you can read "The Annotated XML Specification" at
<http://www.xml.com/pub/a/axml/axmlintro.html>.
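As a rough illustration of the token and lexeme level only, here is a
minimal sketch of a regular-expression tokenizer in Python. The token
names (COMMENT, START_TAG, END_TAG, TEXT, STRAY_LT) and the patterns are
illustrative assumptions, not taken from the SGML or HTML 4.0
specifications. The sketch ignores entities, doctype declarations,
attribute parsing, and most of SGML's other lexical rules.

```python
import re

# Hypothetical token names and deliberately simplified patterns.
# This is NOT the lexical grammar of SGML or HTML 4.0.
TOKEN_RE = re.compile(
    r"""
    (?P<COMMENT><!--.*?-->)
  | (?P<END_TAG></[A-Za-z][A-Za-z0-9]*\s*>)
  | (?P<START_TAG><[A-Za-z][A-Za-z0-9]*(?:\s[^<>]*)?>)
  | (?P<TEXT>[^<]+)
  | (?P<STRAY_LT><)
    """,
    re.VERBOSE | re.DOTALL,
)


def tokenize(source):
    """Yield (kind, lexeme) pairs for a simplified HTML-like input."""
    for match in TOKEN_RE.finditer(source):
        # lastgroup is the name of the alternative that matched.
        yield match.lastgroup, match.group()


if __name__ == "__main__":
    # The <p> element is never closed; the tokenizer does not care.
    for kind, lexeme in tokenize("<p>Hello <b>world</b>"):
        print(kind, repr(lexeme))
```

Note that the sample input never closes its <p> element. HTML 4 permits
omitting some end tags, such as </p>, so a tokenizer like this one only
splits the input into lexemes. Deciding where elements actually end is
left to the parser, which has to consult the DTD.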
--
Russell O'Connor roconnor@alumni.uwaterloo.ca
<http://www.math.berkeley.edu/~roconnor/>
``Paradoxically, a refusal to `put a monetary value on life' means that
life is often undervalued.'' -- Artificial Intelligence: A Modern Approach
Received on Tuesday, 27 February 2001 15:32:40 UTC