sgml-lex: White space in tags?

John Gilmore (gnu@toad.com)
Thu, 19 Sep 1996 14:06:21 -0700


Message-Id: <199609192106.OAB22538@toad.com>
To: www-html@w3.org
Subject: sgml-lex: White space in tags?
Date: Thu, 19 Sep 1996 14:06:21 -0700
From: John Gilmore <gnu@toad.com>

A question came up at my site about whether white space is acceptable
inside tags, and I could not determine from the material on the W3.org
web site whether it is valid.

It's extremely unfortunate that HTML is based on a proprietary spec
that we can't distribute online.  I hope W3C is trying to remedy this
situation.  How much money would it take to pry loose the SGML spec
from ISO for public distribution without restriction?  I can attempt
to provide or raise this money, if they have a price.  If they refuse
to permit public use at any price, I think the HTML community should
duplicate the work (to the extent that we need it) and separate from
the SGML community.

I tried reading the HTML lexical analyzer to answer the question, but
it uses features of flex that I've never seen before and don't
understand.

Here's the specific issue:

    When doing HTML anchors (links), the closing ">" on the <A HREF...> 
    element needs to be in contact with the rest of it:

    <A HREF="/pub/join/index.html">Join EFF today</A>!

    not:

    <A HREF="/pub/join/index.html"
    >Join EFF today</A>!

    Netscape is smart enough to parse the 2nd example, but many other 
    browsers aren't.

I think this advice is incorrect; I hope the spec allows arbitrary
white space inside the < ... > delimiters.  Sadly, though, I can't
find a spec that settles it.
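
To make concrete the behavior I'm hoping for, here is a minimal Python
sketch of a start-tag matcher that tolerates white space (including a
line break) between the last attribute and the closing ">".  The
pattern and the name parse_start_tag are my own illustration, not taken
from sgml-lex or from any spec, and the sketch assumes every attribute
value is double-quoted:

```python
import re

# Hypothetical pattern (an assumption, not from sgml-lex): a tag name,
# zero or more name="value" attributes, then optional white space
# before the closing ">".
START_TAG = re.compile(
    r'<([A-Za-z][A-Za-z0-9.-]*)'                        # tag name
    r'((?:\s+[A-Za-z][A-Za-z0-9.-]*\s*=\s*"[^"]*")*)'   # quoted attributes
    r'\s*>'                                             # white space OK here
)
ATTR = re.compile(r'([A-Za-z][A-Za-z0-9.-]*)\s*=\s*"([^"]*)"')


def parse_start_tag(text):
    """Return (name, attrs) for a start tag at the start of text, else None."""
    m = START_TAG.match(text)
    if m is None:
        return None
    # Fold names to upper case so "a href" and "A HREF" compare equal.
    attrs = {name.upper(): value for name, value in ATTR.findall(m.group(2))}
    return m.group(1).upper(), attrs


one_line = '<A HREF="/pub/join/index.html">Join EFF today</A>!'
two_lines = '<A HREF="/pub/join/index.html"\n>Join EFF today</A>!'

# Under this sketch both forms give the same tag name and attributes.
print(parse_start_tag(one_line) == parse_start_tag(two_lines))
```

If the spec does allow white space there, a conforming parser should
treat the two forms from the example above the same way this sketch
does.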

Besides answering the question, can someone on this list put the
answer where other people can find it?  It would be nice if a
human-readable and definitive lexical standard for HTML were
available, and w3.org seems like a good place to put it.

	John Gilmore
	Electronic Frontier Foundation