Re: When is something a tag? from David Håsäther on 2005-06-20 (www-html@w3.org from June 2005)

From: David Håsäther <hasather@gmail.com>
Date: Mon, 20 Jun 2005 12:59:16 +0200
To: Robert <rvl@xs4all.nl>
CC: www-html@w3.org
Message-ID: <42B6A184.4050801@gmail.com>

On 2005-06-19 22:37, Robert wrote:

> I like to know when something is a tag.
> For example <p> is a tag, but < p> is not (usually displayed as text).

Right. For a start-tag to be recognized by an SGML parser, it must
begin with the delimiter called STAGO (start-tag open). In the
reference concrete syntax (RCS), which HTML uses, this delimiter is
defined as "<". STAGO must be immediately followed by a name start
character, which in the RCS is any letter, lowercase or uppercase.
In your second example, there is a space between "<" and "p", so the
parser treats the "<" as plain data rather than as the start of a tag.
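
As a rough illustration of that rule, here is a minimal Python sketch.
The names opens_start_tag and NAME_START are made up for this example,
and it models only the "<" followed by a letter check; a real SGML
parser also takes recognition modes and other features into account.

```python
import string

# Name start characters in the reference concrete syntax: the letters
# a-z and A-Z. (Hypothetical constant name, for illustration only.)
NAME_START = set(string.ascii_letters)


def opens_start_tag(text, i=0):
    """Return True if text[i] is "<" immediately followed by a letter.

    Simplified check: it looks only at the two characters starting at
    position i and ignores everything else an SGML parser would consider.
    """
    # Slicing avoids an IndexError when i or i + 1 is past the end.
    return text[i:i + 1] == "<" and text[i + 1:i + 2] in NAME_START


print(opens_start_tag("<p>"))   # "<" directly followed by "p"
print(opens_start_tag("< p>"))  # a space follows "<", so this is data
```

With this check, "<p>" counts as the start of a tag while "< p>" does
not, which matches the two examples above.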

> I would like to know if the definition of when it is a tag is defined by 
> a version of (X)HTML, or if this is defined by SGML (for HTML) or XML 
> (for XHTML).

It is defined by SGML (for HTML) and by XML (for XHTML), not by any
particular version of (X)HTML.

> And do browsers implement its own ideas of when something is a tag?

Yes, they do (for text/html), because they use tag soup parsers instead
of real SGML parsers. I can't think of any browser that has a problem
with a space between "<" and the first character, though.

> Could you point me in the right direction for this specification?

If you're really interested, look into The SGML Handbook
<http://www.amazon.com/exec/obidos/ASIN/0198537379/charlesfgoldfars/002-6439808-6619234>

A cheaper way would be to look at the production for a start-tag:
http://www.w3.org/MarkUp/SGML/productions.html#prod14

and an end-tag:
http://www.w3.org/MarkUp/SGML/productions.html#prod19

-- 
David Håsäther

Received on Monday, 20 June 2005 10:59:41 UTC