Re: Specifications for the HREF in various tags

On Fri, 16 Oct 1998, Kayshav Dattatri wrote:

> I couldn't find a clear specification on the HREF attribute and I'm
> hoping someone will have the right answer.

The HTML 4.0 specification contains a handy index of attributes:
http://www.w3.org/TR/REC-html40/index/attributes.html

The HREF attribute is defined for the A, AREA, LINK, and BASE
elements. Its exact meaning depends somewhat on the element and is
specified in each element's description. The value is in all
cases**) syntactically a URL or, using a more modern and generic
term, URI. HTML specifications have referred to RFC 1738 and RFC 1808
for URL syntax. The statement in the HTML 4.0 specification (at
http://www.w3.org/TR/REC-html40/references.html#ref-URI ) which makes
a normative reference to a draft (!) should probably now be read as
referring to RFC 2396. It should be noted, though, that one cannot
expect full support for URIs as defined in that draft standard, since
it is quite new (dated August 1998).
  **) In AREA, the special keyword value NOHREF is allowed, too.

There are other attributes too which are defined as having a URI
as their value, such as ACTION for FORM, SRC for IMG, and DATA
for OBJECT.

As regards the future development of HTML - which is the topic
of the www-html list - one may ask whether this somewhat confusing
situation should be unified and whether the attribute name HREF
is too cryptic. Originally HREF comes from Hypertext REFerence,
I suppose. What <A HREF=...> really means is that it sets up a link,
so a logical notation would be e.g. <LINK TARGET=...>. Alas, both
LINK and TARGET are already in use, with different (although somewhat
related) meanings. (The use of <A NAME=...> for naming link targets
should be deprecated in favor of using the ID attribute.) As regards
the other URI-valued attributes, their meanings are different
from hypertext links, and the current names reasonably reflect the
meanings. The decision to have different names, SRC and DATA, for
logically similar attributes (for specifying the address of an
image or another object to be embedded) is mildly confusing, though.
On the other hand, current implementations of OBJECT are so horribly
buggy that it would probably be better to rename OBJECT (to start
from a clean slate, so to say).

> Which of the following HREF is the correct one? Is it a requirement that
> all HREFs be enclosed in " " or is it also correct to enclose them in '
> ..'?
> 1. <AREA SHAPE="RECT" COORDS="5, 9, 53, 36" HREF="default.html">
> 2. <IMG SRC='default.gif' >
> 3. <A HREF=default.cfm>

This is a general question on _attribute value syntax_ in HTML.
There's nothing HREF or URI specific about it.

The attribute values "default.html", 'default.gif', and default.cfm
are all correct. However, the last one is correct by accident only,
so to say; if the value contained a solidus (slash, /), as URIs very
often contain, it would be mandatory to use quotes. See
http://www.hut.fi/u/jkorpela/HTML3.2/3.4.html#attrstring
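
To make the rule concrete, here is a minimal sketch in Python. It
assumes the conservative character set for unquoted values (letters,
digits, hyphens, and periods); the function name may_omit_quotes is
my own invention for illustration and comes from no specification.

```python
import re

# Assumption of this sketch: an unquoted attribute value may contain only
# letters, digits, hyphens, and periods. Anything else requires quotes.
_UNQUOTED_OK = re.compile(r"[A-Za-z0-9.-]+")


def may_omit_quotes(value: str) -> bool:
    """Return True if value consists solely of the assumed-safe characters."""
    return _UNQUOTED_OK.fullmatch(value) is not None


if __name__ == "__main__":
    for candidate in ("default.cfm", "docs/default.cfm"):
        print(candidate, may_omit_quotes(candidate))
```

Under that assumption default.cfm passes, while a value containing
a solidus, such as docs/default.cfm, does not.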

This situation is unsatisfactory, since it allows too many possibilities
without any real gain - saving a few characters can't be that important.
As regards the risks, see my Saga of the Slashed Validators,
http://www.hut.fi/u/jkorpela/qattr.html

The clearcut solution would be to require quotation marks around
an attribute value _always_.
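
For software that generates HTML, following that rule is cheap.
A sketch of such a serializer in Python, assuming double quotes as
the delimiter and the standard library's html.escape for characters
that would otherwise end the value early (quoted_attribute is a
hypothetical helper name):

```python
import html


def quoted_attribute(name: str, value: str) -> str:
    """Render name="value", always quoted, escaping &, <, >, and quotes."""
    return '{}="{}"'.format(name, html.escape(value, quote=True))


if __name__ == "__main__":
    print(quoted_attribute("HREF", "docs/default.cfm"))
    print(quoted_attribute("ALT", 'say "hi"'))
```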

> Also, does the specification change depending on whether it is an AREA,
> IMG, or A tag?

The syntax of URIs is the same in all of them, although due to different
semantics some URIs don't make sense in some contexts. For example,
<A HREF="mailto:..."> makes sense but it's hard to imagine any meaningful
interpretation for <AREA HREF="mailto:...">. And naturally, the rules
for using the quotes in denoting attribute values are the same.
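
As an illustration that the three quoting styles yield the same kind
of value regardless of the element, here is a rough sketch using
Python's standard html.parser module. The table URI_ATTRIBUTES lists
only the element/attribute pairs mentioned in this message, so it is
not a complete list, and the names UriCollector and collect_uris are
hypothetical.

```python
from html.parser import HTMLParser

# Only the (element, attribute) pairs named in this message; not exhaustive.
# HTMLParser reports tag and attribute names in lower case.
URI_ATTRIBUTES = {
    ("a", "href"), ("area", "href"), ("link", "href"), ("base", "href"),
    ("form", "action"), ("img", "src"), ("object", "data"),
}


class UriCollector(HTMLParser):
    """Collect (element, attribute, value) for the URI-valued attributes."""

    def __init__(self):
        super().__init__()
        self.found = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            # Valueless attributes such as NOHREF arrive with value None.
            if (tag, name) in URI_ATTRIBUTES and value is not None:
                self.found.append((tag, name, value))


def collect_uris(markup: str):
    collector = UriCollector()
    collector.feed(markup)
    collector.close()
    return collector.found


if __name__ == "__main__":
    sample = (
        '<AREA SHAPE="RECT" COORDS="5, 9, 53, 36" HREF="default.html">'
        "<IMG SRC='default.gif' >"
        "<A HREF=default.cfm>"
    )
    for item in collect_uris(sample):
        print(item)
```

The parser strips the delimiters, so the quoting style leaves no
trace in the collected values.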

It might be useful to specify the meanings of and restrictions on
URIs in various contexts in HTML. For obvious reasons, this cannot
be done at the syntax level (in a DTD), but it could still be more
rigorous than in present specifications. For example, one could
specify that the SRC attribute in an IMG element must refer to
a resource with Internet media type (MIME type, RFC 2045) "image".
BTW, I recently noticed an interesting detail in such restrictions:
the HTML 3.2 specification mentions mailto: URLs as one possibility in the
context of the ACTION attribute for FORM, and so does the HTML 4.0
specification in the description of that attribute, but the HTML 4.0 spec also
says, at http://www.w3.org/TR/REC-html40/interact/forms.html#h-17.13.3.4
that for any other value for ACTION than an HTTP URI,
"the behavior is unspecified".

P.S. Please change the settings of your E-mail client so that
it sends the message as plain text only, not as HTML (except
upon agreement with the recipient(s)).

Yucca, http://www.hut.fi/u/jkorpela/ or http://yucca.hut.fi/yucca.html

Received on Friday, 16 October 1998 02:50:05 UTC