- From: Daniel W. Connolly <connolly@beach.w3.org>
- Date: Fri, 26 Apr 1996 00:33:10 -0400
- To: Eva <spencer@algonet.se>
- Cc: www-html@w3.org
In message <199604200919.LAA17943@hermes.algonet.se>, Eva writes:
>At 10.59 1996-04-20 -0700, you wrote:
>
>>I was just viewing my own homepages with lynx, and it seems that if a
>>ä (the a with two dots over it) is in normal text, it is displayed
>>correctly. Inside the ALT field of an IMG, it is displayed as "&auml;".
>>Does this mean that ALT can't contain Finnish (or any other umlaut etc)
>>characters, or is this a defect in Lynx?
Bug in lynx.
>The specs say:
>"The alt text can contain entities e.g. for accented characters or special
>symbols, but it can't contain markup. The latter is possible, however, with
>the FIG element"
Ummm.. what spec says that? The _expired_ March '95 HTML 3 draft?
Perhaps. Please cite your source.
"can't contain markup" is misleading/wrong -- entity references in
attribute value literals _are_ markup. "can't contain tags" is better,
but still misleading. <img alt="<foo>" src=xxx.gif> is legal, but
<foo> is not treated as a tag.
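
A quick way to see this for yourself is the sketch below. It assumes
Python's standard html.parser module as a stand-in; that module is a
tolerant HTML tokenizer, not an SGML parser, so treat it as an
illustration rather than a conformance test. TagCollector and
start_tags are hypothetical names made up for the example.

```python
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Record every start tag the parser reports, with its attributes."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; references in the
        # values have already been replaced by the parser.
        self.tags.append((tag, attrs))


def start_tags(markup):
    """Return the (tag, attrs) pairs found in a piece of markup."""
    collector = TagCollector()
    collector.feed(markup)
    collector.close()
    return collector.tags


if __name__ == "__main__":
    print(start_tags('<img alt="<foo>" src=xxx.gif>'))
    print(start_tags('<img alt="&auml;" src=xxx.gif>'))
```

Only one start tag, img, should be reported for each input: the <foo>
inside the quotes is attribute data, and the entity reference in the
second input is replaced by the character it names.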
The HTML 2 spec doesn't explicitly say that entity references count
in attribute value literals, but the SGML spec does, and the HTML spec
does discuss the issue:
http://www.w3.org/pub/WWW/MarkUp/html-spec/html-spec_3.html#SEC3.2.4
=======================
A useful technique for computing an attribute value literal for a given string is to
replace each quote and white space character by an entity reference or numeric
character reference as follows:
          ENTITY      NUMERIC
CHARACTER REFERENCE   CHAR REF     CHARACTER DESCRIPTION
--------- ----------  -----------  ---------------------
  HT                  &#9;         Tab
  LF                  &#10;        Line Feed
  CR                  &#13;        Carriage Return
  SP                  &#32;        Space
  "       &quot;      &#34;        Quotation mark
  &       &amp;       &#38;        Ampersand
For example:
<IMG SRC="image.jpg" alt="First &quot;real&quot; example">
=======================
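
The technique quoted above is mechanical enough to sketch in a few
lines of Python. This is only an illustration: the function name
attribute_value_literal and the escape_spaces flag are my own
inventions, not anything from the spec. Plain spaces are left alone by
default because the spec's own example leaves them alone; set the flag
to replace them with &#32; as the table allows.

```python
# Replacements taken from the table in the HTML 2 spec excerpt above.
REPLACEMENTS = {
    "\t": "&#9;",
    "\n": "&#10;",
    "\r": "&#13;",
    " ": "&#32;",
    '"': "&quot;",
    "&": "&amp;",
}


def attribute_value_literal(text, escape_spaces=False):
    """Build a double-quoted attribute value literal for a string.

    Each character is handled exactly once, so the "&" that starts a
    replacement such as "&quot;" is never escaped a second time.
    """
    out = []
    for ch in text:
        if ch == " " and not escape_spaces:
            out.append(ch)  # keep ordinary spaces readable
        else:
            out.append(REPLACEMENTS.get(ch, ch))
    return '"' + "".join(out) + '"'


if __name__ == "__main__":
    alt = attribute_value_literal('First "real" example')
    print('<IMG SRC="image.jpg" alt=' + alt + ">")
```

Run on the string from the spec's example, this should reproduce the
IMG tag shown there.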
>Entities in ALT don't validate.
I'm pretty sure you're mistaken.
See also:
A Lexical Analyzer for HTML and Basic SGML
http://www.w3.org/pub/WWW/MarkUp/SGML/#sgml-lex
in particular:
http://www.w3.org/pub/WWW/MarkUp/SGML/sgml-lex/sgml-lex#API
====================
Section 7.9.3 of SGML says that an attribute value literal is
interpreted as an attribute value by:
Removing the quotes
Replacing character and entity references
Deleting character 10 (ASCII LF)
Replacing character 9 and 13 (ASCII HT and CR) with character 32 (SPACE)
====================
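
Going the other way, the steps above can be sketched like this. It is
a rough sketch under stated assumptions, not the sgml-lex code: the
name attribute_value, the tiny ENTITIES table, and the requirement of
a trailing ";" on every reference are simplifications of mine. I also
assume the LF/HT/CR handling applies to characters written literally
in the source and not to characters produced by a numeric reference,
so that &#9; still yields a real tab.

```python
import re

# Hypothetical, deliberately tiny entity table; HTML defines many more.
ENTITIES = {"quot": '"', "amp": "&", "lt": "<", "gt": ">", "auml": "\u00e4"}

# A numeric character reference (&#34;) or a named entity reference (&quot;).
_REFERENCE = re.compile(r"&(?:#([0-9]+)|([A-Za-z][A-Za-z0-9.-]*));")


def _expand(match):
    """Replace one reference; unknown entity names are left untouched."""
    number, name = match.group(1), match.group(2)
    if number is not None:
        return chr(int(number))
    return ENTITIES.get(name, match.group(0))


def attribute_value(literal):
    """Interpret a quoted attribute value literal as an attribute value."""
    # Remove the surrounding quotes, if present.
    if len(literal) >= 2 and literal[0] == literal[-1] and literal[0] in "\"'":
        literal = literal[1:-1]
    # Delete literal LF (10); turn literal HT (9) and CR (13) into SPACE.
    literal = literal.replace("\n", "").replace("\t", " ").replace("\r", " ")
    # Replace character and entity references.
    return _REFERENCE.sub(_expand, literal)


if __name__ == "__main__":
    print(attribute_value('"First &quot;real&quot; example"'))
    print(attribute_value('"&auml;"'))
```

The second call is the case from the original question: an ALT value
of "&auml;" should come out as the single character a-umlaut.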
Dan
Received on Friday, 26 April 1996 00:33:21 UTC