
Re: Character entities in ALT text

From: Kathleen Anderson <kathleen.anderson@po.state.ct.us>
Date: Sun, 07 May 2000 16:54:35 -0400
Message-ID: <3915D80B.DEFD914B@po.state.ct.us>
To: Harvey Bingham <hbingham@acm.org>
CC: w3c-wai-eo@w3.org
I've put up a page that gives a clear example of what I ran into with
Google (and Barnes and Noble and Network Solutions). 
If you go to:
you will be presented with a before and after of the Google code. 
The 'before' is the code generated by Google; the 'after' is the code
after I cleaned it up.
I added a W3C validator button for the purposes of the example. 

Kathleen Anderson
State Comptroller's Office
Hartford, Connecticut 06106
voice: (860) 702-3355   fax: (860) 702-3634
e-mail: kathleen.anderson@po.state.ct.us
URL OSC: http://www.osc.state.ct.us/
URL ACCESS: http://www.cmac.state.ct.us/access/
AWARE: http://aware.hwg.org/

Harvey Bingham wrote:
> At 2000-05-04 19:03-0400, Kathleen Anderson wrote:
> >Harvey: ...
> >Just to clarify, though - are you speaking of the code and images
> >supplied by affiliate programs? If so, I have another item for your
> >list. Please encourage them to use '&amp;' instead of '&', which doesn't
> >validate and then I have to correct it, which goes against their terms
> >and conditions (you're not supposed to modify the code they supply).
> >Thanks!
> >
> 1. I appreciate your broadening this suggestion to include delivery of images
> with accompanying alt-text consolidated from any affiliated third parties.
> They needn't be advertisers.
> 2. I believe we in the User Agent and Web Content groups have focused on
> what is delivered to the client user. It is possible that affiliate
> programs are called by the portal application supplying the client.
> In sending what they get on to the user, the portal is responsible for
> supplying the alt-text, including restoration of any character entities
> therein that may have been removed by the XML/HTML parser.
> Kathleen reminds us that tools that do not depend on a prior HTML (or
> XML) parser should check text of attribute values for proper use of
> character entities for otherwise syntactically confusing characters.
> An XML parser normalizes attribute values before passing the value
> of any attribute on to the application by:
>      stripping the surrounding matching pair of single or double quotes,
>      replacing character and entity references with their values,
>      replacing each literal whitespace character (space, tab, carriage
>          return, linefeed) by a single space,
>      and, for attributes declared with a type other than CDATA,
>          discarding leading and trailing spaces and collapsing
>          internal runs of spaces to a single space.
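
The normalization steps listed above can be sketched in a few lines of Python. This is only an illustration, not how any particular parser is implemented: the function name normalize_attribute_value is hypothetical, html.unescape stands in for the parser's reference replacement (it also accepts HTML's named entities, which a plain XML parser would not), and the sketch trims and collapses whitespace unconditionally, whereas real parsers differ in detail, for instance by declared attribute type.

```python
import html
import re


def normalize_attribute_value(raw: str) -> str:
    """Sketch of attribute-value normalization on a quoted literal."""
    # Strip the surrounding matching pair of single or double quotes.
    if len(raw) >= 2 and raw[0] == raw[-1] and raw[0] in "\"'":
        raw = raw[1:-1]
    # Replace character and entity references with the characters they name.
    value = html.unescape(raw)
    # Discard leading and trailing whitespace.
    value = value.strip()
    # Replace each run of internal whitespace by a single space.
    return re.sub(r"\s+", " ", value)


print(normalize_attribute_value('"  AT&amp;T \n  logo "'))  # AT&T logo
```

The application therefore sees "AT&T logo", even though the markup had to spell the ampersand as "&amp;".
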
> For example, use character entities in attribute values, like
>      <img src="attlogo.gif" alt="AT&amp;T logo">
> The XML-recommended minimum set of character entities is:
>      &amp;  rather than "&"
>      &lt;   rather than "<"
>      &gt;   rather than ">"
>      &apos; rather than "'" within a string surrounded by single quotes
>      &quot; rather than '"' within a string surrounded by double quotes
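
Going the other way, a page generator can apply these replacements when it writes the attribute. A minimal sketch, assuming Python's standard html.escape is an acceptable escaper; the helper name img_tag is made up for this example. Note that html.escape writes an apostrophe as the numeric reference "&#x27;" rather than "&apos;".

```python
from html import escape


def img_tag(src: str, alt: str) -> str:
    """Build an <img> element with both attribute values escaped."""
    # quote=True also escapes double and single quotes, so the values
    # are safe inside a double-quoted attribute.
    return f'<img src="{escape(src, quote=True)}" alt="{escape(alt, quote=True)}">'


print(img_tag("attlogo.gif", "AT&T logo"))
# <img src="attlogo.gif" alt="AT&amp;T logo">
```
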
> Also use numeric character references (which give the character's
> Unicode code point) for non-ASCII characters. These have
> either of the forms:
>      decimal     "&#decimal-value;" or
>      hexadecimal "&#xhex-value;"
> For example, the alternatives for "&gt;" are
>      "&#62;"     decimal, or
>      "&#x3e;"    its hex equivalent.
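
The two numeric forms can be derived from a character's code point. A small sketch in Python; numeric_references is a hypothetical helper name, and the "café" string is only an illustration of a non-ASCII character.

```python
from html import unescape


def numeric_references(ch: str):
    """Return the decimal and hexadecimal references for one character."""
    code = ord(ch)  # Unicode code point of the character
    return f"&#{code};", f"&#x{code:x};"


decimal, hexadecimal = numeric_references(">")
print(decimal, hexadecimal)  # &#62; &#x3e;

# Both forms decode back to the same character.
print(unescape(decimal) == unescape(hexadecimal) == ">")  # True

# Non-ASCII text can be reduced to ASCII plus decimal references.
print("café".encode("ascii", "xmlcharrefreplace").decode("ascii"))  # caf&#233;
```
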
> Of course, such character entities should also be used in delivered text
> content, where the parser replaces them before passing the text on to the
> application.
> Note that whitespace normalization in attribute values may change the
> original and that is not supposed to matter for the interpretation or use
> of such values.
> Also note that the local part of some URIs permits some of those characters.
> I believe they need to be interchanged in attribute values as character entity
> references.
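
A query string is the usual place where this comes up: the URI itself separates parameters with a literal "&", but inside an attribute value that "&" is written as "&amp;". A hedged sketch follows; the helper name link_tag, the example.com address, and the parameter names are invented for illustration and are not the affiliate code discussed in this thread.

```python
from html import escape
from urllib.parse import urlencode


def link_tag(base: str, params: dict, text: str) -> str:
    """Build an <a> element whose href carries a query string."""
    # The raw URI uses a literal "&" between query parameters.
    uri = f"{base}?{urlencode(params)}"
    # Inside the attribute value, that "&" must appear as "&amp;".
    return f'<a href="{escape(uri, quote=True)}">{escape(text)}</a>'


print(link_tag("http://www.example.com/search", {"q": "alt text", "hl": "en"}, "Search"))
# <a href="http://www.example.com/search?q=alt+text&amp;hl=en">Search</a>
```

The browser's parser turns "&amp;" back into "&" before following the link, so the request that reaches the server is unchanged.
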
> Regards/Harvey Bingham
Received on Sunday, 7 May 2000 16:56:15 UTC
