Re: An smtp URL scheme

Russell Steven Shawn O'Connor (
Thu, 10 Jul 1997 23:29:04 -0400 (EDT)

Date: Thu, 10 Jul 1997 23:29:04 -0400 (EDT)
From: "Russell Steven Shawn O'Connor" <>
Subject: Re: An smtp URL scheme
In-Reply-To: <v03102838afeb52d43680@[]>
Message-ID: <>

On Thu, 10 Jul 1997, Walter Ian Kaye wrote:

> At 10:55p -0400 07/10/97, Russell Steven Shawn O'Connor wrote:
>  > On Thu, 10 Jul 1997, Walter Ian Kaye wrote:
>  >
>  > >
>  > >    <A HREF=";body=TheBody">
>  > >
>  > > resulted in the following in Eudora 3.1:
>  > >
>  > >    To:
>  > >    Subject: Test&body
>  > >
>  > > because what that does is make it look like "Test&body" is the entire
>  > > value of the "subject=" field.
>  >
>  > Then Eudora is broken.  If you wanted Test&body you would write
>  > <A HREF=""> (I think %26 is right)
> Eudora is not broken. Why should an email app give a hoot about SGML
> entities? It does not apply. Perhaps SGML is broken...

Either the browser or Eudora has to parse the A element.  Whoever parses
it should make the substitutions and then either store the ``fixed'' form
internally or pass the ``fixed'' form to the next program.

>  > > Also, I believe a URL should remain constant no matter what application
>  > > it appears in. Only markup-language applications would even understand
>  > > the &amp; entity -- if I double-click the URL as shown above in Eudora
>  > > (or launch it from *any* other non-browser application), it results in
>  > >
>  > >    Subject: Test&amp;body
>  > >
>  > > which is, of course, incorrect. And this would affect http URL schemes
>  > > as well as mailto schemes, so it's not just an extended-mailto failing.
>  >
>  > Entities are allowed in attributes.  This allows us to do <IMG SRC="foo"
>  > ALT="and then he said &quot;Let it be done&quot; and it was so">, and
>  > similarly <A HREF="foo.html?lang=fran&ccedil;ais">.  This is why & must be
>  > escaped as &amp; (or something equivalent).
> Now you are confusing "allowed" with "must". I know that entities are
> *allowed* in attribute values; I am saying they should be avoided in
> URLs. The easy way to ensure that nothing will be interpreted as such
> is to escape any semicolons appearing in a field value.

HTML is an SGML application.  First of all, semicolons are not always
necessary.  For example, <IMG SRC="foo" ALT="and then he said &quot;Let it
be done&quot and it was so"> is still correct.  Secondly, as an SGML
application, an & symbol in a CDATA attribute value ALWAYS indicates the
beginning of an entity.  This allows browsers to more easily parse the
HTML (and SGML in general).  If an author creates a document with an &
symbol before something that is not an entity, then this is an error.  It
is up to the browser (or whatever program is interpreting the HTML) to do
its best to interpret it.
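
To illustrate the point about the optional semicolon, here is a minimal
Python sketch.  It relies on the standard library's html.unescape, which
follows HTML5 entity rules rather than a real SGML parser, so treat it as
an approximation of what a browser might do and not as a statement of what
SGML requires.  The variable names are made up for the example.

```python
from html import unescape

# The ALT text from the example above, once with both semicolons and
# once with the second semicolon left off.
alt_with_semicolons = "and then he said &quot;Let it be done&quot; and it was so"
alt_missing_one = "and then he said &quot;Let it be done&quot and it was so"

# Decode the entity references the way an HTML5-style parser would.
decoded_with = unescape(alt_with_semicolons)
decoded_missing = unescape(alt_missing_one)

print(decoded_with)
print(decoded_missing)
```

Under those assumptions both forms should decode to the same text.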

If the browser is passing an attribute to another program, the browser
must perform the appropriate substitutions: turning &quot; into ",
&amp; into &, and so on.
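
Roughly, the substitution step could look like the Python sketch below.
This is only an illustration under stated assumptions: the address
someone@example.org and the function name href_to_mail_fields are
hypothetical (they do not come from this thread), and html.unescape and
urllib.parse stand in for whatever parsing a real browser or mail client
would do.

```python
from html import unescape
from urllib.parse import urlsplit, parse_qs


def href_to_mail_fields(raw_href):
    """Turn the raw text of an HREF attribute into a URL and its fields.

    raw_href is the attribute value exactly as written in the HTML
    source, with entity references such as &amp; still present.
    """
    # Step 1: the HTML layer.  Replace entity references, so that
    # &amp; becomes a plain & before anyone treats this as a URL.
    url = unescape(raw_href)

    # Step 2: the URL layer.  Split off the query and decode %XX
    # escapes, so that %26 becomes a literal & inside a field value.
    parts = urlsplit(url)
    fields = {name: values[0] for name, values in parse_qs(parts.query).items()}
    return url, parts.path, fields


# Hypothetical address; the query mirrors the subject/body example.
two_fields = href_to_mail_fields(
    "mailto:someone@example.org?subject=Test&amp;body=TheBody")

# A literal ampersand inside the subject is written as %26.
one_field = href_to_mail_fields(
    "mailto:someone@example.org?subject=Test%26body")

print(two_fields)
print(one_field)
```

The point of doing it in two steps is that the program receiving the
URL never sees &amp; at all; it only has to understand URL escaping.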

Russell O'Connor            |    
"And truth irreversibly destroys the meaning of its own message"
-- Anindita Dutta, "The Paradox of Truth, the Truth of Entropy"