
Re: Summary: Amaya mishandling HREF values that contain ampersand

From: Ewan Mellor <ewanmellor@hotmail.com>
Date: Sat, 10 Jul 1999 13:10:32 BST
Message-ID: <19990710121032.59880.qmail@hotmail.com>
To: www-amaya@w3.org
>From: Irene.Vatton@inrialpes.fr
>
>In-reply-to: Your message of Thu, 08 Jul 1999 18:35:52 +0200."
>              <0FEK00DCZ8RTLB@cuimail.unige.ch>
> > jose.kahan@w3.org said:
> > > If we're making a local file access, we'll then first convert the %26
> > > into '&' then do search the file
> >
> > I guess you're assuming that, in this case, the '&' will be part of a file
> > or directory name, not a field separator in the <searchpart>. Indeed, if you
> > look at RFC 1738, section 5 (BNF for specific URL schemes), you find:
>
>Obviously yes.
>
> >   fileurl        = "file://" [ host | "localhost" ] "/" fpath
> >
> > and
> >
> >   fpath          = fsegment *[ "/" fsegment ]
> >   fsegment       = *[ uchar | "?" | ":" | "@" | "&" | "=" ]
> >
> > In addition, section 3.10, describing the file URL scheme, says:
> >
> >  "A file URL takes the form:
> >
> >     file://<host>/<path>
> >
> >   ...   <path> is a hierarchical
> >    directory path of the form <directory>/<directory>/.../<name>."
> >
> > I interpret this as meaning that a file URL can contain unreserved characters
> > or escaped characters or any of '?', ':', '@', '&' and '='. All these
> > characters will be considered as part of a directory name or a file name.
>
>When a URL which contains a '&' is stored within a HREF attribute it has to be
>encoded either by &amp; or by %26.

It seems to me that it would be smarter to use &amp; rather than %26.
Since it is the HTML (SGML) rules that force an escape here (you may
not include a bare ampersand in the HREF), we should use the SGML escaping
mechanism (entities) to make the attribute valid.  The value would then
go through the SGML unescaping mechanism and come out as a valid
file URL, i.e. one containing a bare ampersand.  Escaping it as %26 would
mean using the URL escaping rules to avoid a problem caused by the
SGML syntax.  I believe that this would work in this particular case, but
it seems to me to be a bad idea, liable to cause confusion at a later
date.
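
To make the two layers concrete, here is a rough sketch in Python (not
Amaya's code; the file path, the HrefCollector class and the hrefs_in
helper are made up for illustration). It parses an anchor written each way,
shows what the HREF value looks like after the markup layer has resolved
entities, and then what it looks like after URL unescaping for a local file
access.

```python
from html.parser import HTMLParser
from urllib.parse import unquote


class HrefCollector(HTMLParser):
    """Hypothetical helper: collects HREF values as the markup parser sees them."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # Attribute values arrive here with entity references already resolved.
        for name, value in attrs:
            if name == "href":
                self.hrefs.append(value)


def hrefs_in(markup):
    """Return the HREF values found in a fragment of markup."""
    parser = HrefCollector()
    parser.feed(markup)
    parser.close()
    return parser.hrefs


# Illustrative path only: a directory whose name contains '&'.
entity_form = '<a href="file:///docs/R&amp;D/notes.html">notes</a>'
percent_form = '<a href="file:///docs/R%26D/notes.html">notes</a>'

# Layer 1: markup (SGML/HTML) unescaping.
url_from_entity = hrefs_in(entity_form)[0]    # '&amp;' has become '&'
url_from_percent = hrefs_in(percent_form)[0]  # '%26' is untouched

# Layer 2: URL unescaping, as done before looking up a local file.
path_from_entity = unquote(url_from_entity)
path_from_percent = unquote(url_from_percent)

print(url_from_entity)
print(url_from_percent)
print(path_from_entity == path_from_percent)
```

With &amp; the markup layer alone yields the URL with a bare ampersand,
so the escape is undone by the same layer that required it.  With %26 the
markup layer passes the value through unchanged and it is the URL layer
that has to undo it; both end up at the same file name, which is why it
works here.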

Ewan.


Received on Saturday, 10 July 1999 08:10:35 UTC
