Re: <resouce>, links, and addresses [was: W3C and Handle technology ]

Foteos Macrides (MACRIDES@SCI.WFBR.EDU)
Fri, 13 Sep 1996 19:34:29 -0500 (EST)


Date: Fri, 13 Sep 1996 19:34:29 -0500 (EST)
From: Foteos Macrides <MACRIDES@SCI.WFBR.EDU>
Subject: Re: <resouce>, links, and addresses [was: W3C and Handle technology ]
To: galactus@htmlhelp.com
Cc: www-html@w3.org
Message-id: <01I9FVTEFPPE004X6S@SCI.WFBR.EDU>

galactus@htmlhelp.com (Arnoud "Galactus" Engelfriet) wrote:
>In article <9609121218.ZM28046@gaia.ckm.ucsf.edu>,
>"Marc Salomon" <marc@ckm.ucsf.edu> wrote:
>> Arnoud "Galactus" Engelfriet:
>> |<P ID=someid>
>> |<A NAME=someid>The</A> first example we will consider is...
>> |</P>
>> 
>> Special cases like these are why architectural specifications not relating
>> to character issues should probably not be done in an i18n draft, unless
>> there is some other compelling reason that I'm missing...
>
>Agreed. I was just wondering where a browser that supports both
>methods would jump to if asked for "#someid".

 	They are not two fundamentally distinct "methods".  They are
two forms of markup for specifying the same thing -- a named anchor.
In HTML 3.0 ID was to replace NAME for A as well (with a recommendation
to recognize NAME as a synonym for ID, for "backward" compatibility
with "historical" clients).  Your question in effect is "What error
recovery will clients use for that invalid markup (non-unique name
tokens)?".  The answer will be client specific.  Lynx, for example,
recognizes ID attributes for virtually all tags, as well as NAME
attributes for the tags which traditionally used them, and handles
them as named anchors for all tags in which they can correspond to
a position in the body of the document.  It guarantees uniqueness of
the tokens in a document by ignoring any repetitions.  Another client
might replace the earlier anchor with the repetition.  I don't think
either client would be "right" or "wrong" (though if one is, it's
probably Lynx :).
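
	To make the two recovery strategies concrete, here is a minimal
sketch in Python.  It is not Lynx's code; the class name AnchorCollector,
the function resolve() and the "first"/"last" switch are invented for
illustration, and it assumes NAME counts as an anchor only on A.

```python
from html.parser import HTMLParser


class AnchorCollector(HTMLParser):
    """Collect named anchors from ID (any tag) and NAME (on A only)."""

    def __init__(self, keep="first"):
        super().__init__()
        self.keep = keep      # "first" ignores repeats, "last" overwrites
        self.anchors = {}     # token -> (tag, (line, column))

    def handle_starttag(self, tag, attrs):
        # HTMLParser hands us tag and attribute names in lower case.
        for attr, value in attrs:
            is_anchor = attr == "id" or (attr == "name" and tag == "a")
            if not is_anchor or not value:
                continue
            if self.keep == "first" and value in self.anchors:
                continue      # repetition of an existing token: ignore it
            self.anchors[value] = (tag, self.getpos())


def resolve(markup, fragment, keep="first"):
    """Return the tag that "#fragment" would jump to, or None."""
    collector = AnchorCollector(keep)
    collector.feed(markup)
    collector.close()
    entry = collector.anchors.get(fragment)
    return entry[0] if entry else None


MARKUP = (
    "<P ID=someid>\n"
    "<A NAME=someid>The</A> first example we will consider is...\n"
    "</P>"
)

print(resolve(MARKUP, "someid", keep="first"))   # the P wins
print(resolve(MARKUP, "someid", keep="last"))    # the A wins
```

	Both strategies are defensible error recovery for the same invalid
markup, which is why the jump target ends up client specific.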

	The "compelling reason" that Marc missed is that ID designates
named anchors in all relevant IETF RFCs this year, most notably in
RFC1942 (TABLEs), as was stated in the section of i18n that I cited.
Any client which does not treat ID attributes in any of the table
tags as named anchors is not compliant with RFC1942 (Lynx doesn't
construct tables, but recognizes all the RFC1942 tags and any ID
attributes they have as named anchors, so any HREF="#foo" for them
will take you to wherever their text ended up in the document. :)


On the Subj: RE: Extended URL for frames
galactus@htmlhelp.com (Arnoud "Galactus" Engelfriet) also wrote:
>In article <199609131121.EAA29460@switzerland.it.earthlink.net>,
>"David Perrell" <davidp@earthlink.net> wrote:
>> I've noticed a hurdle for this proposal: although fragments are
>> described vaguely as 'not considered part of the URL,' '#' is declared
>> an 'unsafe' character in RFC1808, and as such is proscribed for use
>> within fragments. Too bad - substitution of another character would
>> make for a less elegant construct.
>
>It's RFC 1738 I think (1808 is just relative URLs), but the reason
>that # is specified as unsafe is because it's used to delimit URLs
>and fragments. Having a filename with a # in it would lead to 
>confusion if it wasn't escaped.
>
>However, if the frame sequence comes *after* a fragment, it shouldn't
>be a problem.

	Same deal.  If you add unescaped '#' characters beyond the valid
fragment delimiter, you're counting on client-specific parsing behavior
and/or error recovery.  In the "test", MSIE did what was desired, and NS
didn't.  Was one "right" and the other "wrong"?  Some clients don't get
tripped up by invalidly interdigitated container tags.  Would you say, now,
that it shouldn't be a problem?
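
	As a rough illustration of where the ambiguity comes from, the
sketch below splits at the first '#' only and flags any further unescaped
'#' in what is left.  The function names and the example URL are made up;
splitting at the first '#' is just one plausible client behavior, and
escaping the extras as %23 only makes the string well formed, it does not
make any client interpret a frame sequence.

```python
def split_fragment(url):
    """Split at the first '#'; report whether the fragment holds more."""
    base, _sep, fragment = url.partition("#")
    has_extra_hash = "#" in fragment
    return base, fragment, has_extra_hash


def escape_extra_hashes(url):
    """Escape every '#' after the first as %23."""
    base, sep, fragment = url.partition("#")
    return base + sep + fragment.replace("#", "%23")


# Hypothetical URL with a second, unescaped '#' after the fragment.
print(split_fragment("doc.html#sec1#frame2"))
print(escape_extra_hashes("doc.html#sec1#frame2"))
```

	A client that instead splits at the last '#', or that rejects the
string outright, would be doing equally legitimate error recovery.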

				Fote

=========================================================================
 Foteos Macrides            Worcester Foundation for Biomedical Research
 MACRIDES@SCI.WFBR.EDU         222 Maple Avenue, Shrewsbury, MA 01545
=========================================================================