
Re: [Fwd: Re: [xml-dev] creating a URI class]

From: Al Gilman <asgilman@iamdigex.net>
Date: Sun, 17 Feb 2002 11:25:38 -0500
Message-Id: <200202171625.LAA1534746@smtp1.mail.iamworld.net>
To: "Simon St.Laurent" <simonstl@simonstl.com>, uri@w3.org
Cc: elharo@metalab.unc.edu
** summary:

URIs and URI-references are different classes.  Each has its application.

Fragments are resource-representation-type-defined.  Comparing #fragment
subexpressions "pointing into" resources without first settling the [presumed]
representation type under which one is comparing them is erroneous.

URI semantics is federal.  The URI class defers lots of things such as case
sensitivity to the several schemes.  The scheme may defer still further.  Don't
look for URI-wide rules as to when tokens coughed up by the URI-common syntax
are part of case-insensitive namespaces.  The rules already tell you it 'taint
so.

** details

At 03:06 AM 2002-02-17 , Martin Duerst wrote:
>Fragments are case-sensitive. 

Fragments are not part of a URI.  The Java class is comparing URI-references,
in the language of the URI RFC.

For an example where comparing URI-references would be wrong and comparing URIs
proper would be right, consider lookup in a cache.  Here if you have retrieved
a resource, regardless of the #fragment subexpression in the original or
current URI-reference, the cached copy is a candidate to satisfy yet another
request.
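
A minimal sketch of the cache case above in Java, assuming a toy in-memory
cache.  The names FragmentFreeCache, cacheKey, put and lookup are hypothetical
and not part of java.net.URI; only URI.create, toString and getRawFragment are
real library calls.  The key is the URI proper, that is, the URI-reference with
its #fragment cut off.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Map;

// Hypothetical toy cache keyed on the URI proper, i.e. the URI-reference
// with any #fragment subexpression removed.
final class FragmentFreeCache {
    private final Map<URI, String> store = new HashMap<>();

    // Strip the fragment textually so the rest of the reference is untouched.
    static URI cacheKey(URI reference) {
        if (reference.getRawFragment() == null) {
            return reference; // no fragment present, nothing to strip
        }
        String text = reference.toString();
        return URI.create(text.substring(0, text.indexOf('#')));
    }

    void put(URI reference, String body) {
        store.put(cacheKey(reference), body);
    }

    // Returns the cached body, or null when nothing has been stored.
    String lookup(URI reference) {
        return store.get(cacheKey(reference));
    }
}
```

With this keying, a body stored under .../doc.html#intro is also a candidate
for a later request for .../doc.html#summary, which is the point of comparing
URIs rather than URI-references here.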

Going the other way, there are operators of HTTP servers today who try
to discourage 'deep linking'; that is to say, they do not offer that all
'Location' values passing the lips of their server are fit to keep around as
persistent references.  Something closer in link trail metric to the home page
(root of the path tree at that domain name) should be used for a session start
at another time or by another user.  For these sites it is not safe to ask "Are
these URIs equal?" without also asking the hierarchical precursor "Are the
domains of these URIs equal?"

Some sites want to offer you a valedictory message when you leave.  The
rational implementation for this is an event thrown in the client based on a
structured compare of URIs activated.  Not the way it is done today.  The
exit-message-location would be a property of a scope defined in URI space.  Not
ECMAScript guards on individual URI-references embedded in each page issued
from the fare-well-wishing scope.

In any compare of URI-references, if they are not found to be equal on cursory
inspection, what to do next is not necessarily a hard failure; and a structured
analysis of what is the same and what is different is an animal of some
interest.
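
One way such a structured analysis might look, sketched in Java.  UriDiff,
Part and differingParts are hypothetical names, and the comparison is a plain
case-sensitive string compare per component, so it deliberately applies no
scheme-specific rules.  It assumes hierarchical URI-references; for opaque
ones java.net.URI reports no authority, path or query at all.

```java
import java.net.URI;
import java.util.EnumSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical helper: reports WHICH components of two URI-references
// differ, instead of collapsing the answer into one boolean.
final class UriDiff {
    enum Part { SCHEME, AUTHORITY, PATH, QUERY, FRAGMENT }

    // Plain string comparison per component; undefined components are null
    // and compare equal only to another undefined component.
    static Set<Part> differingParts(URI a, URI b) {
        Set<Part> diff = EnumSet.noneOf(Part.class);
        if (!Objects.equals(a.getScheme(), b.getScheme())) diff.add(Part.SCHEME);
        if (!Objects.equals(a.getRawAuthority(), b.getRawAuthority())) diff.add(Part.AUTHORITY);
        if (!Objects.equals(a.getRawPath(), b.getRawPath())) diff.add(Part.PATH);
        if (!Objects.equals(a.getRawQuery(), b.getRawQuery())) diff.add(Part.QUERY);
        if (!Objects.equals(a.getRawFragment(), b.getRawFragment())) diff.add(Part.FRAGMENT);
        return diff;
    }
}
```

A caller can then decide for itself that, say, a difference confined to the
fragment still counts as the same resource, or that equal authorities are
enough to stay inside one site's scope.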

Fragments pointing into HTML resource representations are case-sensitive. 
Fragments pointing into XML resource representations, while just what they may
and may not be is still a matter of controversy, will almost certainly be case
sensitive.

This does not amount to a basis for the blanket statement that Martin made.

There's room for an exception.

>Everything in URIs is case-sensitive unless otherwise stated.

Somewhere.  But not necessarily stated in URI-universal documents.

>Domain names are case insensitive, not only in http.

Because the URIs that include them are built on the DNS naming practices, which
yield a case-insensitive space.

For another example, newsgroup names are used in 'news' URIs.  These should be
considered to be case-insensitive to agree with the definition of this
namespace in RFC-977.
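
A rough Java sketch of what deferring to the scheme could look like.
SchemeAwareEquality, RULES and equivalent are hypothetical names.  The 'news'
rule encodes only the claim made above (newsgroup names form a
case-insensitive namespace) and ignores the message-id form of news URIs;
every other scheme falls back to java.net.URI.equals, which already compares
scheme and host without regard to case.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.function.BiPredicate;

// Hypothetical registry of per-scheme equality rules layered over URI.equals.
final class SchemeAwareEquality {
    private static final Map<String, BiPredicate<URI, URI>> RULES = new HashMap<>();

    static {
        // Assumption from the message: newsgroup names are case-insensitive.
        // Simplification: treats every news URI as a newsgroup name and
        // leaves any fragment out of the comparison.
        RULES.put("news", (a, b) ->
            a.getRawSchemeSpecificPart().equalsIgnoreCase(b.getRawSchemeSpecificPart()));
    }

    static boolean equivalent(URI a, URI b) {
        String sa = a.getScheme();
        String sb = b.getScheme();
        if (sa == null || sb == null || !sa.equalsIgnoreCase(sb)) {
            return a.equals(b); // no shared scheme, so no scheme rule applies
        }
        BiPredicate<URI, URI> rule = RULES.get(sa.toLowerCase(Locale.ROOT));
        return rule != null ? rule.test(a, b) : a.equals(b);
    }
}
```

The design choice is the federal one: the common class knows only the common
syntax, and each scheme registers whatever case rules its own namespace has.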

YMMV 

In the land of SOAP [no radio] at least in our dreams all this will be
case-sensitive, international-safe symbol spaces.

URIs roll up all the legacy of RFC-822 and its contemporaries.  URI language is
a camel of a language in that regard.  Or shall I say, it organizes [syntactic
differentiation] the babel into a tower, but doesn't save you learning the
several sublanguages assembled into the tower.

Al

>
>Regards,    Martin.
>
>At 18:27 02/02/16 -0500, Simon St.Laurent wrote:
>>I'm curious whether this URI class (part of Java 1.4) really passes
>>muster.  In particular, I'm wondering about whether its equals() method
>>is true to the different notions of equality in the different schemes.
>>
>><http://java.sun.com/j2se/1.4/docs/api/java/net/URI.html>
>>
>>Any thoughts?
>>
>>
>>-----Forwarded Message-----
>>
>>From: Simon St.Laurent <simonstl@simonstl.com>
>>To: Elliotte Rusty Harold <elharo@metalab.unc.edu>
>>Cc: xml-dev@lists.xml.org
>>Subject: Re: [xml-dev] creating a URI class
>>Date: 16 Feb 2002 18:24:11 -0500
>>
>>On Sat, 2002-02-16 at 17:02, Elliotte Rusty Harold wrote:
>> > FYI, there is a java.net.URI class in Java 1.4. You might just want
>> > to use that, and even if you don't you could learn from it. See
>> >
>> >
>> > <http://java.sun.com/j2se/1.4/docs/api/java/net/URI.html>
>>
>>Thanks!  1.3 is currently my target JDK (and will be for a while if I
>>shift to a Mac for development), but this is interesting.  I'm
>>especially curious how the equals() method works:
>>
>><http://java.sun.com/j2se/1.4/docs/api/java/net/URI.html#equals(java.lang.Object)>
>>
>>-----------------
>>  For two URIs to be considered equal requires that either both are
>>opaque or both are hierarchical. Their schemes must either both be
>>undefined or else be equal without regard to case, and similarly for
>>their fragments.
>>
>>For two opaque URIs to be considered equal, their scheme-specific parts
>>must be equal.
>>
>>For two hierarchical URIs to be considered equal, their paths must be
>>equal and their queries must either both be undefined or else be equal.
>>Their authorities must either both be undefined, or both be
>>registry-based, or both be server-based. If their authorities are
>>defined and are registry-based, then they must be equal. If their
>>authorities are defined and are server-based, then their hosts must be
>>equal without regard to case, their port numbers must be equal, and
>>their user-information components must be equal.
>>-------------------
>>
>>In particular, I'm curious whether fragments are case-insensitive, and
>>some schemes (like HTTP) regard case as insignificant in the domain
>>name.  Hmmm... maybe I'll post this to uri@w3.org.
>>
>>--
>>Simon St.Laurent
>>Ring around the content, a pocket full of brackets
>>Errors, errors, all fall down!
>>http://simonstl.com
Received on Sunday, 17 February 2002 11:25:47 UTC
