Re: [URN] Re: URI documents

Larry Masinter
Wed, 7 Jan 1998 00:03:46 PST

Date: Wed, 7 Jan 1998 00:03:46 PST
From: Larry Masinter
To: Patrik Fältström
CC: Harald Tveit Alvestrand
Subject: Re: [URN] Re: URI documents

Patrik Fältström wrote:
> At 12:59 1998-01-06 +0100, Harald Tveit Alvestrand wrote:
> >- The class of identifiers that, roughly speaking, start with
> >  a short string and a colon, and go on in a charset-limited way.
> >  All the URI axioms you cite are axioms of that class.
> >- The class of identifiers that, in addition to being of the first
> >  class, obey certain additional rules, such as hierarchy,
> >  hostname representation and so on.
> >  None of this is necessary for the URI axioms; they are vitally
> >  necessary for today's day-to-day usage of the World Wide Web.
> >
> >
> >If this is the case, we have more issues:
> >
> >- Is the #fragment rule a "type 1", "type 2" or "none of the above, but
> >  should be mentioned in both places"?
> It depends on if you talk about the syntax (using the octet with value '#'
> in US-ASCII as a special octet in the URI sequence) or if you talk about
> the functionality. I.e. the conclusion is that it has to be mentioned in
> both. The character '#' is special in the URI syntax, and must be treated
> as such for all URI schemes. The argument is that it is (as it is in RFC
> 1730 if I am not mistaken) to be used as a fragment specifier. In the URL
> syntax paper one can more definitely talk about what a fragment specifier
> is, and how it is to be treated for URLs (if it is the fact that this is
> something that _has_ to be treated exactly the same way for all URL schemes).
> I.e. the syntax is one thing, and the "semantic interpretation" of the
> octet is something different when found in a URI sequence (which, as
> mentioned in the character set thread started by Larry, is something
> different (maybe) from the "character in the URI").

If we just change the *title* of draft-fielding-uri-syntax-XX
and remove the word "Semantics", it might make things clearer.
The only normative part of the specification is the definition
of the syntactic processing. There's some general advice about
how schemes might define semantics, too, but that advice is not
part of what the draft defines.

I should point out that the syntax (and any scheme-specific semantics)
are assigned to the character sequence, not to any octet sequence.
In fact, the mapping of character sequences to octet sequences is
part of the semantics that a scheme specifies. That's the reason
why some schemes might employ different encoding mechanisms than

If we attempted to remove any indication that the URI document did
anything more than specify the syntax of URIs and how that syntax
should be processed by URI-processing software, without any semantic
interpretation of the *meaning*, do you think we could get beyond
the current impasse?
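
To make the syntax-only view concrete, here is a rough sketch of what
scheme-independent processing might look like. It works on the
character sequence, treats '#' as special for every scheme, and
leaves the mapping from characters to octets to a separate step that
a scheme would supply. The names split_uri, SplitURI and to_octets
are hypothetical and do not come from the draft, and choosing a
charset by name is only an assumed stand-in for whatever mapping a
scheme actually specifies.

```python
from typing import NamedTuple, Optional


class SplitURI(NamedTuple):
    scheme: Optional[str]      # short string before the first ':', if any
    scheme_specific: str       # everything between the scheme and '#'
    fragment: Optional[str]    # text after the first '#', or None if absent


def split_uri(text: str) -> SplitURI:
    """Split a URI reference on characters only; no scheme semantics applied."""
    # '#' is special for all schemes: the first one ends the URI proper.
    rest, hash_sep, frag = text.partition("#")
    fragment = frag if hash_sep else None

    # A scheme is a short, charset-limited string followed by a colon.
    head, colon, tail = rest.partition(":")
    looks_like_scheme = (
        bool(colon)
        and bool(head)
        and head.isascii()
        and head[0].isalpha()
        and all(c.isalnum() or c in "+-." for c in head)
    )
    if looks_like_scheme:
        return SplitURI(head.lower(), tail, fragment)
    return SplitURI(None, rest, fragment)


def to_octets(chars: str, scheme_encoding: str) -> bytes:
    """Assumed stand-in for a scheme-defined character-to-octet mapping."""
    return chars.encode(scheme_encoding)
```

With this split, split_uri("urn:isbn:0451450523") gives scheme "urn"
and no fragment without knowing anything about URNs, while the same
character "ä" maps to different octets under to_octets depending on
which encoding the scheme is assumed to pick.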