- From: Sandro Hawke <sandro@w3.org>
- Date: Wed, 23 Apr 2003 16:31:41 -0400
- To: uri@w3c.org
- Cc: pat hayes <phayes@ai.uwf.edu>
Okay, I'm taking a stab at some new text for the introduction to
2396bis. I haven't worked out the exact 'diff' details but I'm happy
to if it comes to that. This text goes somewhere around section 1.1,
but if this slight reframing is accepted, I think some later sentences
will have to be rephrased.
I bravely/naively think this roughly meets the needs of all comers
(including the newcomers-to-this-list like Pat Hayes), and I'm sure
people will be happy to tell me where I'm wrong. If this addresses
your needs better than the current text, a brief comment to that
effect would be appreciated.
===============================
1. Being a URI
A URI is a string which conforms to the URI syntax given in this
document. This restricted syntax serves several purposes:
a. It excludes certain characters. This allows
systems to use those characters to delimit URIs.
b. It defines each URI as beginning with a "scheme"
name followed by a colon. This allows independent
development and deployment of systems which offer
URIs additional semantics and functionality.
c. It defines a few characters (including "/") to have
special "hierarchical" meaning, to allow for "relative"
URIs.
d. It defines a character-escape mechanism (using "%")
to allow special characters (like "/") to be used as
normal characters, without their special URI meaning.
e. It keeps URIs syntactically distinct from some other
short, formally-specified strings, so they can sometimes
be intermixed or used to flag an extension to a protocol
or when "webifying" systems.
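[[ A rough illustration of points a through d, as a sketch only. It
uses Python's standard urllib.parse module; the example.org URIs and
the angle-bracket delimiting convention are assumptions picked for the
example, not something this text requires. ]]

```python
import re
from urllib.parse import quote, unquote, urljoin, urlsplit

# (a) Characters excluded from URIs (here "<", ">" and whitespace) can
#     delimit URIs embedded in running text.
text = "See <http://example.org/a> and <mailto:user@example.org>."
found = re.findall(r"<([^<>\s]+)>", text)

# (b) Every URI begins with a scheme name followed by a colon.
scheme = urlsplit("http://example.org/a/b/c?x=1").scheme

# (c) "/" has hierarchical meaning, so a relative reference can be
#     resolved against a base URI.
resolved = urljoin("http://example.org/a/b/c", "../d")

# (d) "%" escaping lets "/" appear as ordinary data rather than as a
#     hierarchy separator.
escaped = quote("a/b", safe="")
restored = unquote(escaped)

print(found, scheme, resolved, escaped, restored)
```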
2. The Identification Function (RIF)
There is a single relation, called the "RFC 2396bis
identification function" ("RIF"), which maps from each URI to
exactly one thing at any point in time. Some shared knowledge of
this relation is essential to communication using URIs, but
complete shared knowledge is rarely possible. The central efforts
and standards related to URIs concern techniques for sharing
knowledge of this relation sufficient for particular applications.
a. We call the objects in the range (set of possible
output values) of RIF "resources". This term is not
intended to exclude anything and the range of RIF is in
no way restricted. Every person, place, event, physical
object, imaginary character, ... anything and everything
is in the range of RIF and is technically a resource --
but calling something a "resource" suggests that it is
likely to be identified by a URI in practice. For
example, the integer zero is technically a resource
(since everything is a resource), but calling it a
resource would be misleading outside of a context where
URIs were actually being used to denote integers.
[[ Thus RDF bNodes and literals can be said to identify
resources, even if there is no URI in use, because
assigning a URI would be reasonable and may happen
automatically in some software. ]]
b. Resources can be further divided into "bound
resources" and "unbound resources". Bound resources are
in the image of RIF (the set of values RIF actually
takes); they are in fact identifiable through RIF from
some URI. Unbound resources are not in the image of RIF
and cannot be identified through the RIF mapping from
any URI. Not all resources can be
bound because there are more resources than URIs. Since
it may not be possible to know whether a given resource
is bound, the boundness distinction should be used with
care, or used with respect to a particular URI scheme as
in, "Since my new book does not yet have an ISBN, it is
an unbound resource with respect to the isbn: scheme."
[[ That's trying to address Mike Mealling's requirement.
http://lists.w3.org/Archives/Public/uri/2003Apr/0055 ]]
c. Elements of RIF SHOULD NOT change over time, since
such changes will render shared knowledge false until
corrected. If changes do occur, it is sometimes said
"the resource has moved", and appropriate notifications
and forwarding SHOULD be made. The term "moved"
suggests that a URI is a location for a resource, and
this is a common metaphor, but it is only a metaphor.
Information changing over time can be handled without
changing RIF through various techniques such as having
the resource itself be a function mapping the current
time to a resulting value.
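[[ A toy model of points b and c, as a sketch only. The dictionary
"rif" stands for one party's partial knowledge of RIF, not RIF itself;
"front_page", "is_bound", and the URIs are hypothetical names invented
for the example. ]]

```python
from datetime import datetime, timezone
from typing import Callable, Dict


def front_page(now: datetime) -> str:
    """A time-varying resource modelled as a function of the current time.

    The resource is the function itself, not any single value it returns.
    """
    return f"Headlines for {now.date().isoformat()}"


# Partial, local knowledge of RIF: URI string -> resource (assumed data).
rif: Dict[str, object] = {
    "http://example.org/news": front_page,
    "urn:example:zero": 0,
}


def is_bound(resource: object, known_rif: Dict[str, object]) -> bool:
    """True if some URI in the known part of RIF maps to this resource."""
    return any(r == resource for r in known_rif.values())


t1 = datetime(2003, 4, 23, tzinfo=timezone.utc)
t2 = datetime(2003, 4, 24, tzinfo=timezone.utc)
resource: Callable[[datetime], str] = front_page

# The mapping from URI to resource is the same at both times; only the
# value the resource yields differs.
print(rif["http://example.org/news"] is resource)
print(resource(t1), "|", resource(t2))
print(is_bound(0, rif), is_bound("my unpublished book", rif))
```

Under this reading the entry in "rif" never changes, so nothing has
"moved" even though the headlines differ from day to day.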
3. URI-Scheme Languages
In addition to serving as an argument to RIF and
thereby identifying a resource, each URI MAY contain encoded
(serialized) information. The syntax and semantics of the encoding
language are determined by the normative specification registered
with IANA for the scheme name and MUST be subordinate to the syntax
and semantics given in this document.
a. Scheme languages SHOULD be declarative in nature,
with the URI text conveying knowledge either directly or
indirectly about the identified resource. An example of
a direct assertion is the "data" scheme, where the URI
text fully describes the identified resource. An
example of an indirect assertion is the "http" scheme,
where the URI text conveys the network address of a
server which can communicate on behalf of or about the
resource.
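[[ The direct/indirect contrast can be sketched as follows. The helper
names are hypothetical, the data: handling covers only plain,
non-base64 payloads, and the default port of 80 is an assumption about
the http scheme made for the example. ]]

```python
from typing import Tuple
from urllib.parse import unquote, urlsplit


def decode_data_uri(uri: str) -> str:
    """Direct assertion: the URI text itself carries the content."""
    scheme, _, rest = uri.partition(":")
    if scheme.lower() != "data":
        raise ValueError("not a data: URI")
    # Everything before the first comma is the (possibly empty) media type.
    _mediatype, _, payload = rest.partition(",")
    return unquote(payload)


def http_server_address(uri: str) -> Tuple[str, int]:
    """Indirect assertion: the URI text names a server to ask."""
    parts = urlsplit(uri)
    if parts.scheme.lower() != "http" or parts.hostname is None:
        raise ValueError("not an http: URI with a host")
    return parts.hostname, parts.port or 80


content = decode_data_uri("data:,Hello%2C%20World")
address = http_server_address("http://example.org:8080/doc")
print(content, address)
```

Neither helper touches RIF: both only decode the scheme-specific
language, which is a separate operation from identifying the resource.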
=================================
That's it. IMHO it nicely refactors some tricky issues, but I surely
can't claim to understand them all. Probably the biggest thing is
pulling RIF out of the intrinsic nature of URIs and being explicit
about it. Also I think it's important to be clear that using a URI
as an argument to RIF is almost totally different from decoding it
according to some scheme-specific language.
-- sandro
Received on Wednesday, 23 April 2003 16:31:45 UTC