Re: The UR* scheme registry, Citing URL/URI specs

Dan Connolly (connolly@w3.org)
Fri, 24 Oct 1997 18:11:44 -0500


Message-ID: <34512B30.7142@w3.org>
To: Keith Moore <moore@cs.utk.edu>
CC: Larry Masinter <masinter@parc.xerox.com>,

Keith Moore wrote:
Larry wrote:
> > I think that's a good summary of the situation. HTML and XML
> > can say they use URIs, and then point to a W3C note that
> > says "A URI is defined by IETF, currently it points to URLs,
> > and there is some work on URNs".
> 
> This sounds reasonable.

:-{

OK. If that's what you want me to do, that's what I'll do.


> This discussion reminds me of the discussion about the use
> of the term "charset".  I18N experts want to use different
> terms: "character set", "character encoding scheme", and so
> forth, because they're very concerned about the differences
> between these.  MIME has its own notion of "charset"
> which isn't quite either of the above.  Most people can
> use terms "charset" or "character set" without needing
> or caring about such precision, and without being misunderstood.
> (unless they're talking to an expert...)
> 
> Note, however that most technical specifications aren't written
> for "most people" ... they're written for experts.  Someone
> implementing HTML may well need to know the difference between
> URLs and URNs -- or at least that URLs aren't the only kind of
> URI.

Hmmm... there are set-theoretic definitions of the terms
"coded character set" and "character encoding scheme", i.e.:

	coded character set:
		a function that maps integers to characters
	character encoding scheme:
		a function that maps sequences of octets
		to sequences of characters
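
As a rough sketch of those two definitions in Python (the function
names are mine and purely illustrative; I'm assuming Unicode as the
coded character set, with the built-ins chr() and bytes.decode()
standing in for the two kinds of mapping):

```python
def coded_character_set(code: int) -> str:
    """Coded character set: maps one integer to one character.

    Assumption: Unicode is the coded character set, so chr() does the lookup.
    """
    return chr(code)


def character_encoding_scheme(octets: bytes, charset: str = "utf-8") -> str:
    """Character encoding scheme: maps a sequence of octets
    to a sequence of characters, as named by a MIME-style charset label.
    """
    return octets.decode(charset)


# One integer in, one character out.
e_acute = coded_character_set(0xE9)

# Octets in, characters out; the same character can arrive as
# different octet sequences depending on the charset label.
from_utf8 = character_encoding_scheme(b"\xc3\xa9", "utf-8")
from_latin1 = character_encoding_scheme(b"\xe9", "iso-8859-1")
```

All three results are the same character (U+00E9), even though the
inputs differ: an integer in one case, two different octet sequences
in the others.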

And the MIME usage of charset *is* consistent with the
definition of character encoding scheme as used in
ftp://ds.internic.net/rfc/rfc2130.txt
ftp://ds.internic.net/rfc/rfc2070.txt
http://www.w3.org/TR/REC-CSS1#appendix-c
and the HTTP spec and HTML specs and ...

I can show you how the software will break if an implementor
confuses the terms "coded character set" and "character
encoding scheme": there are test pages all over the net
that demonstrate it. (I'll find one in a little bit...
somebody wanna help?)
There are broken browsers that do broken
things with fonts and stuff as a result.
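
To make the failure concrete, here is a small hypothetical sketch in
Python (naive_decode is an invented name, not code from any real
browser). It treats each octet as if it were a character number,
which is the confusion described above, and so garbles UTF-8 data:

```python
def naive_decode(octets: bytes) -> str:
    # Bug (deliberate): treats each octet as an index into the coded
    # character set, instead of applying the character encoding scheme.
    return "".join(chr(octet) for octet in octets)


def correct_decode(octets: bytes, charset: str) -> str:
    # Applies the character encoding scheme named by the charset label.
    return octets.decode(charset)


# "cafe" with an e-acute, encoded as UTF-8: the last character takes two octets.
data = "caf\u00e9".encode("utf-8")

broken = naive_decode(data)             # 5 characters, last two are garbage
fixed = correct_decode(data, "utf-8")   # 4 characters, as intended
```

The naive version happens to work for single-octet charsets like
ISO-8859-1, which is probably why this kind of bug survives testing.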

So if the URL/URN distinction is that way, please, PLEASE
show me! Please give an example where the use of the
term URN vs URL vs URI in the HTML, HTTP, XML, or RDF
specs will break things. (I'm talking about the specs
that *use* UR*s, not the specs that define them.)

-- 
Dan Connolly, W3C Architecture Domain Lead
http://www.w3.org/People/Connolly/