Re: URL better than FPI

On Fri, 18 Feb 2000, Russell Steven Shawn O'Connor wrote:

> I'm a little surprised to find myself saying this, but after thinking
> about it for a while, I've come to the conclusion that a URL identifier
> for XHTML is as good as, and probably slightly better than an FPI
> identifier.
> 
> I was pretty convinced that the URL was just a location: such and such a
> file on a particular machine, retrieved by a particular protocol, whereas we
> want an identifier that says, this is the XHTML DTD, which is
> independent of protocol and machine.
> 
> But let's look at the URL more carefully:
> 
> http://www.w3.org/TR/xhtml-basic/xhtml-basic10.dtd
> 
> It has (more or less) 3 parts: ``http'', ``www.w3.org'' and
> ``TR/xhtml-basic/xhtml-basic10.dtd'', plus some separators for parsing.
> 
> A more logical order is
> 
> (1)www.w3.org
> (2)http:
> (3)TR/xhtml-basic/xhtml-basic10.dtd
> 
> because first you connect to a machine, then use the protocol with the
> file.

Not quite.  You can't just "connect to a machine".  A TCP connection
involves a host *and* port combination - and a *defaulted* port is
associated with a protocol (the 'http:' is what allows you to omit the
':80' after 'www.w3.org'.)
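
To make the host-and-port point concrete, here is a minimal sketch using
Python's standard urllib.parse.  The DEFAULT_PORTS table and the
connection_target helper are names made up for illustration, and the
ftp URL is hypothetical (nobody has said the DTD is served that way).

```python
from urllib.parse import urlsplit

# Well-known default ports; the scheme decides which one applies.
# (Illustrative table, not a complete registry.)
DEFAULT_PORTS = {"http": 80, "https": 443, "ftp": 21}

def connection_target(url):
    """Return the (host, port) pair a client would open a TCP connection to."""
    parts = urlsplit(url)
    # An explicit ':port' in the URL wins; otherwise the scheme supplies it.
    port = parts.port if parts.port is not None else DEFAULT_PORTS[parts.scheme]
    return parts.hostname, port

print(connection_target("http://www.w3.org/TR/xhtml-basic/xhtml-basic10.dtd"))
# Hypothetical ftp URL: same host, same path, different port.
print(connection_target("ftp://www.w3.org/TR/xhtml-basic/xhtml-basic10.dtd"))
```

The same host name and path give two different connection targets
depending on the scheme, so the protocol part is not something a client
can ignore.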

> Let's look at each part, starting with (3).  TR/xhtml-basic/xhtml-basic10.dtd
> is just a name.  

Yes... in relation to a protocol.

> (2) is the protocol.  But because URLs are uniform, the meaning of (2)
> is irrelevant to us.  

What does "uniform" mean?  Why should http be irrelevant?  (What if
it's ftp?)  

> This brings us to (1), the most important part. [...]
>  www.w3.org isn't a machine, it is a virtual machine name.  

Yes.  DNS at work:)

> An important point is that the W3C owns all names in (1) of the form
> ...w3.org.

Yes...

> Now let's look at
> 
> -//W3C//DTD XHTML Basic 1.0//EN
> 
> It has 3 parts too:
> (1) -//W3C
> (2) DTD XHTML Basic 1.0
> (3) EN
> 
> But (2) + (3) is the same as TR/xhtml-basic/xhtml-basic10.dtd.  

Yes.

> The .dtd says it's a DTD, lots of use of xhtml-basic.

We don't know that from '.dtd' (although a Certain Big Company would
love it if you fell into the habit of making such assumptions;))  
And, btw, the public text class 'DTD' in (2) curiously enough means
'document type declaration subset' (ISO-8879/10.2.2.1) - note that
fourth word!
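
For reference, the pieces of the FPI can be pulled apart mechanically.
The following is only a rough sketch: the FPI tuple and parse_fpi are
hypothetical names, and the code handles just the simple four-field
form shown above (no ISO owner identifiers, no display version).

```python
from typing import NamedTuple

class FPI(NamedTuple):
    registered: bool   # "+" means a registered owner, "-" an unregistered one
    owner: str         # owner identifier, e.g. "W3C"
    text_class: str    # public text class, e.g. "DTD"
    description: str   # public text description
    language: str      # public text language, e.g. "EN"

def parse_fpi(fpi):
    """Split a four-field formal public identifier into its parts."""
    prefix, owner, text_id, language = fpi.split("//")
    # The public text class is the first word of the text identifier.
    text_class, _, description = text_id.partition(" ")
    return FPI(prefix == "+", owner, text_class, description, language)

print(parse_fpi("-//W3C//DTD XHTML Basic 1.0//EN"))
```

Note that the "-" prefix marks the owner as unregistered, and that the
class comes from the identifier itself rather than from any file
extension.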

> www.w3.org corresponds to -//W3C.  

Not quite.  w3.org (the domain name, rather than the host name) is a
better analogue.

> Both indicate that the following name
> (TR/xhtml-basic/xhtml-basic10.dtd or DTD XHTML Basic 1.0) is to be
> interpreted as a key for a table controlled by the W3C.

Yes.

> But the W3C doesn't actually own -//W3C like it owns
> www.w3.org, and anyone can make a document with the FPI -//W3C.  

True.  

Have you seen K.4.6 "Internet domain names in public identifiers" of
the WebSGML TC?  http://www.ornl.gov/sgml/wg8/document/1955.htm
You could have something like this:

   +//IDN w3.org::www//DTD 
       XHTML 1.0//EN//http:/TR/xhtml-basic/xhtml-basic10.dtd

(which, amazingly enough, is best compared with a gopher string!)

> So really URLs are better in this respect.

No.  They're exactly the same.  The real problem is that, under the
current rules, a URI can't be the minimum data following the PUBLIC
keyword.  Of course, at root, this is just legalistic mumbo-jumbo, and
the SYSTEM keyword is the official *kludge* to get around this
"problem".  

That is, there should never be a need for a PUBLIC *and* a SYSTEM
identifier.  All you need in a document is a name - its internal
syntax is irrelevant (except for *verification* purposes).  Internal
syntax becomes important for an address, and all addresses should be
in catalogs.  Make a reasoned argument that catalogs are *never*
necessary, and you have a case that an address is as good as a name.
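
As a rough sketch of the name/address split: the document carries only
the name, and a catalog maps it to an address.  The entry below is
written in the style of an SGML Open (TR 9401) catalog PUBLIC entry;
the load_catalog and resolve helpers are hypothetical, and the regular
expression handles only double-quoted PUBLIC entries.

```python
import re

# A one-entry catalog: PUBLIC "name" "address".
CATALOG_TEXT = '''
PUBLIC "-//W3C//DTD XHTML Basic 1.0//EN"
       "http://www.w3.org/TR/xhtml-basic/xhtml-basic10.dtd"
'''

# Matches only double-quoted PUBLIC entries; other entry types are ignored.
ENTRY = re.compile(r'PUBLIC\s+"([^"]*)"\s+"([^"]*)"')

def load_catalog(text):
    """Build a name -> address table from PUBLIC catalog entries."""
    return {name: address for name, address in ENTRY.findall(text)}

def resolve(catalog, public_id):
    """Look up the address for a name; None if the catalog has no entry."""
    return catalog.get(public_id)

catalog = load_catalog(CATALOG_TEXT)
print(resolve(catalog, "-//W3C//DTD XHTML Basic 1.0//EN"))
```

If the DTD moves to another host or protocol, only the catalog entry
changes; every document that uses the name stays as it is.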
 
> So surprisingly, the URL is actually independent of machine name
> (because of virtual machine names) and independent of protocol
> (because of uniformity).

Please explain this "uniformity" bit.  What happens with ftp?


Arjun

Received on Friday, 18 February 2000 14:15:45 UTC