Re: identify XHTML DTD by URI, not by FPI

From: Dan Connolly <connolly@w3.org>
Date: Thu, 17 Feb 2000 00:49:11 -0600
Message-ID: <38AB99E7.B849D710@w3.org>
To: Murray Altheim <altheim@eng.sun.com>
CC: Arjun Ray <aray@q2.net>, www-html@w3.org
Murray Altheim wrote:
> This is really quite simple. Let us keep using:
>     PUBLIC "-//W3C//DTD XHTML Basic 1.0//EN"
>     "http://www.w3.org/MarkUp/Group/1999/WD-xhtml-basic-19991125"

I find that acceptable, though I'd like confirmation that you're
agreeing to remove the bit about modifying the system identifier
"as appropriate".

At the risk of being obnoxiously redundant, I'll go on to discuss
my preference...

> Dan Connolly wrote:
> [...]
> > I do not propose to foreclose any options. Members of the Web Community
> > are free to use FPIs if they find them to be valuable. But you have
> > not made any argument (other than by assertion) as to what value
> > W3C would derive from the use of an FPI to identify the XHTML Basic DTD.
> And I've heard nothing remotely persuasive that would give cause to
> remove an existing functionality, ie., that created by the availability
> of both public and system identifiers. It should be up to you to show
> cause for removal of completely legal, common, and demonstrably useful
> XML syntax, not us for its continuance.

It's the "demonstrably useful" bit that I question. I'd really
appreciate pointers to evidence of that.

FPIs are just baggage, as far as I can tell.

I'm talking about new FPIs here. For public texts that had FPIs
before anybody considered giving them a URI, those FPIs continue to be
But when making up a new name for something, I can't see
any reason to make up two of them.

> The answer I've provided is that some find public identifiers
> very useful

Again, I've seen that claim, but nothing behind it. Who finds
them useful? And for what uses? And cannot those uses be
served by a URI?

> and see no justification from you that public ids are in any
> way harmful or that the functionality they afford is not desired or even
> required in some environments.

required? what environments require them?

> What more do you require?

I require nothing more. My critical objection was with
the stuff about the system identifier being modifiable.

My comments about not using an FPI are more of a preference
and a request for justification.

> Several weeks of
> intense debate? Should we spend so much time and energy on every question?

I think it's worth a considerable amount of time and energy, if it
saves us from ever revisiting this question again. But perhaps
that's not feasible...

> I look back on the history of this process both in ours and other WGs and
> it is quite apparent that you expect responses to your questions to be
> taken much more seriously and given much more weight than those coming
> from 'the Whole World'.

No, I don't. I think I'm just more persistent than most folks.
As with anybody else, the WG's obligation to me is to
	-- convince me to withdraw
	-- accept my suggestion, or
	-- escalate the issue

> And what the hell is this really all about, Dan? I generally have a lot
> of respect for your opinion, but you've really lost me on this one. Is
> this just filibustering, or is there some deep architectural issue that
> you might let us in on?

I consider it a fairly deep architectural issue that public
texts have exactly one canonical name, and that said name be
usable in all the handy contexts that URIs are usable in:
XLinks, RDF assertions, etc. But evidently opinions differ on that.
So without appeal to architectural principles, I'll try to
restate my argument:

1. The DTD shall be available at at least one address
(i.e. under one name) in http://... space:

1.a W3C is obliged (by anti-trust law, among other things; long
to make its results available to the public
royalty-free. The only distribution mechanism cheap enough
for us to do that is the Internet.

1.b The most convenient mechanism for the expected consumers
of this spec to get at the DTD is to make it available
as its own 'file' on the Web; it's all well and good to
include it in <pre>...</pre> in the text of the spec,
but to make everybody use lynx -dump or whatever to extract
it is not a good use of everybody's time when it's
totally trivial for us to just stick it in a
.../html.dtd file on the server next to the text of the spec.
As to where in the web... ftp://www.w3.org/ might actually be more
convenient for some of the consumers, but it turns out to be kind
of a pain for those of us that run the service, not to mention
costing the network twice as many TCP transactions, so
grant me that we can use http://www.w3.org/..../html.dtd (or .mod or

2. That one name/address (let's call it Idtd) is sufficient:
(I mean one name/address per public text entity.)

2.a You might expect that we'd mirror it at lots of addresses
for availability reasons. I think this is more trouble than
it's worth because
	(a) we can (and do) make it available via
	geographically distributed servers *at that same address*
	(using DNS trickery that we shouldn't have to,
	if web clients would only honor multiple A records in DNS.
	But that's an entirely separate rant...)

	(b) I don't want to clog up caches; I want caches
	to be able to recognize that they've already got the Idtd
	thing if they've already got it, no matter where they
	got it from. Likewise, I want your link to turn
	purple if you've already visited Idtd.

	When metadata is deployed so that
	mirrored copies carry an RDF assertion that says
	"I'm a copy of Idtd; if you've already got Idtd, you don't need me,
	and if you fetch me, you've got Idtd"
	and caches and UAs pay attention to such metadata,
	then we could use copies at lots of different addresses
	without compromising "you've been here" functionality.
	But lots of addresses that all say "I'm a copy of Idtd"
	is probably never going to be operationally as
	efficient as just making Idtd highly available.

	(c) I want to be able to refer to it succinctly by URI.
	The IETF *finally* started making RFCs available
	at http://www.ietf.org/rfc/rfcNNNN.txt . But before
	that, you had to say: "RFC1630, available at,
	among other places, ftp://ds.internic.net/.../rfc1630.txt ...".
	And caches and history lists and so on were foiled.
	There was no one place to anchor an XLink-based annotation
	or RDF assertion, or back-link service, or ... .
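
To make (b) concrete, here is a minimal Python sketch of a cache keyed on
URI. The class name UriCache, the mirror address, and the copy_of table
(standing in for the "I'm a copy of Idtd" metadata) are all made up for
illustration; this is not how any particular cache or UA works.

```python
# Hypothetical sketch: a cache keyed on the URI it was asked for.
IDTD = "http://www.w3.org/MarkUp/Group/1999/WD-xhtml-basic-19991125"
MIRROR = "http://mirror.example.org/WD-xhtml-basic-19991125"  # made-up mirror address


class UriCache:
    def __init__(self, copy_of=None):
        # copy_of maps a mirror address to the canonical name it claims to copy.
        self.copy_of = dict(copy_of or {})
        self.store = {}
        self.fetches = 0

    def get(self, uri, fetch):
        # Without metadata the key is just the address that was requested.
        key = self.copy_of.get(uri, uri)
        if key not in self.store:
            self.fetches += 1
            self.store[key] = fetch(uri)
        return self.store[key]


def fake_fetch(uri):
    # Stand-in for a network fetch; every address serves the same DTD text.
    return "<!-- XHTML Basic DTD -->"


plain = UriCache()
plain.get(IDTD, fake_fetch)
plain.get(MIRROR, fake_fetch)  # different name, so the cache fetches again

aware = UriCache(copy_of={MIRROR: IDTD})
aware.get(IDTD, fake_fetch)
aware.get(MIRROR, fake_fetch)  # recognized as a copy of Idtd, no second fetch

print(plain.fetches, aware.fetches)
```

The plain cache stores the same text twice under two names; the
metadata-aware one does not, but only because every mirror had to be
declared to it first.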

2.c The SGML Open catalog mechanism makes
for a perfectly good cache, keyed on URI. So folks that like
to download the DTD, store it locally, and refer their software
to it via an SGML Open catalog get their needs met with Idtd.
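
As a rough illustration of 2.c, the Python sketch below treats a catalog
as a lookup table from system identifier (the URI) to a local copy. It
assumes a deliberately simplified catalog containing only double-quoted
SYSTEM entries; the local path dtds/xhtml-basic.dtd and the function
names are hypothetical, and a real catalog processor handles much more.

```python
import re

IDTD = "http://www.w3.org/MarkUp/Group/1999/WD-xhtml-basic-19991125"

# Simplified catalog text: only SYSTEM entries, double-quoted fields.
# The local path is a made-up example.
CATALOG_TEXT = '''
SYSTEM "http://www.w3.org/MarkUp/Group/1999/WD-xhtml-basic-19991125" "dtds/xhtml-basic.dtd"
'''

ENTRY = re.compile(r'^\s*SYSTEM\s+"([^"]+)"\s+"([^"]+)"\s*$', re.MULTILINE)


def load_catalog(text):
    """Return a dict mapping system identifier (URI) to local path."""
    return {uri: path for uri, path in ENTRY.findall(text)}


def resolve(catalog, system_id):
    """Use the local copy if the catalog lists one, else the URI itself."""
    return catalog.get(system_id, system_id)


catalog = load_catalog(CATALOG_TEXT)
print(resolve(catalog, IDTD))
```

The key is the URI itself, so no second name is needed to find the
local copy.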

2.d While we generally make our specs available at a 'latest version'
address as well as a 'this version' address (Idtd), the fact that those two
addresses return the same content today doesn't make them names for the
same thing: one (Idtd) is a name for the particular historical
artifact that we're publishing,
and the other (let's call it Idtd-updates) is the name for a service
that gives you that text or any
update to it, at the discretion of W3C. The binding between Idtd
and the content you get there never expires, but the
binding between Idtd-updates and the text you get there
is subject to change (unfortunately, without notice, presently;
we should put something like Expires: 1-week-from-today as
an optimization).
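
The Expires idea in 2.d could look something like the Python sketch
below. The function name cache_headers and the two kind labels are
hypothetical, and the one-week lifetime is just the figure suggested
above; this is not a description of what the W3C server actually sends.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime


def cache_headers(kind, now):
    """Return response headers for a 'this-version' or 'latest-version' address."""
    if kind == "latest-version":
        # The binding may change, so tell caches when to check again.
        expires = now + timedelta(weeks=1)
        return {"Expires": format_datetime(expires, usegmt=True)}
    # 'this-version': the binding never expires, so no short expiry is sent.
    return {}


# A fixed, timezone-aware UTC timestamp so the result is reproducible.
now = datetime(2000, 2, 17, 6, 49, 11, tzinfo=timezone.utc)
print(cache_headers("latest-version", now))
print(cache_headers("this-version", now))
```

Only the Idtd-updates address gets the header; Idtd itself is left
cacheable for as long as a cache cares to keep it.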

3. Any other name is superfluous and counter to the
general principle of simplicity.

> This is really quite simple. Let us keep using:
>     PUBLIC "-//W3C//DTD XHTML Basic 1.0//EN"
>     "http://www.w3.org/MarkUp/Group/1999/WD-xhtml-basic-19991125"
> and not force us to use:
>     SYSTEM
>     "http://www.w3.org/MarkUp/Group/1999/WD-xhtml-basic-19991125"
> or whatever it is this week.
> Murray

Dan Connolly
Received on Thursday, 17 February 2000 01:49:52 UTC