Re: FPI's in NOTATION declarations from David G. Durand on 1996-11-29 (w3c-sgml-wg@w3.org from November 1996)

From: David G. Durand <dgd@cs.bu.edu>
Date: Fri, 29 Nov 1996 14:05:15 -0500
To: w3c-sgml-wg@w3.org
Message-Id: <v02130501aec4cb0fdfbf@[165.90.139.118]>
[Quick summary before the blow-by-blow:

   The issues about FPI/URN versus URL semantics we have been discussing
are divisive. On the URN list I have watched years of such discussions with
barely a position changed on the issues. If we take the URI approach, we
will have a way to a truce, painlessly. Or we could add PUBLIC, but we will
not ever get anything like consensus that this is useful, only acclaim form
those who want it and acquiescence from those who believe it misguided. I
add a new suggestion, that by allowing a _list_ of URIs in the SYSTEM ID,
we can have all the advantages of PUBLIC, without the drawback of the new
syntax. (I don't know what this drawback is, but some people obviously feel
that there is one).]

Prologue to the blow by blow:

   I have been on the URN list for at least 3 1/2 -4  years. As I remember,
most of that time was taken up by discussions of how URLs and URNs are
really the same thing, only "they work". Much of this discussion reflects
an implementors assumption that there must be _one single_ resolution
mechanism for names. The idea of names, however, is that the _do not_
depend on a single resolution mechanism. They are an attempt to _label_
resources. A unique label can help you to find something, but cannot
guarantee that any particular resolution mechnism will work to find what
you seek. This is in the nature of things.

   Nonetheless, many people find names to be useful.  It seems
self-evidently correct to me that we _need_ names. It also seems
self-evidently mystical, unimplementable, and foolish to the other side. On
the URN list I remember  no-one changing their minds in years of discussion
of these issues.

   That experience motivates my proposal that we enable URI's in XML, and
lobby for an FPI URN syntax now -- the URN proposals are mature enough that
we could do this pretty safely. Then those of us who want names would have
them. We can then banish the philosophical issues from the XML effort,
without shorting either side permanently.

The only thing that is wrong with this proposal, is that we lose SGML's
fail-safe (a SYSTEM identifier to be used if you can't resolve the PUBLIC
identifier). We could keep my compromise proposal and still get this
benefit, by allowing more than one URI in our SYSTEM field. The rule would
be that _any_ of the URIs in the field could be used, depending on
application preference.

    -- David

At 11:40 PM 11/28/96, lee@sq.com wrote:
>There are several things that are being confused here:
>
>[A] whether to allow PUBLIC "some string" syntax in XML;
>[B] if such syntax is used, whether to make use of it;
>[C] if such syntax is used, whether to retain SYSTEM identifiers;
>[D] if such syntax is used, how to interpret the string;
>[E] if the string is to be used and interpreted, whether to use
>    the SGML OPEN CATALOG file format, or a subset of it, to convert
>    the string into a sequence of data octets

>notation discussion deleted

>I think that for [A], HTML compatibility may be a good reason to
>vote Yes, but that for [B], I have still seen no compelling reasons.
>The strongest has been Debbie & Paul P. wanting to be able to
>change the meaning of FPIs on the client side.  This is convenient if
>you have a local copy of something (URN-like), and not very SGMLish
>if your copy differs from the "authoritative" copy.  In both cases, you
>should really be overriding a SYSTEM identifier.

Why should you have to change a document to read or preocess it, if you
have a copy of a uniquely named resource, and a unique name referring to
that resource in the document? You are making a hidden assumption that FPIs
don't work by assuming that the local copy will be different. Given the
number of people who have already claimed that this procedure _works_ for
them, the assumption is not only hidden, but unwarranted (in the sense that
counterexamples exist).

>If you are referring to the Latin 1 entities and their esteemed colleagues,
>though, I agree that it is convenient to have a local copy, and this
>can save a not inconsiderable delay in network latency right now.

But there are lots of things that have the same properties as the SDATA
entities, in different contexts: The TEI DTDs, the CALS DTDs, DocBook, etc.
etc. This is one of the things people naturally do.

>The difficulty is in having a sufficiently small set of features
>to do what is needed without losing too much functionality.
>
>So, here is my question:
>    What mechanisms are proposed for including SDATA entity sets?
>
>    I have just agreed to chair an SGML OPEN group trying to come up
>    with a way of specifying the replacement text for SDATA entities;
>    if that can be system independent, it may be very useful in a
>    future version of XML.
I should be useful in a current version of XML -- I am waiting for David
Birnbaum to come up to date, so he can comment on the desirability of
describing characters by strings rather than by numbers. For that matter,
SDATA entities fit nicely with the DSSSL character model, which has an open
set of characters denoted by strings, doesn't it?

>    In the meantime, it is likely that you will need to have your own
>    version of the Latin 1 entities, with replacement text suited for
>    your local system.

This may or may not be the case: some systems use the ISO SDATA text already!

>    So how do you find it?

by matching FPIs according to _some_ resolution system. The resolution
system is _NOT_ a part of XML, because names are intentionally decoupled
from resolution systems.

>    Is it useful to be able to override other FPIs?
>    Please give real, concrete examples.
>
>If your name isn't Paul Prescod :-), you can stop reading now...

Sorry, but if it's on the list, it's fair game.

>Paul Prescod <papresco@csclub.uwaterloo.ca>
>> The second is that FPIs can be "redirected" by the information consumer or
>> maintainer, where as URLs can only be redirected by the information
>> provider.
>
>I already pointed out that this isn't necessarily true.
>Also, you are making an assumption about FPIs -- that the user can affect
>their resolution.  There is nothing in ISO8879 that mandates that --
>quite the reverse, if anytyhing.  I haven't checked the FPI standard,
>as I don't have access to a copy right now (sorry).

redirecting HTTP URLs is certainly possible, but URLs do have a semantics
defined, which states a particular resolution mechanism, which is dependend
on DNS names, specific data transfer protocols, etc. An FPI does not have
_one_ resolution mechanism, so it can be resolved by the user un-"aided" by
hardired resolution mechanisms that may not be convenient (like IP
connections from a notebook).

>> I suppose you could redirect
>> URLs, but that seems like a nasty semantic mixup. If an identifier has a
>> machine name in it, then the information should, at least logically, come
>> from the machine (even if a cache "represents" that machine).
>(1) URLs do not contain machine names except by coincidence.
>    They start with Internet domain names.
>(2) A URL is a coordinate in cyberspace.  To put it another way, a URL
>    is indeed a resource's location, but that needn't be a resource
>    that exists.

1. if you read the URL spec, it tells you how to resolve the name, and it
involves DNS. DNS is _not_ a stable namespace, and does _not_ guarantee
that DNS names will not be reassigned.

2. If we resolve some URLs by a different mechanism, that is
location-independent, and resolution mechanism independent, then we simply
have FPIs under another name.

>> Some people feel strongly that there should be a syntactic differentiation
>> between *names* and *locations*.
>
>Possibly.  We will have that with URNs.

Yes, but the current XML spec does not allow URNs. That is one way to
address this problem. That is a fix that I have been re-iterating, and
re-iterating. I am willing to act like a broken record, but would rather
not.

>> A concept cannot have a URL. It can only
>> have a name, or perhaps a URN.
>
>I think this shows that you are confused about URLs -- but we should
>take that discussion offline.
>It is not meaningful to say that a concept has a URL.  It is meaningful
>to say that you can use a URL to represent a concept, though.
>If you must have it point at something, point at a textual description.


>> I think that the first time you "click" on a
>> "concept:" you'll wish that they were distinct also.
>I was referring to URNs, not the HREF attribute of an HTML <A> element.
>You are going to have a hard time dereferencing your TeX Book example
>FPI too, if you "click on it".

Concept is not a valid URL protocol. The fact that that prefix is called a
protocol prefix tells you something about the semantic intent of the field,
too. Paul is making the point that URLs are supposed to index a resource
via a resolution mechanism. Obviously we can use any string as a name, but
URLs are not defined to work that way, and the distinction between name
assignment and access protocol assignment is felt to be useful by other
people.

>Obviously that isn't what it is for.

Exactly, that is why we need FPIs or URNs, and not just URLs.

>> Anyhow, a major benefit of having URLs *and* FPIs is redundancy. I think
>> that they should usually be used together.
>
>We don't need redundant syntax.  We need a minimal language that people
>actually implement.

So why not be compatible with URNs which are in the process of acquiring at
least 2 distinct resolution protocols...

>> The problem FPIs partially solve
>> is *hard*. FPIs provide a partial solution that URLs do not.
>
>No.  FPIs plus the SGML OPEN catalog plus URLs provide a partial solution.
>FPIs by themselves are nothing but structured strings, just like URLs,
>URNs, telephone numbers and those knotted Mayan things :-)

Incan strings. FPIs provide the administrative mechnism for name
assignment. The lookup problem is separate. Neither URLs, Quipus (Knotted
strings), DNS names, or telephone numbers have an administrative
infrastructure supporting persisten unique names.

>> >    There is no reason why a client-side URL lookaside table could not be
>> >    used to give this functionality with URLs.
>>
>> As I stated before, I think that this is attempting to make URLs into
>> something they are not. I don't really feel comfortable having clients
>> redirect URL accesses.  With a separate syntax it is clear from the document
>> that a redirection is possible, if your tool supports it.
>
>But then every XML program needs to support the SGML OPEN catalog,
>and we have to work out how to associate an SGML OPEN catalog with
>every XML file, and how to do that in the presence of CGI, and how
>to store multuple XML document sets in the same directory without
>CATALOG conflict, and document it all, and still be under 20pp.

No, every XML program that wants to support catalogs must support catalogs.

>I am not opposed to solving the problems that need to be solved, but only
>to confusing functionality and requirements with mechanism.
>If this is what we need to do, we'll do it.

   The URN requirements docs have a good explanation of the issues.

   -- David

I am not a number. I am an undefined character.
_________________________________________
David Durand              dgd@cs.bu.edu  \  david@dynamicDiagrams.com
Boston University Computer Science        \  Sr. Analyst
http://www.cs.bu.edu/students/grads/dgd/   \  Dynamic Diagrams
--------------------------------------------\  http://dynamicDiagrams.com/
MAPA: mapping for the WWW                    \__________________________
Received on Friday, 29 November 1996 13:59:28 UTC