- From: David G. Durand <dgd@cs.bu.edu>
- Date: Fri, 29 Nov 1996 14:05:15 -0500
- To: w3c-sgml-wg@w3.org
[Quick summary before the blow-by-blow: The issues about FPI/URN versus URL semantics we have been discussing are divisive. On the URN list I have watched years of such discussions with barely a position changed on the issues. If we take the URI approach, we will have a way to a truce, painlessly. Or we could add PUBLIC, but we will never get anything like consensus that this is useful, only acclaim from those who want it and acquiescence from those who believe it misguided. I add a new suggestion: by allowing a _list_ of URIs in the SYSTEM ID, we can have all the advantages of PUBLIC without the drawback of the new syntax. (I don't know what this drawback is, but some people obviously feel that there is one.)]

Prologue to the blow-by-blow:

I have been on the URN list for at least 3 1/2 to 4 years. As I remember, most of that time was taken up by discussions of how URLs and URNs are really the same thing, only "they work". Much of this discussion reflects an implementor's assumption that there must be _one single_ resolution mechanism for names. The idea of names, however, is that they _do not_ depend on a single resolution mechanism. They are an attempt to _label_ resources. A unique label can help you to find something, but cannot guarantee that any particular resolution mechanism will work to find what you seek. This is in the nature of things.

Nonetheless, many people find names to be useful. It seems self-evidently correct to me that we _need_ names. It also seems self-evidently mystical, unimplementable, and foolish to the other side. On the URN list I remember no one changing their minds in years of discussion of these issues. That experience motivates my proposal that we enable URIs in XML, and lobby for an FPI URN syntax now -- the URN proposals are mature enough that we could do this pretty safely. Then those of us who want names would have them.
We can then banish the philosophical issues from the XML effort, without shorting either side permanently. The only thing that is wrong with this proposal is that we lose SGML's fail-safe (a SYSTEM identifier to be used if you can't resolve the PUBLIC identifier). We could keep my compromise proposal and still get this benefit by allowing more than one URI in our SYSTEM field. The rule would be that _any_ of the URIs in the field could be used, depending on application preference.

-- David

At 11:40 PM 11/28/96, lee@sq.com wrote:
>There are several things that are being confused here:
>
>[A] whether to allow PUBLIC "some string" syntax in XML;
>[B] if such syntax is used, whether to make use of it;
>[C] if such syntax is used, whether to retain SYSTEM identifiers;
>[D] if such syntax is used, how to interpret the string;
>[E] if the string is to be used and interpreted, whether to use
>    the SGML OPEN CATALOG file format, or a subset of it, to convert
>    the string into a sequence of data octets
>notation discussion deleted
>I think that for [A], HTML compatibility may be a good reason to
>vote Yes, but that for [B], I have still seen no compelling reasons.
>The strongest has been Debbie & Paul P. wanting to be able to
>change the meaning of FPIs on the client side. This is convenient if
>you have a local copy of something (URN-like), and not very SGMLish
>if your copy differs from the "authoritative" copy. In both cases, you
>should really be overriding a SYSTEM identifier.

Why should you have to change a document to read or process it, if you have a copy of a uniquely named resource, and a unique name referring to that resource in the document? You are making a hidden assumption that FPIs don't work by assuming that the local copy will be different. Given the number of people who have already claimed that this procedure _works_ for them, the assumption is not only hidden, but unwarranted (in the sense that counterexamples exist).
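As a rough illustration of the "more than one URI in the SYSTEM field" rule combined with the local-copy case, here is a minimal Python sketch. The function name `pick_source`, the `local_copies` table, the `can_fetch` callback, and the `urn:x-example:` identifier used in the note below are all hypothetical and come from no spec; preferring a local copy first is just one possible application policy, not something the proposal mandates.

```python
from typing import Callable, Mapping, Optional, Sequence


def pick_source(
    uris: Sequence[str],
    local_copies: Mapping[str, str],
    can_fetch: Callable[[str], bool],
) -> Optional[str]:
    """Choose where to read an entity from, given every URI listed for it.

    Hypothetical policy (an assumption of this sketch):
      1. if any listed identifier has a local copy, use that copy;
      2. otherwise use the first identifier the application can fetch;
      3. otherwise report failure by returning None.
    """
    # Step 1: a name with a known local copy needs no network resolution.
    for uri in uris:
        if uri in local_copies:
            return local_copies[uri]

    # Step 2: fall back to whichever listed identifier is reachable.
    for uri in uris:
        if can_fetch(uri):
            return uri

    # Step 3: none of the listed identifiers can be used.
    return None
```

With a list such as `["urn:x-example:sample-dtd", "http://example.org/dtds/sample.dtd"]`, an application holding a local copy under the first name never touches the network, while one without such a copy can still fall back to the second entry. The document itself is unchanged in both cases.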
>If you are referring to the Latin 1 entities and their esteemed colleagues,
>though, I agree that it is convenient to have a local copy, and this
>can save a not inconsiderable delay in network latency right now.

But there are lots of things that have the same properties as the SDATA entities, in different contexts: the TEI DTDs, the CALS DTDs, DocBook, etc. This is one of the things people naturally do.

>The difficulty is in having a sufficiently small set of features
>to do what is needed without losing too much functionality.
>
>So, here is my question:
>    What mechanisms are proposed for including SDATA entity sets?
>
>    I have just agreed to chair an SGML OPEN group trying to come up
>    with a way of specifying the replacement text for SDATA entities;
>    if that can be system independent, it may be very useful in a
>    future version of XML.

It should be useful in a current version of XML -- I am waiting for David Birnbaum to come up to date, so he can comment on the desirability of describing characters by strings rather than by numbers. For that matter, SDATA entities fit nicely with the DSSSL character model, which has an open set of characters denoted by strings, doesn't it?

>    In the meantime, it is likely that you will need to have your own
>    version of the Latin 1 entities, with replacement text suited for
>    your local system.

This may or may not be the case: some systems use the ISO SDATA text already!

>    So how do you find it?

By matching FPIs according to _some_ resolution system. The resolution system is _NOT_ a part of XML, because names are intentionally decoupled from resolution systems.

>    Is it useful to be able to override other FPIs?
>    Please give real, concrete examples.
>
>If your name isn't Paul Prescod :-), you can stop reading now...

Sorry, but if it's on the list, it's fair game.
>Paul Prescod <papresco@csclub.uwaterloo.ca>
>> The second is that FPIs can be "redirected" by the information consumer or
>> maintainer, whereas URLs can only be redirected by the information
>> provider.
>
>I already pointed out that this isn't necessarily true.
>Also, you are making an assumption about FPIs -- that the user can affect
>their resolution. There is nothing in ISO8879 that mandates that --
>quite the reverse, if anything. I haven't checked the FPI standard,
>as I don't have access to a copy right now (sorry).

Redirecting HTTP URLs is certainly possible, but URLs do have a defined semantics, which states a particular resolution mechanism, which is dependent on DNS names, specific data transfer protocols, etc. An FPI does not have _one_ resolution mechanism, so it can be resolved by the user un-"aided" by hardwired resolution mechanisms that may not be convenient (like IP connections from a notebook).

>> I suppose you could redirect
>> URLs, but that seems like a nasty semantic mixup. If an identifier has a
>> machine name in it, then the information should, at least logically, come
>> from the machine (even if a cache "represents" that machine).
>(1) URLs do not contain machine names except by coincidence.
>    They start with Internet domain names.
>(2) A URL is a coordinate in cyberspace. To put it another way, a URL
>    is indeed a resource's location, but that needn't be a resource
>    that exists.

1. If you read the URL spec, it tells you how to resolve the name, and it involves DNS. DNS is _not_ a stable namespace, and does _not_ guarantee that DNS names will not be reassigned.
2. If we resolve some URLs by a different mechanism that is location-independent and resolution-mechanism-independent, then we simply have FPIs under another name.

>> Some people feel strongly that there should be a syntactic differentiation
>> between *names* and *locations*.
>
>Possibly. We will have that with URNs.

Yes, but the current XML spec does not allow URNs.
That is one way to address this problem. That is a fix that I have been reiterating, and reiterating. I am willing to act like a broken record, but would rather not.

>> A concept cannot have a URL. It can only
>> have a name, or perhaps a URN.
>
>I think this shows that you are confused about URLs -- but we should
>take that discussion offline.
>It is not meaningful to say that a concept has a URL. It is meaningful
>to say that you can use a URL to represent a concept, though.
>If you must have it point at something, point at a textual description.
>> I think that the first time you "click" on a
>> "concept:" you'll wish that they were distinct also.
>I was referring to URNs, not the HREF attribute of an HTML <A> element.
>You are going to have a hard time dereferencing your TeX Book example
>FPI too, if you "click on it".

Concept is not a valid URL protocol. The fact that that prefix is called a protocol prefix tells you something about the semantic intent of the field, too. Paul is making the point that URLs are supposed to index a resource via a resolution mechanism. Obviously we can use any string as a name, but URLs are not defined to work that way, and the distinction between name assignment and access protocol assignment is felt to be useful by other people.

>Obviously that isn't what it is for.

Exactly, that is why we need FPIs or URNs, and not just URLs.

>> Anyhow, a major benefit of having URLs *and* FPIs is redundancy. I think
>> that they should usually be used together.
>
>We don't need redundant syntax. We need a minimal language that people
>actually implement.

So why not be compatible with URNs, which are in the process of acquiring at least 2 distinct resolution protocols...

>> The problem FPIs partially solve
>> is *hard*. FPIs provide a partial solution that URLs do not.
>
>No. FPIs plus the SGML OPEN catalog plus URLs provide a partial solution.
>FPIs by themselves are nothing but structured strings, just like URLs,
>URNs, telephone numbers and those knotted Mayan things :-)

Incan strings. FPIs provide the administrative mechanism for name assignment. The lookup problem is separate. Neither URLs, quipus (knotted strings), DNS names, nor telephone numbers have an administrative infrastructure supporting persistent unique names.

>> > There is no reason why a client-side URL lookaside table could not be
>> > used to give this functionality with URLs.
>>
>> As I stated before, I think that this is attempting to make URLs into
>> something they are not. I don't really feel comfortable having clients
>> redirect URL accesses. With a separate syntax it is clear from the document
>> that a redirection is possible, if your tool supports it.
>
>But then every XML program needs to support the SGML OPEN catalog,
>and we have to work out how to associate an SGML OPEN catalog with
>every XML file, and how to do that in the presence of CGI, and how
>to store multiple XML document sets in the same directory without
>CATALOG conflict, and document it all, and still be under 20pp.

No, every XML program that wants to support catalogs must support catalogs.

>I am not opposed to solving the problems that need to be solved, but only
>to confusing functionality and requirements with mechanism.
>If this is what we need to do, we'll do it.

The URN requirements docs have a good explanation of the issues.

-- David

I am not a number. I am an undefined character.
_________________________________________
David Durand dgd@cs.bu.edu \ david@dynamicDiagrams.com
Boston University Computer Science \ Sr. Analyst
http://www.cs.bu.edu/students/grads/dgd/ \ Dynamic Diagrams
--------------------------------------------\ http://dynamicDiagrams.com/
MAPA: mapping for the WWW \__________________________
Received on Friday, 29 November 1996 13:59:28 UTC