- From: Bullard, Claude L (Len) <clbullar@ingr.com>
- Date: Thu, 22 May 2003 16:54:26 -0500
- To: "'Paul Prescod'" <paul@prescod.net>, WWW-Tag <www-tag@w3.org>
Then maybe the TAG should be talking about getting rid of URNs altogether, or explaining that HTTP really is meaningless to systems that provide PUBLIC to SYSTEM catalog mapping?

PUBLIC identifiers were about systems that assigned names but said nothing about resolution, e.g., identity is assigned. That is the semantic. SYSTEM identifiers were about system-specific locations of entities. Resolving an address is the semantic.

HTTP identifiers are about names that identify locations of entities. They are a SYSTEM id. If one wants to use them as a name, they have two semantics. Fine.

HTTP is a protocol identifier. Saying it is a meaningless string until it gets handed to an HTTP handler doesn't add much to clarify the situation. It just means the semantic to be implemented is in the handler and is fuzzy in the spec because the namespace specification fuzz'd it. In essence, it makes no difference what goes in that namespace id value as long as it is unique within scope.

So why did Tim bother to go to xml.gov, and what of value or clarification did he tell them, since there is no reason to prefer any string over another in there given a policy for mapping it to a handler, the semantic of which is indeterminate for the purpose of it being a namespace identifier?

How can one spec generate so much nonsense?

len

-----Original Message-----
From: Paul Prescod [mailto:paul@prescod.net]
Sent: Thursday, May 22, 2003 4:34 PM
To: Bullard, Claude L (Len); WWW-Tag
Subject: Re: Talked to the xml.gov people

Bullard, Claude L (Len) wrote:
> Then why does it make a bit of difference what they use
> as the string?

One string has more information than the other. It says: "if you want more information about this object and you don't know where to find it, use the HTTP protocol and see what you can find out." It's as simple as that: one string has more information than the other.

There are some URN syntaxes that embed HTTP URIs and therefore add yet more syntax. I think that those are reasonable although I don't think they offer much advantage.

> o URL HTTP because they MIGHT want to dereference it and as
> experience proves, HTTP URLs are always dereferenceable even
> if they return 404. The policy is global and implemented in
> every browser of interest.

Fair enough. Note that you could also dereference an HTTP URL using a catalog or registry. For instance, Google's archive is a nice catalog that gives you alternate (historical) representations of HTTP URLs. And SGML SOCATs explicitly allow mapping from system identifiers to system identifiers.

"The SYSTEM keyword indicates that an entity manager should use the associated storage object identifier to locate the replacement text for an entity whose external identifier's system identifier is explicitly specified by the system identifier."

So it isn't just theoretically possible, it is implemented in nsgmls, jade and other SP-based tools.

    /tmp/sptest> cat CATALOG
    SYSTEM "http://www.w3.org/foo.dtd" "b.ent"

    /tmp/sptest> cat test.sgm
    <!DOCTYPE foo[
    <!ELEMENT foo - - (#PCDATA)>
    <!ENTITY bar SYSTEM "http://www.w3.org/foo.dtd">
    ]>
    <foo>
    &bar;
    </foo>

    /tmp/sptest> cat b.ent
    Len

    /tmp/sptest> onsgmls test.sgm
    (FOO
    -Len\n
    )FOO
    C

> It's a system trap either way except that the URN gives
> the owner the ultimate choice as to what dereferencing
> mechanism is used and the W3C more or less owns HTTP.

The application processing the data _always_ has the ultimate choice how to dereference and can choose NOT to use HTTP, as SP does above. But given an opaque URN they do NOT have the choice of ripping it apart to find an internet resource they can consult for help. One way gives MORE FUNCTIONALITY than the other.

> The rest of us have also watched this sleight of hand
> long enough and we do get it. It simply comes down
> to the single system ambitions of the W3C and whether
> or not xml.gov buys into that. If they do, then they
> should use a URL (no, not URN, no not URI, no not IRI)
> and put something at the end of it to keep from
> confusing those who don't get it. Otherwise, use a
> URN and maintain absolute content independence of
> the system. Choose one.

The HTTP URI/URL is context dependent and does not require the HTTP protocol. I just unplugged my computer from the network and tried the trick above and it still worked. It DOES NOT DEPEND on the W3C or HTTP. It is just a string of characters and how you interpret it is up to you. If you want to interpret it as an index into a catalog, more power to you: so does Apache. So does Squid. So do IE and Mozilla (when they are looking into their caches, rather than downloading). This is running code, not theory. IIRC it has worked this way since some time in the mid-90s.

Paul Prescod
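For readers who do not have an SP-based tool to hand, the catalog lookup in the onsgmls session above can be approximated in a few lines of Python. This is only a rough sketch of the idea, not how SP implements it: the function names `parse_catalog` and `resolve` are invented for illustration, only the SYSTEM catalog keyword is handled, the first-entry-wins rule is an assumption, and the urn:x-example: identifier is made up. The point it illustrates is the one argued above: the application consults its own catalog first, and an HTTP string that is not in the catalog still names a protocol it could fall back on, whereas an opaque identifier does not.

```python
import re


def parse_catalog(text):
    """Collect SYSTEM entries of the form: SYSTEM "sysid" "storage-object".

    Only the SYSTEM keyword is handled; a real SGML Open catalog has more.
    """
    entries = {}
    for sysid, target in re.findall(r'SYSTEM\s+"([^"]*)"\s+"([^"]*)"', text):
        # Assumption for this sketch: the first entry for a sysid wins.
        entries.setdefault(sysid, target)
    return entries


def resolve(sysid, catalog):
    """Return (mechanism, location) without touching the network."""
    if sysid in catalog:
        # The application chose its own mapping; HTTP is never used.
        return ("catalog", catalog[sysid])
    if sysid.startswith(("http://", "https://")):
        # No mapping, but the string itself says which protocol to try.
        return ("http", sysid)
    # An opaque name with no mapping offers nothing further to try.
    return ("unresolved", sysid)


CATALOG_TEXT = 'SYSTEM "http://www.w3.org/foo.dtd" "b.ent"'
catalog = parse_catalog(CATALOG_TEXT)

print(resolve("http://www.w3.org/foo.dtd", catalog))
print(resolve("http://www.w3.org/other.dtd", catalog))
print(resolve("urn:x-example:opaque", catalog))
```

The first call corresponds to the session above: the system identifier is mapped to b.ent and no network access is needed. The other two show the difference between an unmapped HTTP string and an unmapped opaque one.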
Received on Thursday, 22 May 2003 17:54:34 UTC