- From: <Patrick.Stickler@nokia.com>
- Date: Fri, 8 Jun 2001 10:52:45 +0300
- To: sandro@w3.org, www-rdf-interest@w3.org
- Cc: Ora.Lassila@nokia.com
[Sandro, I replied to this via the group as I considered it a
continuation of the discussion thread... hope you don't mind]

> >> > Sandro, the tag FAQ and Tutorial are not accessible from the
> >> > URL you specified. It simply says "coming soon". Can you
> >> > update the web page or provide copies as attachments?
> >>
> >> That's odd. I don't know where in the infrastructure an old copy
> >> could be. You've tried shift-reload and that kind of client-side
> >> stuff?
> >
> >Yes. It's not there. Maybe an older copy overwrote the newer?
>
> I've tried it from four different multi-user systems on which I
> happen to have shell accounts (all run by totally different
> organizations), and it works fine on all of them. (I typed "lynx
> taguri.org" and each one displayed a page with the tutorial.)

My apologies. Don't know what is up with my browser (IE5) but
it works now. I had tried both reload and setting prefs to 'check
every time' but it still gave me the old page. I've rebooted since
then and now it came up straight away with the new version. More
Microsoft mysteries...

> >I don't see how binding or not binding a tag to a data stream
> >disqualifies tags as a URN, since the intended use for tags is
> >as names, not locations.
>
> Dan Connolly tells me that the URI Working Group came to the
> conclusion that the notion of URLs was obsolete, since the web
> infrastructure now uses them as URIs anyway (finding the closest
> server, etc).

Arggh! My God! What are they thinking! Or have I totally lost
touch ;-)

A URL is a URI that directly represents a web resource (a MIME data
stream) and a URN is a URI that indirectly represents a web resource
*or* an abstract concept, and which *might* in a given context be
mapped to a URL for the actual resource, if it is not abstract.

A URL is a location, an address of content. A URN is a name. The
address does not identify the content, it only locates it. The name
does not locate the content, it only identifies it.
This is a fundamental distinction that should *never* be lost.

To say either that a URN "functions" as a URL or that a URL
"functions" as a URN is to totally abandon this distinction, and if
that is what the URI Working Group is thinking, then God help the
internet!

Just because a URN has a mapping in some context to a URL or can be
resolved by some agent to a MIME stream, does *not* make it a URL!
And just because someone might use a URL as a URN, does not mean that
that is a valid URL. It is not if it does not define the *address* of
some content.

I wonder if Dan et al. are trying to rationalize away the common
abuse of URLs for universal names for abstract concepts by blurring
the original clear distinctions between URL and URN. Sad. Very sad.

What to do....??? Sigh.

> If you're saying that the binding between a URN and its denotation
> (such as a web page) is not constrained to be constant across time and
> space, then it sounds like the notion of URN is obsolete, too, since
> URNs are operationally identical to all other URIs. Perhaps the
> difference is simply that with URNs one is expected to do some
> rewriting and redirecting closer to the client?

Not at all obsolete. A URN is meant to provide a location-independent
identity to some content. That content may in fact be stored
redundantly in numerous places, and the resolution agent within a
given context might choose the "closest" (in internet distances)
*address* from which that uniquely defined information might be
retrieved.

This is e.g. exactly what was intended by URN schemes such as 'isbn'.
Taking the more "modern" case of eBooks, various book retailers,
libraries, etc. may all store their own local copy of a given eBook,
each with a different location (URL), yet all use the same URN, based
e.g. on the ISBN of the eBook, to identify the publication.
Persons requesting a copy of that publication (for loan, purchase,
whatever) would simply be able to request the book by its URN, and
each environment (for each retailer, library, etc.) maps that URN to
the URL for that content within that same environment. For
inter-library loans, the "interchange" is simply a URN and two URLs,
one for the source library and one for the target library.

This distinction of name versus address is crucial if we are to avoid
an exponential explosion in complexity when trying to define
equivalencies between URLs in the absence of universal names,
especially as locations change yet names should not!

Having universal names enables the proper scalability of the SW,
leaving each context to manage its own mappings from names of
non-abstract web resources to locations. If we do away with the common
names, then every context must define a mapping between its own
locations for resources and every other location of the same resource
in order to achieve any semblance of interoperability! It will be a
scalability nightmare!

Single case in point: using a URL as a name for ISO 639 language
"Finnish". There is no single official URL for the definition of the
ISO 639 standard. But there are many, many URLs where it is reiterated
in some web-accessible format. What if everyone chooses a different
URL for their "authority"? Some choose the Oasis site. Some an IETF
RFC. Others a W3C note. Others some publisher's site. etc. etc. How
then does any agent on the SW *know* that another agent *means* ISO 639
language "Finnish" if it does not have the mapping from that agent's
preferred "URL name" (an oxymoron!) to that used by itself?!!!
It can't!
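
The per-context mapping described above (one shared URN, a different
URL in each environment) can be sketched roughly as follows. This is
only an illustration: the class and function names, the ISBN, and the
library hosts are all hypothetical and do not come from any URN
specification or real resolver.

```python
from typing import Dict, Optional, Tuple


class ResolutionContext:
    """One environment (retailer, library, ...) with its own URN -> URL table.

    Hypothetical helper for illustration; not a real resolver API.
    """

    def __init__(self, name: str, locations: Dict[str, str]) -> None:
        self.name = name
        self._locations = dict(locations)

    def resolve(self, urn: str) -> Optional[str]:
        # The URN only identifies the content; the context supplies the address.
        return self._locations.get(urn)


def interlibrary_loan(
    urn: str, source: ResolutionContext, target: ResolutionContext
) -> Tuple[str, str, str]:
    """Return the 'interchange': one URN plus the source and target URLs."""
    source_url = source.resolve(urn)
    target_url = target.resolve(urn)
    if source_url is None or target_url is None:
        raise LookupError(f"{urn} is not mapped in both contexts")
    return urn, source_url, target_url


# Made-up identifier and hosts, used only to exercise the sketch.
EBOOK = "urn:isbn:0000000000"
library_a = ResolutionContext(
    "library-a", {EBOOK: "http://library-a.example/ebooks/0000000000"}
)
library_b = ResolutionContext(
    "library-b", {EBOOK: "http://library-b.example/holdings/0000000000.pdf"}
)

print(interlibrary_loan(EBOOK, library_a, library_b))
```

Each context needs only its own table keyed by the shared name; neither
library has to know the other's URLs in advance.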
Even if all agents agreed on a single URL to use as a name, which is
essential for the SW to work, then (a) the name inherits all of the
fragility of a URL, being a location/address, and (b) because it is a
URL, there is the likelihood that some explicit schema will be located
at the common "namespace" portion of the URL and fragment identifiers
will be used to define the sub-components of the "namespace". Those
fragments will be MIME content type specific and thus are both
unreasonably tied to a given MIME content type (since they are names,
not references) and also not guaranteed to remain valid over time if
e.g. the schema encoding for the ontology changes. Finally, there is
no guarantee of compatibility between various serialization/schema
interpretations of "namespace" + "name" and that used for the actual
names!

In short, using URLs or URL refs as names brings chaos, not order, to
the SW.

I'm really wondering what the W3C and IETF are thinking by abandoning
this critical distinction between URL vs. URN as a strict partitioning
of URI schemes. The intersected diagram of URI types and the view that
URIs can sometimes be URLs and sometimes be URNs is very very
disturbing.

> >> Step 3. Encode the date. In theory, tags could use dates of the
> >> form "2001-06-05", but we decided to save people a lot of typing
> >> by using a shorter notation for dates. Instead, we write the date
> >> I picked as "1-6-5". We also say that the first day of a month and
> >> the first day of a year have a further modification: you drop the
> >> "1" fields, so January 1, 2007 is written as "7".
> >
> >Firstly, by not using ISO standard date formats, you require software
> >that might wish to "understand" the date to implement your new
> >proprietary encoding for dates. Bad.
>
> There is no reason to ever read the dates. Tags are opaque strings
> to all software.
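
For reference, the compressed date notation quoted in Step 3 can be
sketched in a few lines of Python. The function name is hypothetical,
and the rule that the year is written as an offset from 2000 is an
assumption inferred from the quoted examples ("1-6-5" for 2001-06-05,
"7" for January 1, 2007); the quoted text does not say how other years
would be written.

```python
from datetime import date


def compress_tag_date(d: date) -> str:
    """Compress a date the way the quoted Step 3 describes.

    Assumption: the year is written as (year - 2000), which matches the
    quoted examples but is not stated outright in the quoted text.
    """
    year = d.year - 2000
    if year < 1:
        raise ValueError("this sketch only covers dates from 2001 onward")
    if d.month == 1 and d.day == 1:
        # First day of the year: drop both "1" fields.
        return str(year)
    if d.day == 1:
        # First day of a month: drop the day field only.
        return f"{year}-{d.month}"
    return f"{year}-{d.month}-{d.day}"


for example in (date(2001, 6, 5), date(2001, 6, 1), date(2007, 1, 1)):
    print(example.isoformat(), "->", compress_tag_date(example))
```

Even this small sketch needs three separate cases, whereas the ISO form
is a single fixed-width pattern.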
But in your argument for the existence of dates, you said the purpose
was specifically to *differentiate* between tags generated by you and
e.g. your grandson. If you don't read the date, then you cannot create
an ordering of the tags, nor can you compare the date of the tag with
e.g. the period of your life versus that of your grandson, etc. The
dates must be read if inferences are to be drawn about the temporal
relationships between tags as a basis for determining the minter. Eh?

> >Secondly, it is ambiguous as to whether it is
> >year-month-day or year-day-month. Even though the tag spec says which
> >is which, folks in Europe will hate you ;-)
>
> We just used the ISO ordering.

But one must enter the dates, and since the average person will look
for examples to guide him/her (who reads the manual ;-) there is great
potential for confusion. Even though there is the ISO ordering, the
possible encodings of dates are potentially ambiguous; therefore their
utility as examples for imitation is greatly diminished.

> >It's not *that* hard to write, and anyway, you could make a cute
> >little utility to autogenerate your tags for you.
>
> It's not a question of generation. I use ISO dates in filenames all
> the time, etc. It's a question of the world being plastered with
> tags, 98% of which could either use "2001-01-01" or "1". I see little
> reason to waste 9 character positions on billions of written
> instances. For just RDF, which obviously doesn't care about such
> things, I wouldn't care about that, it's true.

A greater problem than the ambiguity of ordering between day and
month, etc. is the fact that the notation "logic" for compressing the
dates is too difficult for any average user to be willing to learn and
apply. It's clever, but requires too much thought for "the average
Joe". Folks know the ISO format. It requires no thought at all. And
the consistent format reinforces the proper syntax of the identifier.
The range of possible variation in date encodings will put average
folks off. If they simply see e.g.
'name:<myEmailAddress>,<####-##-##>:<name>' again and again, then they
will use it. If they see all kinds of strange and (to the average
person) arcane variations in the date encoding, they will say "too
difficult" and go on abusing URLs...

> One of the problems with using ISO dates is that people will assume
> they mean something related to the tag, as opposed to simply naming a
> time the authority name is valid. If you see
>     tag:heinz.com,2001-04-30:baked-beans
> you might well think that date had SOMETHING to do with the beans, or
> at least the time Heinz introduced that kind of beans. But of course
> it doesn't. So I think
>     tag:heinz.com,1:baked-beans
> is more appropriate.

I can see your motivations for compressing the date encoding, but I
just don't see the average web user adopting such a methodology. If
these are supposed to be human-minted names, based on someone having
to think about what the name should be, then the date compression is
just too complex for broad adoption (IMHO of course).

The potential for folks to misinterpret the date as anything other
than the date of minting should simply be addressed (probably
pedantically) in documentation, tutorials, etc.

> ...
> >You'll probably want to constrain what this name substring portion
> >can include a little, possibly excluding whitespace and special
> >characters, etc. Otherwise, it could choke various tools and
> >applications and lead to technically valid yet unintended abuse of
> >the URI scheme.
>
> In the actual spec we limit it to URIChar*, of course.

Right. Missed that.

> >and perhaps a semantically "pre-loaded" scheme
> >identifier 'id' were used,
>
> I like the name id, too. I'm also fond of token. But tag seems
> okay, and it's getting known by that. I first called it "tann" for
> Time/Authority-Name/Name.

The name of the URI is really an issue of "marketing".
Actually, I am thinking that 'name' (as used above) would be the
easiest to "sell" to the general public.

Still, going back to my argument about whether tags are URNs, I'd say
they definitely are, and that what may actually be needed is the
proper support for the 'urn:' URI scheme, and that tags could be one
valid urn: sub-scheme and names could be another, with different
purposes (the former being arbitrary, temporally bound identifiers and
the latter being universal names for abstract concepts). Thus:

    urn:tag:heinz.com,2001-04-30:baked-beans

and

    urn:name:metia.nokia.com:MARS/2.1/status/approved

though if we simply make the date optional (for when needed), we could
probably sell a single URN/URI more easily, e.g.:

    name:heinz.com,2001-04-30:baked-beans
    name:patrick.stickler@nokia.com:myDataTypes/gazonka-big
    name:patrick.stickler@nokia.com,2001-05-22:myMotorcycle
    name:metia.nokia.com:MARS/2.1/status/approved
    name:dublincore.org/elements/1.1/Title
    name:dublincore.org/elements/1.1/Creator
    name:prismstandard.org/1.0/creationTime
    name:iso.ch/3166-1/fi    (the country "Finland")
    name:iso.ch/639/fi       (the language "Finnish")

Eh?

(all of the above are merely examples, apologies to the various
authorities mentioned)

> >"id:metia.nokia.com:MARS/2.1/status/approved", then we could have a
>
> But what happens when Nokia loses a trademark battle with M&M MARS Co,
> which legally gets nokia.com, etc, etc. ? Without the date you
> constrain future domain holders in a way which may be neither legal
> nor practical -- what if Nokia loses the records of what names it has
> minted? With tags, they would just start using some later date,
> probably the most recent year start or "2" (assuming we're into 2002
> by now).

This is simply part of the larger issue of trademarks, product names,
and copyrights -- and there are lots of guidelines and precedents to
apply to such cases.
The same argument applies to URLs used as URNs, to tags, and to any
other public data used by a business, person or other entity. The
presence or absence of a date in the URN will likely have no
significance in whether or not Nokia could continue to use it, as the
presumed trademark infringement is not because of the date.

> As for the Semantic Web.... well,.... yeah. Something like this
> could be nice. :-)

Agreed.

BTW, I'll be in Boston (Burlington) next week, and would have the
evening of the 14th (Thursday) free. Would you be interested in
getting together somewhere for a chat and e.g. a few beers? Maybe Ora
and some of the other local RDF folks might want to join us?

Cheers,

Patrick

--
Patrick Stickler                      Phone:  +358 3 356 0209
Senior Research Scientist             Mobile: +358 50 483 9453
Software Technology Laboratory        Fax:    +358 7180 35409
Nokia Research Center                 Video:  +358 3 356 0209 / 4227
Visiokatu 1, 33720 Tampere, Finland   Email:  patrick.stickler@nokia.com
Received on Friday, 8 June 2001 04:24:38 UTC