- From: Larry Masinter <masinter@adobe.com>
- Date: Tue, 9 Dec 2014 16:59:18 +0000
- To: "julian.reschke@gmx.de" <julian.reschke@gmx.de>, Bjoern Hoehrmann <derhoermi@gmx.net>, Sam Ruby <rubys@intertwingly.net>
- CC: "uri@w3.org" <uri@w3.org>
In response to discussion on public-ietf-w3c:

> > * Standardize on the term URL. URI and IRI are just
> > confusing. In practice a single algorithm is used for both
> > so keeping them distinct is not helping anyone. URL also
> > easily wins the search result popularity contest.
> > ...
>
> This ignores the fact that RFC 3986 defines URI as the superset of URNs
> and URLs. (And yes, some schemes can be both).
> I understand that the browser people are not very interested in URNs,
> but many people in the IETF are. Pretending that they do not exist and
> that it makes sense to call them URLs will IMHO not work very well.

In computer science theory, the role of "identifier" can be played by almost any string or data structure which is communicated to "stand for" something else; the role of "identifier" can further be described as a "location" or a "name" depending on how much the "identifier" corresponds to information useful in computing the location or access method for whatever is being identified. But these are not precisely defined roles.

In the history of the web, the terms URL, URI, URN, IRI, and various other, even more obscure, terms have been variously used for different constructs, to capture some distinctions:

URI = URL + URN: that is, to split the space of identifiers between those that are defined by a 'namespace authority' and thus not a locator, and the rest. Even informally, it is acknowledged that the distinction is fuzzy.

IRI vs. URI: to separate those that are restricted to sequences of a limited subset of characters (not even all of ASCII) and those that are not, with some ambiguity about which repertoire is or isn't allowed (e.g., spaces), giving rise to odd constructs like LEIRI (Legacy Extended IRI, being IRIs in which spaces are allowed).

Relative vs. Absolute: we variously include or exclude relative forms, to be combined with a "Base".

These distinctions, while well-intentioned, have also been confusing.
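As a rough illustration only (not taken from any of the documents under discussion), the three distinctions above can be treated as independent qualifiers on one string type. The function name `describe` and the keys it returns are hypothetical, and the ASCII test is a deliberate simplification: RFC 3986 allows only a subset of ASCII, which this sketch does not check.

```python
import re

# Scheme syntax as in RFC 3986: a letter, then letters, digits, "+", "-" or ".",
# terminated by ":". Anything without such a prefix is treated as relative.
_SCHEME = re.compile(r"^([A-Za-z][A-Za-z0-9+.\-]*):")


def describe(identifier: str) -> dict:
    """Return the qualifiers that apply to an identifier string.

    Hypothetical helper: it only looks at the scheme prefix and the
    character repertoire, and does no further validation.
    """
    match = _SCHEME.match(identifier)
    scheme = match.group(1).lower() if match else None
    return {
        "absolute": scheme is not None,          # has a scheme, needs no Base
        "urn": scheme == "urn",                  # starts with "urn:"
        "ascii_only": identifier.isascii(),      # crude URI-vs-IRI test
    }


if __name__ == "__main__":
    for example in ("urn:isbn:0451450523", "http://example.com/päge", "../a/b"):
        print(example, describe(example))
```

Seen this way, a string such as "urn:example:café" is simply one that is both "urn" and not "ascii_only"; the qualifiers combine rather than each needing its own noun.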
What is the name for identifiers that start with "urn:" but contain non-ASCII characters? If a URN is a URI, then are we also defining IRNs (Internationalized Resource Names)?

The question is: what name should we use for what this document defines, and which other constructs should also be defined in this document, vs. leaving alone the current definitions.

The documents Sam points us to currently define "URL" as the superset, and include as a goal to remove RFC 3986 (defining URI) and RFC 3987 (defining IRI), but they don't yet include a sufficient new definition of those other terms (URI, IRI) even though they are still in use.

I'm OK with using "URL" as the most liberal noun, and introducing qualifiers as adjectives. I’m OK with defining "URN" as "a kind of URL that starts with 'urn:'", and explaining how they're not currently generally useful as locators, although many have had ambition to make them so.

That is, we're not claiming they don't exist, but we are claiming that it can make sense to say a URN is a kind of URL, just because that's how we define it.

Larry
--
http://larry.masinter.net
Received on Tuesday, 9 December 2014 17:00:09 UTC