- From: Austin William Wright <aaa@bzfx.net>
- Date: Sat, 4 Oct 2014 20:07:07 -0700
- To: Larry Masinter <masinter@adobe.com>
- Cc: John C Klensin <klensin@jck.com>, David Sheets <kosmo.zb@gmail.com>, Sam Ruby <rubys@intertwingly.net>, "public-urispec@w3.org" <public-urispec@w3.org>, Anne van Kesteren <annevk@annevk.nl>
- Message-ID: <CANkuk-WVFcRrceATnx=P1AEKSKFEYH5rWXUPPnPbTqmkwGcgNA@mail.gmail.com>
On Fri, Oct 3, 2014 at 5:47 PM, Larry Masinter <masinter@adobe.com> wrote:
> I think working on the problem statement is a good idea.
>
> I raised an issue trying to be neutral about who is 'right', just
> referencing (indirectly) the issues and specs.
>
> https://github.com/urispec/urispec/issues/1
>

I propose the following problem statement:

Many software applications utilize universal identifiers so that systems can refer to resources residing in other systems entirely. The URI, while seeing near-universal adoption, has many subtle inconsistencies in implementations that threaten the ability of different systems to refer to each other's resources, leading to fragmentation and the development of workarounds.

A mission statement (charter?) would follow:

To promote the convergence of the behavior of universal identifiers across all applications, by identifying inconsistencies and proposing resolutions, including in: databases, Web browsers, other Web user-agents, XML parsers, a plethora of JSON Hypermedia formats like JSON Schema and Hydra, Semantic Web applications and file formats like Turtle, protocols like HTTP and CoAP, compact notations like CURIE, and more.

Part of the problem is that we need to be absolutely crystal-clear and consistent about what terms we use. There are many specs _about_ identifiers, but they all do different things:

* URI: Authoritatively defined in RFC3986, being a 7-bit ASCII string consisting of a scheme, colon, then hier-part, then optional query and optional fragment. This is by far the most cited and implemented specification for network-addressable identifiers, and possibly one of the most cited RFCs, period (if not RFC2119).
* IRI: Defined in RFC3987 as a generalization of the URI to Unicode.
* URL: The URI was created as a generalization of the URL, though as of RFC3986 the URL is defined in terms of the URI, as the subset of the URIs that are network-addressable (i.e. "//").
  Most of the time when people want a URI, they really want a URL, and more specifically, an HTTP URL.
* URN: Likewise defined in terms of the URI, as a subset consisting only of URIs that are not network-addressable.

Because an IRI, URI, URL, and URN all contain a scheme, they are called "absolute".

Some standards defined their own set of strings largely compatible with URIs, mostly for technical reasons. For example, RDF 1.0 defined "RDF URI References" because it predates RFC3986 (so named despite being absolute). RDF 1.1 now formally uses IRIs.

There's also the class of strings called URI References, or URIRefs for short. They are resolved against an absolute URI, and are thus said to be "relative". The same term exists for IRIs. It tends to be called a URI Reference even if the class in question is an IRI or URL, though this doesn't produce any ambiguity to my knowledge.

If we need to talk about how Web browsers implement URIs (or implement them _differently_), I propose the term Web Browser Address. I might adopt the acronym WBA.

For the concept of the URI, URL, IRI, etc., where the meaning of the string is uniform across time and space (as opposed to document-local ids), I will simply use the term "identifier" or "universal identifier".

I would propose the following deliverables:

(1) Can we formally adopt this terminology? (An adoption does not mean we are re-defining the terms and further adding to the confusion, but we'd be saying "That over there, found in pre-existing normative text, is the authoritative definition we refer to.") Any corrections or additions? Would I be correct in saying this is value-free?

(2) What, exactly, are the incompatibilities between implementations? Why do Web browsers have a different spec or implementation *at all*?

... Or at least number (2) (as I proposed earlier).

Austin.
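
To make the terminology in this message concrete, here is a minimal Python sketch. It is not taken from any spec; it assumes the rough distinctions drawn above (a scheme makes an identifier absolute, "//" after the scheme marks a URL, non-ASCII characters mark an IRI). The function name `classify` and the example identifiers are hypothetical, and Python's `urllib.parse` is only one implementation, one that may itself differ from others in edge cases.

```python
from urllib.parse import urlsplit, urljoin


def classify(identifier: str) -> str:
    """Rough classification using the terms as this message uses them.

    Assumption: this mirrors the message's informal definitions,
    not the normative grammar of RFC3986 or RFC3987.
    """
    parts = urlsplit(identifier)
    if not parts.scheme:
        # No scheme: must be resolved against an absolute identifier.
        return "relative reference"
    if not identifier.isascii():
        # Absolute, but outside 7-bit ASCII.
        return "IRI"
    # Everything after the first colon (the scheme delimiter).
    _, rest = identifier.split(":", 1)
    if rest.startswith("//"):
        # Network-addressable in the message's sense.
        return "URL"
    # Absolute, ASCII, no "//": a URN in the message's sense.
    return "URN"


# Components of a URI: scheme, hier-part, optional query and fragment.
parts = urlsplit("http://example.org/a/b?q=1#top")
print(parts.scheme, parts.netloc, parts.path, parts.query, parts.fragment)

# A relative reference resolved against an absolute base.
print(urljoin("http://example.org/a/b/c", "../d?x=1"))

for s in ["http://example.org/a/b?q=1#top", "urn:isbn:0451450523",
          "../d?x=1", "http://example.org/caf\u00e9"]:
    print(s, "->", classify(s))
```

Running the same inputs through a browser's URL parser and comparing the results would be one cheap way to start collecting the incompatibilities asked about in (2).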
Received on Sunday, 5 October 2014 03:07:35 UTC