- From: Gregg Kellogg <gregg@greggkellogg.net>
- Date: Thu, 19 Feb 2015 12:25:51 -0800
- To: Dan Brickley <danbri@google.com>
- Cc: Kingsley Idehen <kidehen@openlinksw.com>, W3C Web Schemas Task Force <public-vocabs@w3.org>
> On Feb 19, 2015, at 6:39 AM, Dan Brickley <danbri@google.com> wrote:
>
> On 17 February 2015 at 23:30, Gregg Kellogg <gregg@greggkellogg.net> wrote:
>
>> This isn't a JSON-LD heuristic, but a general web server mechanism to handle
>> ill-formed URLs. As it happens, although it's common practice,
>> http://schema.org is not a valid URL, as it doesn't have a path component.
>
> Which URL spec are we going by here? Lots of URLs lack path components.
>
> e.g. 3.3 of https://tools.ietf.org/html/rfc3986 says
> "If a URI contains an authority component, then the path component
> must either be empty or begin with a slash ("/") character."
>
> Or HTTP URIs per RFC-7230 "Hypertext Transfer Protocol (HTTP/1.1):
> Message Syntax and Routing" which largely defers to that work:
>
> http://tools.ietf.org/html/rfc7230#section-2.7
>
> """
> 2.7. Uniform Resource Identifiers
>
> Uniform Resource Identifiers (URIs) [RFC3986] are used throughout
> HTTP as the means for identifying resources (Section 2 of [RFC7231]).
> URI references are used to target requests, indicate redirects, and
> define relationships.
>
> The definitions of "URI-reference", "absolute-URI", "relative-part",
> "scheme", "authority", "port", "host", "path-abempty", "segment",
> "query", and "fragment" are adopted from the URI generic syntax. An
> "absolute-path" rule is defined for protocol elements that can
> contain a non-empty path component. (This rule differs slightly from
> the path-abempty rule of RFC 3986, which allows for an empty path to
> be used in references, and path-absolute rule, which does not allow
> paths that begin with "//".) A "partial-URI" rule is defined for
> protocol elements that can contain a relative URI but not a fragment
> component.
>
>   URI-reference = <URI-reference, see [RFC3986], Section 4.1>
>   absolute-URI  = <absolute-URI, see [RFC3986], Section 4.3>
>   relative-part = <relative-part, see [RFC3986], Section 4.2>
>   scheme        = <scheme, see [RFC3986], Section 3.1>
>   authority     = <authority, see [RFC3986], Section 3.2>
>   uri-host      = <host, see [RFC3986], Section 3.2.2>
>   port          = <port, see [RFC3986], Section 3.2.3>
>   path-abempty  = <path-abempty, see [RFC3986], Section 3.3>
>   segment       = <segment, see [RFC3986], Section 3.3>
>   query         = <query, see [RFC3986], Section 3.4>
>   fragment      = <fragment, see [RFC3986], Section 3.5>
>
>   absolute-path = 1*( "/" segment )
>   partial-URI   = relative-part [ "?" query ]
>
> Each protocol element in HTTP that allows a URI reference will
> indicate in its ABNF production whether the element allows any form
> of reference (URI-reference), only a URI in absolute form
> (absolute-URI), only the path and optional query components, or some
> combination of the above. Unless otherwise indicated, URI references
> are parsed relative to the effective request URI (Section 5.5).
> """
>
> As far as I can see the absolute-path construction is only used in
> non-URL settings i.e. protocol headers.
>
> My reading is that in JSON-LD 'http://schema.org' serves to identify
> an URL from which a context can be acquired. We have wired up the
> relevant server-side voodoo such that this works e.g. via: curl -H
> "Accept: application/ld+json" http://schema.org
>
> ... where is it written that http://schema.org is a bad http URL?
> (genuine question not rhetorical :)

Yes, I stand corrected, path-abempty is defined as *( "/" segment ) so the empty string is indeed legitimate; however, it is not in normal form. From 6.2.3 [1]:

[[[
In general, a URI that uses the generic syntax for authority with an empty path should be normalized to a path of "/".
]]]

As Kingsley points out, though, <http://schema.org> and <http://schema.org/> are two different resources in the strict RDF sense.
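
To make the normalization point concrete, here is a minimal Python sketch. It assumes only the standard library's urllib.parse; the helper names normalize_empty_path and equivalent are made up for illustration and come from no spec or library. It applies just the one 6.2.3 rule quoted above (empty path becomes "/") and none of the other normalization steps.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize_empty_path(url: str) -> str:
    """Hypothetical helper: apply only the RFC 3986 6.2.3 empty-path rule."""
    parts = urlsplit(url)
    # Only touch http(s) URLs that have an authority and an empty path.
    if parts.scheme in ("http", "https") and parts.netloc and parts.path == "":
        parts = parts._replace(path="/")
    return urlunsplit(parts)


def equivalent(a: str, b: str) -> bool:
    """Hypothetical helper: compare two URLs after the empty-path rule."""
    return normalize_empty_path(a) == normalize_empty_path(b)


# Plain string comparison (the strict RDF view) treats these as different.
print("http://schema.org" == "http://schema.org/")
# Comparison after normalization treats them as the same.
print(equivalent("http://schema.org", "http://schema.org/"))
```

The first comparison is the strict RDF reading, where the two IRIs are simply different strings; the second is the comparison-after-normalization reading.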

From a linked data perspective, how this is interpreted depends on whether the response is a 200, 301, 303 or 304, in terms of how relative URLs in the retrieved resource are interpreted.

With all intention of averting a perma-thread, RFC3986 does indicate that a redirect from, say, <http://schema.org> to <http://schema.org/> indicates that they are equivalent resources. Not redirecting doesn't really say anything. It also says that when comparing URLs they should be normalized, and in 6.2.3 [1] specifically says that <http://example.com> and <http://example.com/> are equivalent (because of the definition of the HTTP scheme).

RFC2616 3.2.2 goes on to say [2]:

[[[
If the abs_path is not present in the URL, it MUST be given as "/" when used as a Request-URI for a resource (section 5.1.2)
]]]

But this is how a server should treat such a request. It also says:

[[[
An empty abs_path is equivalent to an abs_path of "/".
]]]

This comes down to what should be promoted as best practice, but IMO, the <http://schema.org> meme is out there in the public, and isn't likely to go away any time soon. In our examples and literature, however, it might be best to stick with using <http://schema.org/>.

In any case, for the use in @context, http://schema.org is legitimate (though not normal) and will be interpreted by implementations the same as http://schema.org/. For use in @vocab, or in a prefix definition, the trailing slash _is_ important.

Gregg

[1] http://tools.ietf.org/html/rfc3986#section-6.2.3
[2] http://tools.ietf.org/html/rfc2616#section-3.2.2

> An equal counter question: where is it written that such an url would
> be dereferenced by requesting '/' ? Or is this just a convention?
>
> Dan
>
> ps. anyone studied HTTP2's for these issues yet?
> https://tools.ietf.org/html/draft-ietf-httpbis-http2-17
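
On the @vocab and prefix remark in the reply above, a short Python sketch of why the trailing slash matters there. JSON-LD expands a term against @vocab, and a compact IRI against its prefix, by string concatenation, so no URL normalization gets a chance to run. The function name expand_term is hypothetical and only mimics that one step; it is not a JSON-LD processor.

```python
def expand_term(vocab: str, term: str) -> str:
    """Hypothetical helper: mimic @vocab / prefix expansion, which is
    plain concatenation of the mapping and the term or suffix."""
    return vocab + term


# Without the trailing slash the host name and the term run together.
print(expand_term("http://schema.org", "Person"))
# With the trailing slash the intended IRI comes out.
print(expand_term("http://schema.org/", "Person"))
```

This is the difference from the @context case: there the value is dereferenced as a URL, so the server and client treat the empty path as "/", whereas here the string is used as-is.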
Received on Thursday, 19 February 2015 20:26:22 UTC