- From: Gregg Kellogg <gregg@greggkellogg.net>
- Date: Thu, 19 Feb 2015 12:25:51 -0800
- To: Dan Brickley <danbri@google.com>
- Cc: Kingsley Idehen <kidehen@openlinksw.com>, W3C Web Schemas Task Force <public-vocabs@w3.org>
> On Feb 19, 2015, at 6:39 AM, Dan Brickley <danbri@google.com> wrote:
>
> On 17 February 2015 at 23:30, Gregg Kellogg <gregg@greggkellogg.net> wrote:
>
>> This isn't a JSON-LD heuristic, but a general web server mechanism to handle
>> ill-formed URLs. As it happens, although it's common practice,
>> http://schema.org is not a valid URL, as it doesn't have a path component.
>
> Which URL spec are we going by here? Lots of URLs lack path components.
>
> e.g. 3.3 of https://tools.ietf.org/html/rfc3986 says
> "If a URI contains an authority component, then the path component
> must either be empty or begin with a slash ("/") character."
>
> Or HTTP URIs per RFC-7230 "Hypertext Transfer Protocol (HTTP/1.1):
> Message Syntax and Routing" which largely defers to that work:
>
> http://tools.ietf.org/html/rfc7230#section-2.7
>
> """
> 2.7. Uniform Resource Identifiers
>
> Uniform Resource Identifiers (URIs) [RFC3986] are used throughout
> HTTP as the means for identifying resources (Section 2 of [RFC7231]).
> URI references are used to target requests, indicate redirects, and
> define relationships.
>
> The definitions of "URI-reference", "absolute-URI", "relative-part",
> "scheme", "authority", "port", "host", "path-abempty", "segment",
> "query", and "fragment" are adopted from the URI generic syntax. An
> "absolute-path" rule is defined for protocol elements that can
> contain a non-empty path component. (This rule differs slightly from
> the path-abempty rule of RFC 3986, which allows for an empty path to
> be used in references, and path-absolute rule, which does not allow
> paths that begin with "//".) A "partial-URI" rule is defined for
> protocol elements that can contain a relative URI but not a fragment
> component.
>
> URI-reference = <URI-reference, see [RFC3986], Section 4.1>
> absolute-URI = <absolute-URI, see [RFC3986], Section 4.3>
> relative-part = <relative-part, see [RFC3986], Section 4.2>
> scheme = <scheme, see [RFC3986], Section 3.1>
> authority = <authority, see [RFC3986], Section 3.2>
> uri-host = <host, see [RFC3986], Section 3.2.2>
> port = <port, see [RFC3986], Section 3.2.3>
> path-abempty = <path-abempty, see [RFC3986], Section 3.3>
> segment = <segment, see [RFC3986], Section 3.3>
> query = <query, see [RFC3986], Section 3.4>
> fragment = <fragment, see [RFC3986], Section 3.5>
>
> absolute-path = 1*( "/" segment )
> partial-URI = relative-part [ "?" query ]
>
> Each protocol element in HTTP that allows a URI reference will
> indicate in its ABNF production whether the element allows any form
> of reference (URI-reference), only a URI in absolute form
> (absolute-URI), only the path and optional query components, or some
> combination of the above. Unless otherwise indicated, URI references
> are parsed relative to the effective request URI (Section 5.5).
> """
>
> As far as I can see the absolute-path construction is only used in
> non-URL settings i.e. protocol headers.
>
> My reading is that in JSON-LD 'http://schema.org' serves to identify
> an URL from which a context can be acquired. We have wired up the
> relevant server-side voodoo such that this works e.g. via: curl -H
> "Accept: application/ld+json" http://schema.org
>
> ... where is it written that http://schema.org is a bad http URL?
> (genuine question not rhetorical :)
Yes, I stand corrected: path-abempty is defined as *( "/" segment ), so the empty path is indeed legitimate; however, it is not in normal form. From RFC3986 6.2.3 [1]:
[[[
In general, a URI that uses the generic syntax for authority with an empty path should be normalized to a path of "/".
]]]
As Kingsley points out, though, <http://schema.org> and <http://schema.org/> are two different resources in the strict RDF sense. From a linked data perspective, the interpretation depends on whether the response is a 200, 301, 303 or 304, since that affects how relative URLs in the retrieved resource are interpreted. With every intention of averting a perma-thread, RFC3986 does say that a redirect from, say, <http://schema.org> to <http://schema.org/> indicates that they are equivalent resources. Not redirecting doesn't really say anything. It also says that URLs should be normalized before they are compared, and 6.2.3 [1] specifically says that <http://example.com> and <http://example.com/> are equivalent (because of the definition of the HTTP scheme).
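To make the comparison rule concrete, here is a minimal Python sketch of just the empty-path step of scheme-based normalization. The helper name normalize_empty_path is made up for illustration, and the sketch deliberately ignores the other normalizations in RFC3986 section 6 (case, percent-encoding, default ports), so treat it as an approximation rather than a complete comparison routine.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize_empty_path(url: str) -> str:
    """Rewrite an http(s) URL that has an authority but an empty path to use "/".

    Hypothetical helper: covers only the empty-path case from RFC3986 6.2.3.
    """
    parts = urlsplit(url)
    if parts.scheme in ("http", "https") and parts.netloc and parts.path == "":
        parts = parts._replace(path="/")
    return urlunsplit(parts)


# Both spellings normalize to the same string, so they compare as equivalent.
print(normalize_empty_path("http://schema.org"))   # http://schema.org/
print(normalize_empty_path("http://schema.org/"))  # http://schema.org/
```

Note that this only says the two URLs are equivalent for comparison purposes; it does not change the point that RDF treats the two strings as distinct identifiers.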
RFC2616 3.2.2 goes on to say [2]:
[[[
If the abs_path is not present in the URL, it MUST be given as "/" when used as a Request-URI for a resource (section 5.1.2)
]]]
But, this is how a server should treat such a request.
It also says:
[[[
An empty abs_path is equivalent to an abs_path of "/".
]]]
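As a rough illustration of the two RFC2616 statements quoted above, the following Python sketch derives the Request-URI a client would put on the request line. The helper names request_target and request_line are hypothetical and not part of any HTTP library; real clients do considerably more (proxies, encoding, and so on).

```python
from urllib.parse import urlsplit


def request_target(url: str) -> str:
    """Return the path-and-query form of the Request-URI for an http URL.

    Hypothetical helper: an empty abs_path is given as "/", per RFC2616 3.2.2.
    """
    parts = urlsplit(url)
    path = parts.path or "/"
    return path + ("?" + parts.query if parts.query else "")


def request_line(url: str) -> str:
    """Build an HTTP/1.1 GET request line for the given URL (illustration only)."""
    return "GET " + request_target(url) + " HTTP/1.1"


# With or without the trailing slash, the request on the wire is the same.
print(request_line("http://schema.org"))   # GET / HTTP/1.1
print(request_line("http://schema.org/"))  # GET / HTTP/1.1
```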
This comes down to what should be promoted as best practice, but IMO, the <http://schema.org> meme is out there in the public, and isn't likely to go away any time soon. In our examples and literature, however, it might be best to stick with using <http://schema.org/>.
In any case, for the use in @context, http://schema.org is legitimate (though not normal) and will be interpreted by implementations the same as http://schema.org/. For use in @vocab, or in a prefix definition, the trailing slash _is_ important.
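The reason the slash matters for @vocab and prefixes is that JSON-LD expands such terms by plain string concatenation of the vocabulary or prefix IRI with the term. A minimal Python sketch of that one step follows; expand_term is a hypothetical stand-in for it, not the full IRI expansion algorithm and not the API of any JSON-LD library.

```python
def expand_term(vocab: str, term: str) -> str:
    """Expand a term against a vocabulary (or prefix) IRI by concatenation.

    Hypothetical helper: models only the concatenation step of JSON-LD
    IRI expansion, with no keyword, compact IRI, or relative IRI handling.
    """
    return vocab + term


# With the trailing slash the term lands where schema.org defines it.
print(expand_term("http://schema.org/", "name"))  # http://schema.org/name

# Without it, the result is a different IRI, and not one schema.org defines.
print(expand_term("http://schema.org", "name"))   # http://schema.orgname
```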
Gregg
[1] http://tools.ietf.org/html/rfc3986#section-6.2.3
[2] http://tools.ietf.org/html/rfc2616#section-3.2.2
> An equal counter question: where is it written that such an url would
> be dereferenced by requesting '/' ? Or is this just a convention?
>
> Dan
>
> ps. anyone studied HTTP2's for these issues yet?
> https://tools.ietf.org/html/draft-ietf-httpbis-http2-17
Received on Thursday, 19 February 2015 20:26:22 UTC