
Re: JSON-LD onsite examples: are @context values missing a trailing slash?

From: Gregg Kellogg <gregg@greggkellogg.net>
Date: Thu, 19 Feb 2015 12:25:51 -0800
Cc: Kingsley Idehen <kidehen@openlinksw.com>, W3C Web Schemas Task Force <public-vocabs@w3.org>
Message-Id: <425DE9F8-4CBE-4CC9-864C-7171ACDB4D37@greggkellogg.net>
To: Dan Brickley <danbri@google.com>
> On Feb 19, 2015, at 6:39 AM, Dan Brickley <danbri@google.com> wrote:
> 
> On 17 February 2015 at 23:30, Gregg Kellogg <gregg@greggkellogg.net> wrote:
> 
>> This isn't a JSON-LD heuristic, but a general web server mechanism to handle
>> ill-formed URLs. As it happens, although it's common practice,
>> http://schema.org is not a valid URL, as it doesn't have a path component.
> 
> Which URL spec are we going by here? Lots of URLs lack path components.
> 
> e.g. 3.3 of https://tools.ietf.org/html/rfc3986 says
> "If a URI contains an authority component, then the path component
>   must either be empty or begin with a slash ("/") character."
> 
> Or HTTP URIs per RFC-7230 "Hypertext Transfer Protocol (HTTP/1.1):
> Message Syntax and Routing" which largely defers to that work:
> 
> http://tools.ietf.org/html/rfc7230#section-2.7
> 
> """
> 2.7.  Uniform Resource Identifiers
> 
>   Uniform Resource Identifiers (URIs) [RFC3986] are used throughout
>   HTTP as the means for identifying resources (Section 2 of [RFC7231]).
>   URI references are used to target requests, indicate redirects, and
>   define relationships.
> 
>   The definitions of "URI-reference", "absolute-URI", "relative-part",
>   "scheme", "authority", "port", "host", "path-abempty", "segment",
>   "query", and "fragment" are adopted from the URI generic syntax.  An
>   "absolute-path" rule is defined for protocol elements that can
>   contain a non-empty path component.  (This rule differs slightly from
>   the path-abempty rule of RFC 3986, which allows for an empty path to
>   be used in references, and path-absolute rule, which does not allow
>   paths that begin with "//".)  A "partial-URI" rule is defined for
>   protocol elements that can contain a relative URI but not a fragment
>   component.
> 
>     URI-reference = <URI-reference, see [RFC3986], Section 4.1>
>     absolute-URI  = <absolute-URI, see [RFC3986], Section 4.3>
>     relative-part = <relative-part, see [RFC3986], Section 4.2>
>     scheme        = <scheme, see [RFC3986], Section 3.1>
>     authority     = <authority, see [RFC3986], Section 3.2>
>     uri-host      = <host, see [RFC3986], Section 3.2.2>
>     port          = <port, see [RFC3986], Section 3.2.3>
>     path-abempty  = <path-abempty, see [RFC3986], Section 3.3>
>     segment       = <segment, see [RFC3986], Section 3.3>
>     query         = <query, see [RFC3986], Section 3.4>
>     fragment      = <fragment, see [RFC3986], Section 3.5>
> 
>     absolute-path = 1*( "/" segment )
>     partial-URI   = relative-part [ "?" query ]
> 
>   Each protocol element in HTTP that allows a URI reference will
>   indicate in its ABNF production whether the element allows any form
>   of reference (URI-reference), only a URI in absolute form
>   (absolute-URI), only the path and optional query components, or some
>   combination of the above.  Unless otherwise indicated, URI references
>   are parsed relative to the effective request URI (Section 5.5).
> """
> 
> As far as I can see the absolute-path construction is only used in
> non-URL settings i.e. protocol headers.
> 
> My reading is that in JSON-LD 'http://schema.org' serves to identify
> an URL from which a context can be acquired. We have wired up the
> relevant server-side voodoo such that this works e.g. via: curl -H
> "Accept: application/ld+json" http://schema.org
> 
> ... where is it written that http://schema.org is a bad http URL?
> (genuine question not rhetorical :)

Yes, I stand corrected: path-abempty is defined as *( "/" segment ), so the empty string is indeed legitimate; however, it is not in normal form. From RFC3986 6.2.3 [1]:

[[[
 In general, a URI that uses the generic syntax for authority with an empty path should be normalized to a path of "/".
]]]

As Kingsley points out, though, <http://schema.org> and <http://schema.org/> are two different resources in the strict RDF sense. From a linked data perspective, the interpretation depends on whether the response is a 200, 301, 303 or 304, which determines how relative URLs in the retrieved resource are resolved. With all intention of averting a perma-thread, RFC3986 does indicate that a redirect from, say, <http://schema.org> to <http://schema.org/> indicates that they are equivalent resources. Not redirecting doesn't really say anything. It also says that URLs should be normalized when they are compared, and 6.2.3 [1] specifically says that <http://example.com> and <http://example.com/> are equivalent (because of the definition of the HTTP scheme).
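
To make the normalization rule concrete, here is a minimal Python sketch of the empty-path case only. The function names are hypothetical, and it deliberately ignores the other normalizations in RFC3986 section 6 (case, percent-encoding, dot segments, default ports), so treat it as an illustration rather than a complete comparison routine.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize_empty_path(url: str) -> str:
    """Rewrite an empty path to "/" when the URL has an authority component.

    Hypothetical helper: it covers only the empty-path rule quoted above.
    """
    parts = urlsplit(url)
    if parts.netloc and parts.path == "":
        parts = parts._replace(path="/")
    return urlunsplit(parts)


def equivalent(a: str, b: str) -> bool:
    """Compare two URLs after applying the empty-path normalization."""
    return normalize_empty_path(a) == normalize_empty_path(b)


if __name__ == "__main__":
    print(normalize_empty_path("http://schema.org"))
    print(equivalent("http://example.com", "http://example.com/"))
```

A plain string comparison of the two forms would report them as different, which is the strict RDF view; the normalized comparison treats them as the same.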

RFC2616 3.2.2 goes on to say [2]:

[[[
If the abs_path is not present in the URL, it MUST be given as "/" when used as a Request-URI for a resource (section 5.1.2)
]]]

But, this is how a server should treat such a request.

It also says:

[[[
An empty abs_path is equivalent to an abs_path of "/".
]]]
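
As a rough illustration of the Request-URI rule just quoted, the following Python sketch derives the request line a client would send for a given URL. The helper names are made up for this example, and it assumes a plain origin-form request (no proxy, no fragment handling beyond what urlsplit does).

```python
from urllib.parse import urlsplit


def request_target(url: str) -> str:
    """Return the path-and-query part to send, using "/" for an empty path."""
    parts = urlsplit(url)
    path = parts.path or "/"  # empty abs_path is sent as "/"
    return path + ("?" + parts.query if parts.query else "")


def request_line(url: str, method: str = "GET") -> str:
    """Build an HTTP/1.1 request line for the URL (illustrative only)."""
    return f"{method} {request_target(url)} HTTP/1.1"


if __name__ == "__main__":
    print(request_line("http://schema.org"))
    print(request_line("http://schema.org/"))
```

Both calls produce the same request line, so on the wire the server cannot tell which of the two forms the client started from.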

This comes down to what should be promoted as best practice, but IMO, the <http://schema.org> meme is out there in the public, and isn't likely to go away any time soon. In our examples and literature, however, it might be best to stick with using <http://schema.org/>.

In any case, for the use in @context, http://schema.org is legitimate (though not normal) and will be interpreted by implementations the same as http://schema.org/. For use in @vocab, or in a prefix definition, the trailing slash _is_ important.
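
To show why the slash matters for @vocab and prefixes, here is a simplified Python sketch. In JSON-LD, a term is expanded against @vocab, and a compact IRI against its prefix, by plain string concatenation; the function names below are hypothetical, and the real expansion algorithm has more steps (keyword handling, term definitions, and so on) that are left out here.

```python
def expand_with_vocab(vocab: str, term: str) -> str:
    """Expand a term against an @vocab value by concatenation (simplified)."""
    return vocab + term


def expand_compact_iri(prefixes: dict, compact: str) -> str:
    """Expand "prefix:suffix" using a prefix-to-IRI mapping (simplified)."""
    prefix, _, suffix = compact.partition(":")
    return prefixes[prefix] + suffix


if __name__ == "__main__":
    # With the trailing slash the result is the intended schema.org IRI.
    print(expand_with_vocab("http://schema.org/", "Person"))
    # Without it, the term is glued directly onto the host name.
    print(expand_with_vocab("http://schema.org", "Person"))
    print(expand_compact_iri({"schema": "http://schema.org/"}, "schema:name"))
```

No normalization is applied during this concatenation, so the equivalence that holds when dereferencing a @context URL does not help here.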

Gregg

[1] http://tools.ietf.org/html/rfc3986#section-6.2.3
[2] http://tools.ietf.org/html/rfc2616#section-3.2.2
 
> An equal counter question: where is it written that such an url would
> be dereferenced by requesting '/' ? Or is this just a convention?
> 
> Dan
> 
> ps. anyone studied HTTP2's for these issues yet?
> https://tools.ietf.org/html/draft-ietf-httpbis-http2-17
Received on Thursday, 19 February 2015 20:26:22 UTC