Re: Interpreting JSON as JSON-LD / Content-Type header spec. requirements from Christopher Johnson on 2017-10-27 (public-linked-json@w3.org from October 2017)

From: Christopher Johnson <chjohnson39@gmail.com>
Date: Fri, 27 Oct 2017 06:55:44 +0200
To: Robert Sanderson <azaroth42@gmail.com>
Cc: Gregg Kellogg <gregg@greggkellogg.net>, Martynas Jusevičius <martynas@atomgraph.com>, Linked JSON <public-linked-json@w3.org>
Message-ID: <CAMJ8WP06x4Zo5DngROEAQ4z6pJO3vnM8YadmHNyKHy+B51aBoQ@mail.gmail.com>
What are the identity criteria for a JSON-LD document?

Per the spec:
8. A JSON-LD document must be a single node object or an array whose
elements are each node objects at the top level

8.2 A JSON object is a node object if it exists outside of a JSON-LD
context

A JSON document processed as JSON-LD without a @context will also yield a
null @default graph.  This does not make JSON "valid JSON-LD".
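
To illustrate the point about plain JSON, here is a rough sketch in Python. It assumes a heavily simplified view of expansion: with no active context, only keywords and keys that already look like absolute IRIs are kept. The function name `keys_surviving_expansion` is made up for illustration, and this is not the full JSON-LD Expansion Algorithm.

```python
import json

# Keyword set assumed here: the JSON-LD 1.0 keywords only.
KEYWORDS = {
    "@context", "@id", "@value", "@language", "@type", "@container",
    "@list", "@set", "@reverse", "@index", "@base", "@vocab", "@graph",
}


def keys_surviving_expansion(doc):
    """Return the top-level keys a processor could keep with no active context.

    Simplification: a key is kept if it is a keyword or contains a colon
    (i.e. it can be read as an absolute IRI). Plain terms such as "name"
    have no mapping without a context and are dropped.
    """
    return [key for key in doc if key in KEYWORDS or ":" in key]


plain_json = json.loads('{"name": "Leipzig", "homepage": "http://example.org/"}')
with_iris = json.loads('{"http://schema.org/name": "Leipzig"}')

print(keys_surviving_expansion(plain_json))  # []
print(keys_surviving_expansion(with_iris))   # ['http://schema.org/name']
```

Under that assumption the plain JSON object contributes nothing to the default graph, while the object that uses full IRIs as keys does.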

Christopher Johnson
Scientific Associate
Universitätsbibliothek Leipzig

On 26 October 2017 at 21:31, Robert Sanderson <azaroth42@gmail.com> wrote:

>
> A context document without any graph is absolutely valid JSON-LD and thus
> can be served with the JSON-LD media type. You process it and get zero
> triples, but it's still a valid serialization of the empty graph.
>
> In terms of LDP, if you want to manage the context as a separate
> *document* then yes, you need to manage it as an LDP-NR.
>
> Rob
>
>
> On Thu, Oct 26, 2017 at 11:14 AM, Christopher Johnson <
> chjohnson39@gmail.com> wrote:
>
>> By "processing document", I mean only an @context or frame that is served
>> without a graph body via HTTP, i.e. analogous to a "schema".
>>
>> The server-side processing from and to RDF is facilitated by "localized"
>> persistence of schema as LDP-NRs, so that it does not have to depend on an
>> HTTP client to dereference potentially unknown/threatening entities
>> remotely.  Of course, a client can opt to process the document itself if it
>> does not issue the Accept header and just fetches the graph as RDF.
>> However, as mentioned, the prevailing trend seems to be to have the server
>> do the processing to simplify the client interaction model.
>>
>> Christopher Johnson
>> Scientific Associate
>> Universitätsbibliothek Leipzig
>>
>> On 26 October 2017 at 19:31, Gregg Kellogg <gregg@greggkellogg.net>
>> wrote:
>>
>>> On Oct 26, 2017, at 1:04 PM, Christopher Johnson <chjohnson39@gmail.com>
>>> wrote:
>>>
>>> Hi list,
>>>
>>> Following up on this:
>>> In a discussion regarding JSON-LD processing documents in relation to an
>>> implementation of LDP, a few specification points have been addressed:
>>>
>>> 1. Since JSON-LD processing documents cannot be deserialized to RDF (and
>>> are therefore not JSON-LD documents), they *MUST* be served with a
>>> Content-Type application/json.
>>>
>>>
>>> Can you clarify this? If the JSON-LD processing document is JSON-LD, it
>>> would include an @context, which would allow it to be interpreted as RDF,
>>> no? application/json can serve a JSON document that can be interpreted as
>>> JSON-LD by linking to a context, or it can return a JSON-LD document
>>> containing @context (or both). Am I missing something?
>>>
>>> Not clear what a “processing document” is in relation to LDP.
>>>
>>> 2. As application/json Content-Type, JSON-LD processing documents can
>>> only be persisted in LDP as rdf:type
>>> http://www.w3.org/ns/ldp#NonRDFSource.
>>>
>>>
>>> This wouldn’t seem to restrict a client’s ability to process it as
>>> JSON-LD, though, would it?
>>>
>>> Gregg
>>>
>>> 3. A (recommended) HTTP header for clients who need to identify a
>>> processing document to a specification compliant LDP implementation
>>> *MAY* be Accept: application/ld+json;
>>> profile="http://example.org/context.json"
>>>
>>> Christopher Johnson
>>> Scientific Associate
>>> Universitätsbibliothek Leipzig
>>>
>>> On 23 October 2017 at 00:59, Gregg Kellogg <gregg@greggkellogg.net>
>>> wrote:
>>>
>>>> On Oct 22, 2017, at 11:58 AM, Martynas Jusevičius <
>>>> martynas@atomgraph.com> wrote:
>>>>
>>>> Why should LDP be concerned with any specific serialization?
>>>>
>>>>
>>>> I think the concern is likely captured in issue #491. A context cache
>>>> isn’t really something the spec needs to handle explicitly, as it already
>>>> provides the means to specify an alternative document loader, with which a
>>>> client could implement a context cache.
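
A minimal sketch of such a context cache, written as an alternative document loader in Python. The names `fetch_over_http` and `make_caching_loader` are hypothetical, and the returned dictionaries only mimic the RemoteDocument shape (contextUrl, documentUrl, document) from the JSON-LD API. How the loader is plugged into a particular processor is left out.

```python
import json
from urllib.request import urlopen


def fetch_over_http(url):
    """Fallback loader: dereference the URL and wrap the parsed JSON."""
    with urlopen(url) as response:
        return {
            "contextUrl": None,
            "documentUrl": response.geturl(),
            "document": json.load(response),
        }


def make_caching_loader(preloaded, fallback=fetch_over_http):
    """Build a loader that serves known contexts locally.

    `preloaded` maps a context URL to its already-parsed JSON document.
    Anything not in the cache is fetched once through `fallback` and kept.
    """
    cache = {
        url: {"contextUrl": None, "documentUrl": url, "document": doc}
        for url, doc in preloaded.items()
    }

    def loader(url):
        if url not in cache:
            cache[url] = fallback(url)
        return cache[url]

    return loader


# Hypothetical context URL and content, for illustration only.
CONTEXT_URL = "http://example.org/context.json"
loader = make_caching_loader(
    {CONTEXT_URL: {"@context": {"name": "http://schema.org/name"}}}
)
remote = loader(CONTEXT_URL)  # answered from the cache, no HTTP request
```

A server could restrict the fallback (or remove it entirely) so that only locally persisted contexts are ever used.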
>>>>
>>>> Gregg
>>>>
>>>> On Sun, 22 Oct 2017 at 20.44, Gregg Kellogg <gregg@greggkellogg.net>
>>>> wrote:
>>>>
>>>>> On Oct 21, 2017, at 8:49 PM, Christopher Johnson <
>>>>> chjohnson39@gmail.com> wrote:
>>>>>
>>>>> Hi,
>>>>>
>>>>> Thanks for the reply.  jsonld-java does not execute the remote-doc
>>>>> tests yet.  An interesting thing about the 0009 test is that the Link
>>>>> references a non-HTTP document.  Processing Link headers is not that easy,
>>>>> but I would assume that the test client should also try to dereference the
>>>>> Link header using an HTTP client.  I will check how jsonld.js does this.
>>>>>
>>>>>
>>>>> It is HTTP, but should be HTTPS; I’ll update the .htaccess file that
>>>>> sets this. However, it should redirect and work anyway.
>>>>>
>>>>> The rationale is buried in the GitHub issue tracker, but as I recall,
>>>>>> application/ld+json is intended to represent a JSON-LD document that can be
>>>>>> fully interpreted based on its content, rather than rely on out-of-band
>>>>>> information. This allows the document to be used outside of its HTTP
>>>>>> context (URI base issues aside).
>>>>>
>>>>>
>>>>> By "URI base issues", I assume that you mean when @context is
>>>>> referenced with an IRI in a document.  Perhaps, this form and also the
>>>>> JSON-LD profiles that do not include context (i.e.
>>>>> http://www.w3.org/ns/json-ld#expanded) should be mentioned in the
>>>>> spec as exclusions to the #context constraint if these are intended to also
>>>>> be served as application/ld+json.
>>>>>
>>>>>
>>>>> By “URI base issues”, I mean the resolution of non-absolute
>>>>> document-relative IRIs; but, this is a common issue among any format that
>>>>> uses relative IRIs where the document is used outside of a scheme that can
>>>>> be used to determine the base IRI (URL). Implementations may provide a
>>>>> means of setting this through options. If the `@context` uses a relative
>>>>> IRI, it will have the same issue.
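
As a small illustration of the base IRI point, here is a sketch using Python's standard `urljoin`. The helper name `resolve_iri` and the example URLs are assumptions for illustration.

```python
from urllib.parse import urljoin


def resolve_iri(reference, base=None):
    """Resolve a possibly relative IRI against a base, if one is known."""
    return urljoin(base or "", reference)


# With a document URL as base, the relative context reference resolves.
print(resolve_iri("context.jsonld", "http://example.org/docs/a.jsonld"))

# Outside of HTTP there is no base, so the reference stays relative
# unless the caller supplies one through an option.
print(resolve_iri("context.jsonld"))
```

The second call shows why an implementation needs an explicit base option when a document is processed away from the URL it was retrieved from.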
>>>>>
>>>>> However, the spec mostly is concerned with JSON(-LD) documents served
>>>>> over HTTP(S); certainly when the Link header is involved.
>>>>>
>>>>> The concern that I would like to address relates primarily to the
>>>>> "HTTP context" dependency that you indicate.  If a client depends on a
>>>>> processed document format (e.g. framed or compacted) and the @context or
>>>>> frame is only served by HTTP, then any client-side JSON-LD processing
>>>>> (required when a document is served in a non-expected format like RDF or
>>>>> JSON-LD expanded) requires some hypermedia mechanism whereby these
>>>>> dependencies can be dereferenced and validated *before* deserialization.
>>>>>
>>>>>
>>>>>
>>>>> You can also specify the context through the API (both for expansion
>>>>> and compaction). Same with the frame.
>>>>>
>>>>> If dependencies are provided outside the document envelope (Link
>>>>> header or API option), you can certainly do this. If inside the document,
>>>>> then you need to deserialize to find the references (consider an @context
>>>>> embedded within some object).
>>>>>
>>>>> An undefined point is how an LDP server can persist @context in RDF
>>>>> for a particular document graph or document collection.  This seems to be
>>>>> related to the client expectation of a particular document format that
>>>>> can only be provided by a remote reference (e.g. the "HTTP context").  A
>>>>> client could issue some form of negotiated "Accept-Processing-Document"
>>>>> request, and the server could respond with a "processing document" IRI
>>>>> (but only if it had persisted it) …
>>>>>
>>>>>
>>>>> There isn’t a way to represent a context in RDF, per se, other than
>>>>> when serialized as JSON-LD which includes a top-level @context.
>>>>>
>>>>> These seem like reasonable concerns, but perhaps they should
>>>>> be considered in LDP. At most, I could see the API including some form of
>>>>> @context lookup hash that would allow an implementation to look for context
>>>>> references that are in this hash and use them rather than perform a
>>>>> dereference.
>>>>>
>>>>> Furthermore, it seems then that an "HTTP context" is actually a
>>>>> dependency for client-side processing of the document, and could be
>>>>> serviced similarly to other compile time package dependency managers (e.g.
>>>>> Maven Central, npm, etc.).  This frees the producer from having to provide
>>>>> custom headers for every possible document processing relation.  When
>>>>> deserializing from JSON-LD, the producer could persist a context hash with
>>>>> the graph that pointed to a package, and let the client do the rest.  For
>>>>> LDP implementations, where the data can be serialized dynamically from RDF,
>>>>> the client does have the potential to declare profile preferences, though I
>>>>> am not sure why it would if it were doing the processing.  The protocol
>>>>> regarding negotiation of JSON-LD processing options is undefined at the
>>>>> moment, though this is yet another "feature" that could be considered for
>>>>> server-side negotiated processing.
>>>>>
>>>>>
>>>>> If the server maintains such a cache, then presumably it could be
>>>>> included in a JSON-LD serialization when serialized. It’s similar to prefix
>>>>> management for Turtle, where the service may have a configured set of
>>>>> prefixes it uses when serializing resources.
>>>>>
>>>>> There is also an open issue to allow the client to request (#491) [1],
>>>>> which could also be used for specifying a context the service should use
>>>>> when serializing compacted/framed JSON-LD (presumably, subject to
>>>>> white/black listing on the server side).
>>>>>
>>>>> It seems a gray area as to where the JSON-LD processing *should* occur.
>>>>> In writing this, I have convinced myself that a clean solution is to have
>>>>> the producer provide expanded or RDF with some externally defined reference
>>>>> to processing options and context dependencies and let the client do the
>>>>> processing into its desired format.  No need for negotiation of
>>>>> preferences!  However, "tailored JSON serialization" is generally what some
>>>>> clients expect, though this level of service certainly has a cost in
>>>>> complexity for the producer.
>>>>>
>>>>>
>>>>> The JSON-LD spec was originally written from the perspective that a
>>>>> client may apply algorithms to the received JSON-LD to put it in the most
>>>>> convenient form for local processing, but I think that, in actuality, it’s
>>>>> more common for clients to work with the results the server sends directly,
>>>>> thus the need for additional request profile information (tying us further
>>>>> to HTTP(S)).
>>>>>
>>>>> This is a complicated enough issue that we might consider creating an
>>>>> ad-hoc conference call to discuss in more detail, or perhaps piggy-back on
>>>>> some existing LDP call (if such exists).
>>>>>
>>>>> Gregg
>>>>>
>>>>> [1] https://github.com/json-ld/json-ld.org/issues/491
>>>>>
>>>>> Christopher Johnson
>>>>> Scientific Associate
>>>>> Universitätsbibliothek Leipzig
>>>>>
>>>>> On 19 October 2017 at 19:03, Gregg Kellogg <gregg@greggkellogg.net>
>>>>> wrote:
>>>>>
>>>>> On Oct 17, 2017, at 8:29 PM, Christopher Johnson <
>>>>>> chjohnson39@gmail.com> wrote:
>>>>>>
>>>>>> Hi list,
>>>>>>
>>>>>> I am writing tests and possibly an async document loader
>>>>>> implementation for jsonld-java that could check Content-Type and Link
>>>>>> headers before fetching.  The current one does not.
>>>>>>
>>>>>>
>>>>>> There are tests for this in the JSON-LD test suite, specifically
>>>>>> remote-doc-0009-0011.
>>>>>>
>>>>>> > curl -I https://json-ld.org/test-suite/tests/remote-doc-0009-in.jsonld
>>>>>> HTTP/1.1 200 OK
>>>>>> Accept-Ranges: bytes
>>>>>> Access-Control-Allow-Origin: *
>>>>>> Content-Length: 77
>>>>>> Content-Type: application/ld+json
>>>>>> Date: Thu, 19 Oct 2017 16:54:08 GMT
>>>>>> Etag: "20d78-4d-4e582e01c8079"
>>>>>> Last-Modified: Tue, 03 Sep 2013 23:16:15 GMT
>>>>>> Link: <remote-doc-0009-context.jsonld>; rel="http://www.w3.org/ns/json-ld#context"
>>>>>> Server: Apache/2.2.22 (Ubuntu)
>>>>>> Vary: Accept-Encoding
>>>>>>
>>>>>> I would like to research and find actual examples where the JSON-LD
>>>>>> specification's Interpretation requirement has been observed.  Does
>>>>>> anyone know a site that serves Content-Type application/json with a Link
>>>>>> header that provides a "http://www.w3.org/ns/json-ld#context" relation?
>>>>>>
>>>>>> Also, could someone please explain why application/ld+json is forbidden
>>>>>> to provide an #context relation with a Link header?
>>>>>>
>>>>>>
>>>>>> The rationale is buried in the GitHub issue tracker, but as I recall,
>>>>>> application/ld+json is intended to represent a JSON-LD document that can be
>>>>>> fully interpreted based on its content, rather than rely on out-of-band
>>>>>> information. This allows the document to be used outside of its HTTP
>>>>>> context (URI base issues aside).
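
A sketch of how a document loader might apply this rule, in Python. It assumes the behaviour described in the JSON-LD 1.0 documents: the #context Link relation is honoured for application/json (or another +json type) and ignored for application/ld+json. The function name `context_link_target` is hypothetical and the Link header parsing is deliberately simplistic.

```python
import re
from urllib.parse import urljoin

CONTEXT_REL = "http://www.w3.org/ns/json-ld#context"


def context_link_target(document_url, content_type, link_header):
    """Return the absolute context URL from a Link header, or None."""
    media_type = content_type.split(";", 1)[0].strip().lower()
    # application/ld+json must be interpretable from its content alone.
    if media_type == "application/ld+json":
        return None
    if media_type != "application/json" and not media_type.endswith("+json"):
        return None

    targets = []
    # Simplistic split: assumes every link-value starts with "<".
    for part in re.split(r",\s*(?=<)", link_header or ""):
        match = re.match(r"\s*<([^>]*)>(.*)", part, re.S)
        if not match:
            continue
        rel = re.search(
            r';\s*rel\s*=\s*(?:"([^"]*)"|([^\s;]+))', match.group(2), re.I
        )
        rels = (rel.group(1) or rel.group(2) or "").split() if rel else []
        if CONTEXT_REL in rels:
            targets.append(match.group(1))

    if len(targets) > 1:
        raise ValueError("multiple context link headers")
    # The link target may be relative to the document URL.
    return urljoin(document_url, targets[0]) if targets else None


# Values taken from the curl output for remote-doc-0009 above.
DOC = "https://json-ld.org/test-suite/tests/remote-doc-0009-in.jsonld"
LINK = '<remote-doc-0009-context.jsonld>; rel="http://www.w3.org/ns/json-ld#context"'

print(context_link_target(DOC, "application/ld+json", LINK))  # None
print(context_link_target(DOC, "application/json", LINK))
```

With the headers of test 0009 the link is ignored because the response is served as application/ld+json. The same header on an application/json response would resolve to the context document next to the test input.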
>>>>>>
>>>>>> Is an #context relation (if served as application/ld+json) allowed to
>>>>>> specify a http://www.w3.org/ns/json-ld profile?
>>>>>>
>>>>>>
>>>>>> I think there’s a test for this, where it is specifically ignored
>>>>>> (0009).
>>>>>>
>>>>>> Or could a non-http://www.w3.org/ns/json-ld Content-Type profile be
>>>>>> served as application/json and provide a secondary #context relation if
>>>>>> it were itself an #context?
>>>>>>
>>>>>>
>>>>>> Need a more specific example of this.
>>>>>>
>>>>>> Should the document loader check (and subsequently dereference the
>>>>>> relations) if "second-step" remote documents provide Link and/or
>>>>>> Content-Type profile?  Should there be a processing "order of
>>>>>> precedence"?
>>>>>>
>>>>>>
>>>>>> The remote-doc test manifest does have some redirect logic, but
>>>>>> you’d need to look more specifically to see if it addresses what you’re
>>>>>> concerned about; we can always add more tests.
>>>>>> https://json-ld.org/test-suite/tests/remote-doc-manifest.jsonld.
>>>>>>
>>>>>> Just trying to understand required hypermedia options and possible
>>>>>> use cases.
>>>>>>
>>>>>>
>>>>>> Happy to help; you can also chat on Gitter or IRC.
>>>>>>
>>>>>> Gregg
>>>>>>
>>>>> Thanks,
>>>>>> Christopher Johnson
>>>>>> Scientific Associate
>>>>>> Universitätsbibliothek Leipzig
>>>>>>
>>>>>> [1] https://stackoverflow.com/questions/39551829/usage-for-profile-parameter-for-json-ld-requests
>>>>>> [2] https://json-ld.org/spec/latest/json-ld/#interpreting-json-as-json-ld
>>>>>> [3] https://github.com/IIIF/api/issues/1066
>>>>>> [4] https://github.com/ProfileNegotiation/I-D-Accept--Schema
>>>>>>
>
> --
> Rob Sanderson
> Semantic Architect
> The Getty Trust
> Los Angeles, CA 90049
>
Received on Friday, 27 October 2017 05:00:48 UTC