Re: JSON Schema (json-schema.org) support?

Dear all,

In IIIF and Shared Canvas we are also trying to use JSON Schema
to validate our JSON-LD documents.

The challenges in our experience are:
* [JSON-LD] The multitude of ways that URIs and literals can be expressed.
* [JSON Schema] No way to generate warnings, only errors.
* [Implementation] The "format" keyword for strings is not well supported.
* [Implementation] It is complex to generate a list of the tests that pass, rather than those that fail.

The result is at:  http://www.shared-canvas.org/ns/manifest-schema.json

The definitions at the end of the file are likely useful for others,
as they describe how URIs and literals can be represented.
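
To illustrate the first challenge, here is a minimal sketch in Python of
the kinds of shapes a validator has to accept for a single property. It is
not taken from the schema above: the function names and the exact set of
accepted forms are assumptions for illustration, based on the string,
{"@id": ...} and {"@value": ...} forms that JSON-LD allows.

```python
def is_uri_value(value):
    """True if value is one of the assumed JSON-LD shapes for a URI."""
    # A bare string, e.g. "http://example.org/a"
    if isinstance(value, str):
        return True
    # A node reference, e.g. {"@id": "http://example.org/a"}
    return isinstance(value, dict) and isinstance(value.get("@id"), str)


def is_literal_value(value):
    """True if value is one of the assumed JSON-LD shapes for a literal."""
    # A bare string, e.g. "Hello"
    if isinstance(value, str):
        return True
    # A value object, e.g. {"@value": "Hello", "@language": "en"}
    if isinstance(value, dict) and isinstance(value.get("@value"), str):
        # Only these keywords are accepted alongside "@value" in this sketch.
        return set(value) <= {"@value", "@language", "@type"}
    return False


def one_or_many(check, value):
    """Accept a single value or a non-empty list of values passing check."""
    if isinstance(value, list):
        return len(value) > 0 and all(check(v) for v in value)
    return check(value)
```

Each branch here corresponds to an alternative that has to be spelled out
in the JSON Schema, for every property that takes a URI or a literal.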

Hope that helps,

Rob

IIIF:   http://www-sul.stanford.edu/iiif/image-api/1.1/#info
Shared Canvas:  http://www.shared-canvas.org/datamodel/iiif/metadata-api.html


On Thu, Aug 15, 2013 at 2:24 PM, David I. Lehn <dil@lehn.org> wrote:
> On Thu, Aug 15, 2013 at 6:54 AM, Edwin Shao <eshao@eshao.es> wrote:
>> It strikes me that JSON-LD and JSON Schema are quite complementary. The
>> first provides context, metadata, and a standard graph traversal mechanism.
>> The second provides a way to describe and validate a given JSON object.
>>
>
> Although they may seem to work well together at first, there are some
> considerable limitations in using JSON Schema as a long term solution
> for JSON-LD description and validation.  Despite this, our PaySwarm
> server code currently uses JSON Schema for validation so I'm familiar
> with the idea and how it can work with some limitations.
>
> The main issue is that JSON Schema describes and validates JSON with a
> known structure.  But JSON-LD is a flexible serialization of graph
> data.  In the general sense, this makes the two somewhat incompatible.
> There are many ways to serialize JSON-LD data which are all
> equivalent at a low level.  But (any sane use of) JSON Schema only
> works if the data is serialized with a certain structure.  In order to
> properly validate arbitrary JSON-LD data with JSON Schema, you first
> need to make a pass with something like the framing algorithm that is
> a work-in-progress spec.  That would give you a structure that you
> could then validate.  I'm not sure the framing algorithm or code was
> optimized for this sort of use but maybe could be.
>
> If you are, say, getting JSON-LD data via a web service POST call, you
> could document that the JSON-LD data MUST be formatted in a certain
> way and MUST use a certain context in order to be valid.  That is a
> rather unfortunate limitation given how powerful this technology could
> be.  For what it's worth, PaySwarm basically works like that
> currently.
>
> A better solution would be to leverage some of the RDF and OWL schema
> work.  The first step would be to create a proper schema for your
> semantic data using RDF, OWL, or similar.  Then a hypothetical web
> service could take input as JSON-LD (in any structural form), n-quads,
> n3, turtle, rdfa, etc, convert it to a low-level normalized form (such
> as triples), and then run a validator on that data with the semantic
> schema.  This is an interesting approach to take since you are now
> validating the semantic data without concern for its format or
> presented structure.  Once validated, you can run the JSON-LD framing
> algorithm on the data to get it into a known structure that is easy to
> internally process.
>
>
>> ...
>> If there is no canonical way currently (which seems to be the case), I would
>> suggest including one in the upcoming spec, perhaps creating a new @schema
>> keyword.
>>
>
> A keyword seems like overkill for this. As Markus said, there may be
> other better mechanisms.  At one point we were discussing extension
> methods.  Did that get forgotten?  Something like a @context: {@meta:
> {...}} object that you could throw custom key/value pairs into for
> custom processing.  That sort of thing would let you add
> {"http://json-schema.org/schema": "http://exampe.com/foo.json"} as
> metadata for processors that want to support it.  I suppose that could
> just be in the raw data too but might be crufty.
>
> -dave
>

Received on Monday, 19 August 2013 15:45:17 UTC