Re: Schema Salad

From: Peter Amstutz <peter.amstutz@curoverse.com>
Date: Tue, 8 Sep 2015 13:31:29 -0400
Message-ID: <CAEXjzRvvJXvdEpia4cSEopVRBHi+_LiPDHV-1JPzTva0HpQCew@mail.gmail.com>
To: David Booth <david@dbooth.org>
Cc: public-linked-json@w3.org, "Eric Prud'hommeaux" <eric@w3.org>
Neat!  I haven't seen much in the way of technologies for strict
validation for RDF structures so it is useful to see work on this
problem.  Salad is intended as a higher level source document from
which concrete schemas are derived, so Shape Expressions could be one
transformation target: a Salad schema would be converted to ShEx in
order to validate the data directly as triples.

Salad is aimed at the problem of creating document/message formats
that are usable as idiomatic JSON and can also be interpreted as
JSON-LD to yield triples.  This turns out to be somewhat tricky, since
JSON-LD has some quirks and limitations that prevent it from being
used to mark up totally arbitrary idiomatic JSON.  Salad is therefore
meant to guide the format design so that the two interpretations are
consistent, no relevant information is lost during conversion, and the
resulting format is convenient for humans to read and write.
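
To make the "two interpretations" concrete, here is a rough Python
sketch (standard library only).  It is not the schema_salad API and
not a JSON-LD processor; CONTEXT, to_triples and the example.com /
schema.org URIs are all made up for illustration.  It reads one plain
JSON object as triples using a hand-written key-to-URI mapping, and
refuses to drop a field that has no mapping:

```python
import json

# Hypothetical mapping from idiomatic JSON keys to URIs (illustration only).
CONTEXT = {
    "name": "http://schema.org/name",
    "knows": "http://schema.org/knows",
}

def to_triples(doc, context, id_key="id"):
    """Flatten one idiomatic JSON object into (subject, predicate, object) tuples."""
    subject = doc[id_key]
    triples = []
    for key, value in doc.items():
        if key == id_key:
            continue
        if key not in context:
            # An unmapped key would be lost on conversion, so report it instead.
            raise KeyError(f"no URI mapping for field {key!r}")
        # Treat a scalar and a one-element list the same way.
        values = value if isinstance(value, list) else [value]
        for v in values:
            triples.append((subject, context[key], v))
    return triples

doc = json.loads(
    '{"id": "http://example.com/alice", "name": "Alice",'
    ' "knows": ["http://example.com/bob"]}'
)
triples = to_triples(doc, CONTEXT)
for t in triples:
    print(t)
```

The same document stays ordinary JSON for code that does not care
about linked data; only the mapping adds the triple interpretation.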


On Tue, Sep 8, 2015 at 1:04 PM, David Booth <david@dbooth.org> wrote:
> On 09/07/2015 09:33 PM, Peter Amstutz wrote:
>> I wanted to introduce a project I am working on, "Schema Salad":
>> https://github.com/common-workflow-language/schema_salad
> Sounds very interesting!   You might want to look at Shape Expressions
> (ShEx) also, if you haven't seen it:
> http://www.w3.org/2013/ShEx/Primer
> ShEx is not JSON-specific, but can validate RDF or even transform it.
> David Booth
>> Salad is a schema language for describing structured linked data
>> documents in JSON or YAML. A Salad schema provides rules for
>> preprocessing, structural validation, and link checking for documents
>> described by a Salad schema. Salad builds on JSON-LD and the Apache
>> Avro data serialization system, and extends Avro with features for
>> rich data modeling such as inheritance, template specialization,
>> object identifiers, and object references. Salad was developed to
>> provide a bridge between the record oriented data modeling supported
>> by Apache Avro and the Semantic Web.
>> The JSON data model is an extremely popular way to represent
>> structured data. It is attractive because of its relative simplicity
>> and is a natural fit with the standard types of many programming
>> languages. However, this simplicity means that basic JSON lacks
>> expressive features useful for working with complex data structures
>> and document formats, such as schemas, object references, and
>> namespaces.
>> JSON-LD is a W3C standard providing a way to describe how to interpret
>> a JSON document as Linked Data by means of a "context". JSON-LD
>> provides a powerful solution for representing object references and
>> namespaces in JSON based on standard web URIs, but is not itself a
>> schema language. Without a schema providing a well defined structure,
>> it is difficult to process an arbitrary JSON-LD document as idiomatic
>> JSON because there are many ways to express the same data that are
>> logically equivalent but structurally distinct.
>> Several schema languages exist for describing and validating JSON
>> data, such as the Apache Avro data serialization system; however, none
>> understand linked data. As a result, to fully take advantage of
>> JSON-LD to build the next generation of linked data applications, one
>> must maintain separate JSON schema, JSON-LD context, RDF schema, and
>> human documentation, despite significant overlap of content and
>> obvious need for these documents to stay synchronized.
>> Schema Salad is designed to address this gap. It provides a schema
>> language and processing rules for describing structured JSON content
>> permitting URI resolution and strict document validation. The schema
>> language supports linked data through annotations that describe the
>> linked data interpretation of the content, enables generation of
>> JSON-LD context and RDF schema, and production of RDF triples by
>> applying the JSON-LD context. The schema language also provides for
>> robust support of inline documentation.
>> This is a work in progress, and any comments, suggestions, or pointers
>> to related/similar technologies would be very much appreciated.  Here
>> are a couple of example schemas:
>> https://github.com/common-workflow-language/schema_salad/blob/master/schema_salad/metaschema.yml
>> https://github.com/common-workflow-language/common-workflow-language/blob/salad_schema/schemas/draft-3/cwl-avro.yml
>> Thanks,
>> Peter
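
As a small illustration of the quoted point that JSON-LD allows forms
that are "logically equivalent but structurally distinct", the Python
sketch below compares a compact object with an expanded one.  It
covers only a tiny subset of JSON-LD (terms vs. full URIs, scalars vs.
lists, "@value" wrappers); CONTEXT and normalize are hypothetical
names, not part of any library:

```python
# Hypothetical term-to-URI mapping, standing in for a JSON-LD "@context".
CONTEXT = {"name": "http://schema.org/name"}

def normalize(doc, context):
    """Reduce a small subset of JSON-LD shapes to a set of (predicate, value) pairs."""
    pairs = set()
    for key, value in doc.items():
        if key.startswith("@"):
            continue  # keywords such as "@id" are ignored in this sketch
        predicate = context.get(key, key)  # a term is mapped; a full URI passes through
        values = value if isinstance(value, list) else [value]
        for v in values:
            if isinstance(v, dict) and "@value" in v:
                v = v["@value"]  # unwrap an expanded value object
            pairs.add((predicate, v))
    return pairs

compact = {"name": "Alice"}
expanded = {"http://schema.org/name": [{"@value": "Alice"}]}
same = normalize(compact, CONTEXT) == normalize(expanded, CONTEXT)
print(same)
```

Code written against idiomatic JSON would have to handle both shapes
by hand; a schema that fixes one structure removes that burden.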
Received on Tuesday, 8 September 2015 17:31:58 UTC
