Re: Schema Salad

* Peter Amstutz <peter.amstutz@curoverse.com> [2015-09-08 13:31-0400]
> Neat!  I haven't seen much in the way of technologies for strict
> validation for RDF structures so it is useful to see work on this
> problem.  Salad is intended as a higher level source document from
> which concrete schemas are derived, so Shape Expressions could be a
> transformation target to use a salad schema to validate data directly
> as triples.

ShEx has a JSON representation. ShEx compact syntax files
  https://github.com/shexSpec/shexTest/tree/master/schemas
translate to the corresponding JSON files:
  https://github.com/shexSpec/shexTest/tree/master/parsedSchemas

I wonder if you could generate JSON from your python code and invoke
an external ShEx validator. I put together a validate script for you
to try out:

  npm install shex
  ./node_modules/shex/bin/validate -n http://a.example/s myPersonSchema.shex somePerson.ttl

That says to validate the node <http://a.example/s> in somePerson.ttl
as whatever the start rule is in myPersonSchema.shex. This is not
really ready for release as it gives no error messages, but unless you
tell it to be quiet (-q), you'll get output that tells you how the
triples in the data matched the rules in ShEx. I've attached the two
input files myPersonSchema.shex and somePerson.ttl so you have a gentle
start.
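
Since your code is in python, here is a minimal sketch of how that
command line might be driven from a script. The helper names
(build_validate_command, run_validator) are made up for illustration,
the position of -q before -n is an assumption, and I make no claim
about what the exit code means; it only wraps the invocation shown
above.

```python
import subprocess

# Location of the script after "npm install shex" in the current directory.
# Adjust if the package is installed elsewhere.
VALIDATE_BIN = "./node_modules/shex/bin/validate"


def build_validate_command(node, schema_path, data_path, quiet=False):
    """Build the argv list for the validate script (hypothetical helper).

    node        -- IRI of the node to validate, e.g. http://a.example/s
    schema_path -- ShEx schema file, e.g. myPersonSchema.shex
    data_path   -- Turtle data file, e.g. somePerson.ttl
    quiet       -- pass -q to suppress the match report
                   (flag placement is an assumption)
    """
    cmd = [VALIDATE_BIN]
    if quiet:
        cmd.append("-q")
    cmd += ["-n", node, schema_path, data_path]
    return cmd


def run_validator(node, schema_path, data_path, quiet=False):
    """Run the validator and return (exit code, stdout, stderr) unmodified."""
    proc = subprocess.run(
        build_validate_command(node, schema_path, data_path, quiet),
        capture_output=True,  # requires Python 3.7+
        text=True,
    )
    return proc.returncode, proc.stdout, proc.stderr
```

Calling run_validator("http://a.example/s", "myPersonSchema.shex",
"somePerson.ttl") should then hand back whatever the script printed
about how the triples matched the rules.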


> The use case for Salad is aimed at the problem of creating
> document/message formats that are usable as both idiomatic JSON and
> can be interpreted as JSON-LD to yield triples.  This turns out to be
> somewhat tricky since JSON-LD has some quirks and limitations that
> prevent it from being used to mark up totally arbitrary idiomatic
> JSON, so Salad is designed to facilitate a design that ensures the two
> interpretations are consistent, no relevant information is lost during
> conversion, and the resulting format is convenient for humans to
> read and write.

I note that your work is in python. Perhaps Harold Solbrig's RDFLib
ShEx stuff will be useful to you.


> Thanks,
> Peter
> 
> On Tue, Sep 8, 2015 at 1:04 PM, David Booth <david@dbooth.org> wrote:
> > On 09/07/2015 09:33 PM, Peter Amstutz wrote:
> >>
> >> I wanted to introduce a project I am working on, "Schema Salad":
> >> https://github.com/common-workflow-language/schema_salad
> >
> >
> > Sounds very interesting!   You might want to look at Shape Expressions
> > (ShEx) also, if you haven't seen it:
> > http://www.w3.org/2013/ShEx/Primer
> >
> > ShEx is not JSON-specific, but can validate RDF or even transform it.
> >
> > David Booth
> >
> >
> >>
> >> Salad is a schema language for describing structured linked data
> >> documents in JSON or YAML documents. A Salad schema provides rules for
> >> preprocessing, structural validation, and link checking for documents
> >> described by a Salad schema. Salad builds on JSON-LD and the Apache
> >> Avro data serialization system, and extends Avro with features for
> >> rich data modeling such as inheritance, template specialization,
> >> object identifiers, and object references. Salad was developed to
> >> provide a bridge between the record oriented data modeling supported
> >> by Apache Avro and the Semantic Web.
> >>
> >> The JSON data model is an extremely popular way to represent
> >> structured data. It is attractive because of its relative simplicity
> >> and is a natural fit with the standard types of many programming
> >> languages. However, this simplicity means that basic JSON lacks
> >> expressive features useful for working with complex data structures
> >> and document formats, such as schemas, object references, and
> >> namespaces.
> >>
> >> JSON-LD is a W3C standard providing a way to describe how to interpret
> >> a JSON document as Linked Data by means of a "context". JSON-LD
> >> provides a powerful solution for representing object references and
> >> namespaces in JSON based on standard web URIs, but is not itself a
> >> schema language. Without a schema providing a well defined structure,
> >> it is difficult to process an arbitrary JSON-LD document as idiomatic
> >> JSON because there are many ways to express the same data that are
> >> logically equivalent but structurally distinct.
> >>
> >> Several schema languages exist for describing and validating JSON
> >> data, such as the Apache Avro data serialization system, however none
> >> understand linked data. As a result, to fully take advantage of
> >> JSON-LD to build the next generation of linked data applications, one
> >> must maintain separate JSON schema, JSON-LD context, RDF schema, and
> >> human documentation, despite significant overlap of content and
> >> obvious need for these documents to stay synchronized.
> >>
> >> Schema Salad is designed to address this gap. It provides a schema
> >> language and processing rules for describing structured JSON content
> >> permitting URI resolution and strict document validation. The schema
> >> language supports linked data through annotations that describe the
> >> linked data interpretation of the content, enables generation of
> >> JSON-LD context and RDF schema, and production of RDF triples by
> >> applying the JSON-LD context. The schema language also provides for
> >> robust support of inline documentation.
> >>
> >> This is a work in progress, and any comments, suggestions, or pointers
> >> to related/similar technologies would be very much appreciated.  Here
> >> are a couple of example schemas:
> >>
> >>
> >> https://github.com/common-workflow-language/schema_salad/blob/master/schema_salad/metaschema.yml
> >>
> >>
> >> https://github.com/common-workflow-language/common-workflow-language/blob/salad_schema/schemas/draft-3/cwl-avro.yml
> >>
> >> Thanks,
> >> Peter
> >>
> >>
> >>
> >>
> >

-- 
-ericP

office: +1.617.599.3509
mobile: +33.6.80.80.35.59

(eric@w3.org)
Feel free to forward this message to any list for any purpose other than
email address distribution.

There are subtle nuances encoded in font variation and clever layout
which can only be seen by printing this message on high-clay paper.

Received on Saturday, 12 September 2015 06:02:45 UTC