Re: JSON-LD and JSON-Schema linkage from TJ Koury on 2017-10-25 (public-linked-json@w3.org from October 2017)

From: TJ Koury <tjkoury@gmail.com>
Date: Tue, 24 Oct 2017 23:14:14 -0400
To: Henry Andrews <henry@cloudflare.com>
Cc: Linked JSON <public-linked-json@w3.org>
Message-ID: <CAC6QE-X83FJhtsTPuy5Ki2iKp7wDh5gDx6P8ZOCgvfwxHX=j-w@mail.gmail.com>

Henry,

I most definitely agree with your thoughts, and I think allowing document
fragments or a $schema property to be interpreted as JSON Schema is a huge
step in the right direction.

Honestly, I haven't used the validation or hyper-schemas due to the lack of
documented interop with Swagger et al.; I'm glad to see you think that is an
issue as well.

For now I'll just use the $schema property, but I will try to engage going
forward and provide code examples too.

-TJ

On Oct 24, 2017 10:44 PM, "Henry Andrews" <henry@cloudflare.com> wrote:

TJ,

The HTTP header stuff (either the Link header, or using a media type
parameter in Accept on a request, and getting one back in Content-Type on
the response, or sending one in Content-Type on a request) is really
designed for a simple instance document + schemas paradigm (1-to-many, as
you can have multiple links or multiple schema URIs in the media type
parameter).
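
A minimal sketch of the two header mechanisms, in Python.  The function
names, the example.com URIs, and the space-separated "schema" media type
parameter are assumptions for illustration; the parameter name and syntax
were still under discussion for draft-07.  The link relation is written in
its registered lowercase form, "describedby".

```python
def link_header(schema_uris, rel="describedby"):
    """Build one Link header value containing one link per schema URI."""
    return ", ".join(f'<{uri}>; rel="{rel}"' for uri in schema_uris)


def content_type(media_type, schema_uris, param="schema"):
    """Attach schema URIs to a media type as a single quoted parameter.

    Assumption: multiple URIs are separated by a single space.
    """
    joined = " ".join(schema_uris)
    return f'{media_type}; {param}="{joined}"'


# Hypothetical schema URIs describing the same instance document.
schemas = [
    "https://example.com/schemas/person",
    "https://example.com/schemas/person-sql",
]

print("Link:", link_header(schemas))
print("Content-Type:", content_type("application/ld+json", schemas))
```

Both forms attach the schemas to the response as a whole, which is why
neither can point at only part of a document.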

What it really does not address well is the case where JSON Schema is
applied to *part* of a document, or where *part* of a document is JSON
Schema, but the entire document is not (at least as we define JSON Schema
right now).  These are problems that I want to focus on for draft-08, which
will have re-use as its theme (although I am intentionally not filing
further issues on them yet to keep folks focused on the draft-07
discussion).

I am trying to build a case for JSON Schema as a media type supporting many
vocabularies.  The terms in the vocabularies are either assertions,
annotations, or both (these terms are defined in the draft-07 Validation
document; they were not called out before).  This is sort of true now:
Validation and Hyper-Schema are both described as "vocabularies", although
their relationship with each other and the core specification is somewhat
muddled.  This lack of clarity makes it hard to define extended,
restricted, or orthogonal vocabularies.

In particular, an extended or restricted vocabulary can only be indicated
by a single "$schema" URI, which makes it impossible to detect the
underlying vocabulary even if an implementation would be able to make use
of that portion of the document.  Being able to detect it would enable
graceful degradation.
For example, a validator can still use a hyper-schema, it just ignores the
"base" and "links" keywords and everything works just fine.  However, this
only "works" because implementations hardcode that hyper-schemas are also
usable as validation.  It would not work with a custom extension to
validation, which I would like to fix (I am getting pushback on this, so if
anyone would like this flexibility, I could use your support).
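
A rough sketch of that hardcoded degradation, in Python: drop the
hyper-schema-only keywords and hand the rest to a validation-only tool.
The function name and the keyword sets are assumptions for illustration,
and the list of subschema locations is deliberately not exhaustive.

```python
HYPER_ONLY = {"base", "links"}

# Keywords whose value maps arbitrary names to subschemas.  The names are
# instance property names, not keywords, so they are never dropped.
SCHEMA_MAPS = {"properties", "patternProperties", "definitions"}
# Keywords whose value is a list of subschemas.
SCHEMA_LISTS = {"allOf", "anyOf", "oneOf"}
# Keywords whose value is one subschema ("items" may also be a list).
SCHEMA_SINGLE = {"items", "additionalProperties", "not"}


def strip_hyper(schema):
    """Return a copy of `schema` without the hyper-schema keywords."""
    if not isinstance(schema, dict):
        return schema  # boolean schemas and other non-object values
    out = {}
    for keyword, value in schema.items():
        if keyword in HYPER_ONLY:
            continue
        if keyword in SCHEMA_MAPS and isinstance(value, dict):
            out[keyword] = {name: strip_hyper(sub) for name, sub in value.items()}
        elif keyword in SCHEMA_LISTS and isinstance(value, list):
            out[keyword] = [strip_hyper(sub) for sub in value]
        elif keyword in SCHEMA_SINGLE:
            if isinstance(value, list):
                out[keyword] = [strip_hyper(sub) for sub in value]
            else:
                out[keyword] = strip_hyper(value)
        else:
            out[keyword] = value
    return out


hyper_schema = {
    "type": "object",
    "base": "https://example.com/",
    "links": [{"rel": "self", "href": "people/{id}"}],
    "properties": {
        "id": {"type": "integer"},
        "links": {"type": "array", "links": [{"rel": "up", "href": ".."}]},
    },
}

print(strip_hyper(hyper_schema))
```

The set of keywords to ignore has to be written into the tool, which is
exactly what does not carry over to a custom extension vocabulary.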

I want this sort of flexibility because a major use case for JSON Schema is
not whole documents but partial documents, on both the schema and instance
side.

API description formats like OpenAPI (a.k.a. Swagger) use JSON Schema (or a
subset of it) in parts of a JSON or YAML file.  It would be great to be
able to consider such a file to itself be a JSON Schema, mostly with a
custom vocabulary, but one that includes a standard vocabulary in certain
spots so that JSON Schema tools could recognize and work with the whole
thing in an interoperable manner.
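
As a small illustration of schema-in-part-of-a-file, the Python sketch
below pulls one embedded schema out of an OpenAPI-style document with a
JSON Pointer (RFC 6901).  The document contents and the resolve_pointer
helper are hypothetical; today a tool has to know in advance where the
schemas live in such a file.

```python
def resolve_pointer(document, pointer):
    """Resolve an RFC 6901 JSON Pointer against parsed JSON data."""
    if pointer == "":
        return document
    if not pointer.startswith("/"):
        raise ValueError("a non-empty JSON Pointer must start with '/'")
    node = document
    for token in pointer[1:].split("/"):
        # Unescape "~1" before "~0", as RFC 6901 requires.
        token = token.replace("~1", "/").replace("~0", "~")
        if isinstance(node, list):
            node = node[int(token)]
        else:
            node = node[token]
    return node


# Hypothetical OpenAPI-style file: mostly a custom vocabulary, with a
# JSON Schema embedded in one spot.
api_description = {
    "openapi": "3.0.0",
    "info": {"title": "People", "version": "1.0"},
    "paths": {"/people": {"get": {"summary": "List people"}}},
    "components": {
        "schemas": {
            "Person": {
                "type": "object",
                "required": ["name"],
                "properties": {"name": {"type": "string"}},
            }
        }
    },
}

person_schema = resolve_pointer(api_description, "/components/schemas/Person")
print(person_schema["required"])
```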

Even more relevant to JSON-LD, the W3C's "Web of Things" group's Thing
Description format is a mostly-JSON-LD-based document that uses JSON Schema
in a few spots for structure descriptions.  This sort of mixture seems
useful and desirable.

And now as you mention, there may be data that we want to describe with
JSON Schema (either as its own document or part of something larger)
embedded within documents that we do not wish to fully describe, such as
JSON-LD within an HTML document.  I had not previously considered this
case, but it fits well with the overall direction I want to consider in
draft-08.

Basically, I want to make all of these things first-class use cases for
JSON Schema, such that tools that support "JSON Schema" (whatever that ends
up formally meaning) can work with such situations.

Those of us who work with hyper-schema, web UI annotation, code generation,
document generation, and other non-validation uses tend to support modular
re-usability as I have sketched it out above.  Some of those who focus on
validation only have opposed it (in at least one prominent case,
vehemently).  So I am hoping to build support for a flexible definition of
JSON Schema, as opposed to the "JSON Schema is validation with incidental
other stuff hanging off of it" viewpoint.

This debate will probably kick off towards the end of this month, although
of course folks are welcome to file issues and/or comment before then.
The main issue tracking the ability to detect the base vocabulary of an
extended system is:
https://github.com/json-schema-org/json-schema-spec/issues/314
The other things do not have issues yet (I'll file them when draft-07 is
out the door later this month).

thanks,
-henry

PS: For those interested in the road map (which is not endorsed by anyone
but me right now):

draft-07 was about hypermedia (including a top-to-bottom rewrite of JSON
Hyper-Schema)
draft-08 will be about re-use
draft-09 will *probably* be about whether "$data" and similar concepts
should be added
after that, I hope we can get JSON Schema, at least core and validation,
adopted by an IETF working group or otherwise start bringing the process to
some sort of resolution.


On Tue, Oct 24, 2017 at 6:25 PM, TJ Koury <tjkoury@gmail.com> wrote:

> Henry,
>
> Thanks for the quick reply.  If you're looking for inputs, and I can
> submit something formal if I'm not too off base here, I'd like to see
> something along the lines of providing a schema property as a child of a
> context node.
>
> The goal would be a one-to-many relationship, in which a linked JSON
> document can be associated with multiple schemas; in my case, a NIEM
> entity instance being described for multiple persistence engines.
>
> I'm of the opinion that using HTTP headers for schema references is not a
> great option for several reasons, such as not being able to accurately
> describe an HTML document with an embedded JSON-LD tag, and also forcing a
> consumer to persist data pertaining to the model outside of the standard.
> In my specific use case, I'd like the schemas to be canonical when
> referencing authoritative data sources, and not be an implementation detail
> at time of data request.
>
>
> On Oct 24, 2017 8:41 PM, "Henry Andrews" <henry@cloudflare.com> wrote:
>
> Hi TJ,
>   I'm currently the most active editor of the JSON Schema specification,
> and this has been a recent topic of discussion for the forthcoming (no
> later than Nov. 20th, barring unexpected problems) draft-07.
>
>   I see you are using draft-04 for this.  In that draft (and up through
> draft-06) the recommendations were:
>
> * Use HTTP link headers if relevant:  "profile" as an identifier,
> "describedBy" as a locator (most of the time they would be the same)
> * Use "profile" as a media type parameter
>
> However, per the author of the "profile" RFC this usage was never quite
> right (JSON-LD uses it correctly, though).  And application/json does not
> support a profile media type parameter anyway.
>
>   In draft-07, we are (probably) proposing replacing "profile" with a
> newly proposed "schema" link relation type and/or media type parameter.
> JSON-LD could opt to support the media type parameter with
> application/ld+json, or just use a "schema" link.  Whether this is correct,
> or sufficient, or if there is a better approach, is something we really
> hope to get feedback on with draft-07.  It is the main unresolved concern
> with the core spec, as far as I know.  And I still find the
> "describedBy"-as-locator part a bit odd, personally.
>
>   If anyone wants to see a preview of draft-07, you can find it here:
> http://json-schema.org/work-in-progress
>
> thanks,
> -henry
>
>
> On Tue, Oct 24, 2017 at 3:55 PM, TJ Koury <tjkoury@gmail.com> wrote:
>
>>
>>
>> ALCON,
>>
>> This has been asked before several times, but the answers always seem to
>> get muddled and lost in semantics (which is what we’re doing here, so
>> I get it...).
>>
>> For reference:
>>
>> https://github.com/json-ld/json-ld.org/commit/019de59e296c39d7b5c0298d49d95b99fceb294a
>>
>> So there is a JSON-Schema document to validate ANY JSON-LD document
>> against to make sure that it’s a valid JSON-LD document.  Awesome!
>>
>> Now, if I have a http://schema.org/Person, and an associated JSON-Schema
>> document that defines the fields associated with Person, how do I add a URL
>> to the associated JSON-LD document to reference that schema?
>>
>> My specific use case is to generate JSON-LD documents from NIEM
>> instances, then create JSON-Schema documents for each persistence engine
>> (database, file system, etc) that will store the instances, and embed
>> within the JSON-Schema documents themselves the metadata required to create
>> the tables.  Basically a Schema->SQL engine (another ORM!), but entirely
>> based on the JSON-LD and JSON-Schema specs.
>>
>> -TJ
>>
>>
>>
>>
>
>
> --
>
>    -
>
>    *Henry Andrews*  |  Systems Engineer
>    henry@cloudflare.com
>    <https://www.cloudflare.com/>
>
>    1 888 99 FLARE  |  www.cloudflare.com
>    -
>
>
>
>


Received on Wednesday, 25 October 2017 03:14:52 UTC