Aligning Data Model requirements with JSON Schema, over-constrained / under-constrained

As we've discussed on our last few calls, one of our near-term goals for
testing is to identify the requirements / features (i.e., the MUSTs and
maybe some of the SHOULDs) of our data model and then craft test files,
for the most part annotated JSON Schemas, that can be used to test the
output of Web Annotation implementations. It seems likely that there will
be a few instances where a MUST or SHOULD from the data model cannot be
expressed exactly in JSON Schema. In such cases, given the goals of W3C
testing, is it acceptable to apply a JSON Schema that slightly
over-constrains? Or is it better to under-constrain? By way of
illustration, consider this statement from Section 3.1 of the data model [1]:
 
"The Annotation MUST have 1 or more @context values and
http://www.w3.org/ns/anno.jsonld MUST be one of them. If there is only one
value, then it MUST be provided as a string."
 
Below is a JSON Schema to verify that an annotation meets these MUSTs. It
is a slight elaboration of Shane's schema on the Spec-Ops WPT
annotation-model GitHub site [2]. However, the schema provided below is
actually slightly more restrictive than the requirement as stated in the
data model: if the @context used in the annotation instance is an array,
then the first item in the array must be the URL of the Web Annotation
context document (rather than allowing that URL to appear later in the
array's list of items). So example A (Example 1 from the data model)
validates, example B also validates, but example C fails.
 
Example A:
{
  "@context": "http://www.w3.org/ns/anno.jsonld",
  "id": "http://example.org/anno1",
  "type": "Annotation",
  "body": "http://example.org/post1",
  "target": "http://example.com/page1"
}
 
Example B: 
{
  "@context": [
       "http://www.w3.org/ns/anno.jsonld" ,
       "http://schema.org",
       {"lastName": "foaf:familyName",
        "firstName": "foaf:givenName"}       
   ],
  "id": "http://example.org/anno1",
  "type": "Annotation",
  "body": "http://example.org/post1",
  "target": "http://example.com/page1"
}
 
Example C: 
{
  "@context": [
      "http://schema.org", 
      "http://www.w3.org/ns/anno.jsonld" ,
       {"lastName": "foaf:familyName",
        "firstName": "foaf:givenName"}       
   ],
  "id": "http://example.org/anno1",
  "type": "Annotation",
  "body": "http://example.org/post1",
  "target": "http://example.com/page1"
}
 
The Schema:
{
    "$schema": "http://json-schema.org/draft-04/schema#",
    "@context": "https://www.w3.org/ns/JSONtest-v1.jsonld",
    "name": "Verify annotation includes required @context",
    "description": "The Annotation MUST have 1 or more @context values and http://www.w3.org/ns/anno.jsonld MUST be one of them. If there is only one value, then it MUST be provided as a string.",
    "ref": "https://www.w3.org/TR/annotation-model/#annotations", 
    "testType": "manual",
 
    "type": "object",
    "properties": {"@context": {
        "anyOf": [
            {
                "type": "string",
                "enum": ["http://www.w3.org/ns/anno.jsonld"]
            },
            {
                "type": "array",
                "minItems": 1,
                "items": [{
                    "type": "string",
                    "enum": ["http://www.w3.org/ns/anno.jsonld"]
                }]
            }
        ],
        "not": {"type": "object"}
    }},
    "required": ["@context"]
}
 
Benjamin or Shane or somebody more expert in JSON Schema can probably come
up with a way I missed to express the proper constraint on an @context that
is an array, so that the anno.jsonld URI can appear at any position within
the array (I would appreciate knowing). But even if this is not an ideal
illustration, I think the question is still fair to ask: what happens when
there's less than perfect alignment between what can be expressed in JSON
Schema and what we say in the data model?
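
One possible way to express "the URI appears somewhere in the array" in
draft-04, offered as a sketch rather than a tested replacement for the
schema above: draft-04 has no "contains" keyword, but "some item equals X"
can be written as "it is not the case that every item is not X", i.e.
"not" wrapped around an object-form "items". The Python below holds such a
schema fragment as a dict and evaluates it with a tiny hand-written checker
that understands only the keywords used here. The names CONTEXT_SCHEMA and
check are hypothetical, and "minItems": 2 reflects an assumption that a
one-item array violates the "MUST be provided as a string" rule.

```python
ANNO_CONTEXT = "http://www.w3.org/ns/anno.jsonld"

# Hypothetical draft-04 fragment for the "@context" property.
# "not" + "items" + "not" emulates "at least one item equals ANNO_CONTEXT".
CONTEXT_SCHEMA = {
    "anyOf": [
        {"type": "string", "enum": [ANNO_CONTEXT]},
        {
            "type": "array",
            "minItems": 2,  # assumption: a single value must be a string
            "not": {"items": {"not": {"enum": [ANNO_CONTEXT]}}},
        },
    ]
}

JSON_TYPES = {"string": str, "array": list, "object": dict}


def check(schema, value):
    """Evaluate only the keywords used above; not a full validator."""
    if "type" in schema and not isinstance(value, JSON_TYPES[schema["type"]]):
        return False
    if "enum" in schema and value not in schema["enum"]:
        return False
    if isinstance(value, list):
        if len(value) < schema.get("minItems", 0):
            return False
        # Object-form "items": every array item must satisfy the subschema.
        if "items" in schema and not all(
            check(schema["items"], item) for item in value
        ):
            return False
    if "not" in schema and check(schema["not"], value):
        return False
    if "anyOf" in schema and not any(
        check(sub, value) for sub in schema["anyOf"]
    ):
        return False
    return True


# The @context values from Examples A, B and C above.
FOAF_TERMS = {"lastName": "foaf:familyName", "firstName": "foaf:givenName"}
example_a = ANNO_CONTEXT
example_b = [ANNO_CONTEXT, "http://schema.org", FOAF_TERMS]
example_c = ["http://schema.org", ANNO_CONTEXT, FOAF_TERMS]

for label, ctx in (("A", example_a), ("B", example_b), ("C", example_c)):
    print(label, check(CONTEXT_SCHEMA, ctx))
```

With this fragment the position of the anno.jsonld URI in the array no
longer matters, so Example C would be accepted along with A and B, while an
array that omits the URI would still be rejected.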
 
Another place where we might see this is in data types. For example, in
the data model we specify that the range of created (dcterms:created), if
you use it, 'MUST be expressed according to the W3C Datetime Format.' JSON
Schema supports semantic validation of date-time representations that
conform to RFC 3339, Section 5.6. As I recall, every date string that
conforms to RFC 3339 also meets the requirements of W3C DTF, but there may
be some string values that would meet W3C DTF but would not meet RFC 3339.
We could presumably come up with a regex that maps more exactly to W3C DTF
and use that within JSON Schema. But in the interest of time, and to avoid
mistakes, is it good enough just to write our schema for testing to check
against RFC 3339, which may be a little bit more restrictive than W3C DTF?
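
To make the gap concrete, here is a rough Python sketch comparing two
simplified regular expressions, one for the six granularities listed in the
W3C DTF note and one for an RFC 3339 date-time. Both patterns are
assumptions written for illustration: they check syntax only (no range
checks on months, days or hours), and the RFC 3339 pattern accepts only
upper-case "T" and "Z" even though the RFC also permits lower case.

```python
import re

# Simplified W3C DTF: YYYY, YYYY-MM, YYYY-MM-DD, or a full date followed by
# Thh:mm, optional :ss and fraction, and a mandatory time zone designator.
W3CDTF = re.compile(
    r"\d{4}(-\d{2}(-\d{2}"
    r"(T\d{2}:\d{2}(:\d{2}(\.\d+)?)?(Z|[+-]\d{2}:\d{2}))?"
    r")?)?"
)

# Simplified RFC 3339 date-time: full date, "T", time with seconds,
# optional fraction, and a mandatory offset.
RFC3339 = re.compile(
    r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(\.\d+)?(Z|[+-]\d{2}:\d{2})"
)


def is_w3cdtf(text):
    """True if the whole string matches the simplified W3C DTF pattern."""
    return W3CDTF.fullmatch(text) is not None


def is_rfc3339(text):
    """True if the whole string matches the simplified RFC 3339 pattern."""
    return RFC3339.fullmatch(text) is not None


samples = [
    "2016-05-06T22:59:49Z",   # complete date-time
    "2016-05-06T22:59Z",      # no seconds
    "2016-05-06",             # date only
    "2016",                   # year only
]
for value in samples:
    print(value, is_w3cdtf(value), is_rfc3339(value))
```

Under these simplified patterns, the reduced-precision forms (year only,
date only, time without seconds) are accepted as W3C DTF but rejected as
RFC 3339, which is the over-constraint a plain "format": "date-time" check
would introduce.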
 
[1] https://www.w3.org/TR/annotation-model/#annotations  
[2] https://github.com/Spec-Ops/web-platform-tests/blob/master/annotation-model/framework/annotations/verify-context-present.json
 
Thanks,
 
Tim Cole
University of Illinois at UC
 
 
 

Received on Friday, 6 May 2016 22:59:49 UTC