Re: Forms and principles of the JSON-LD context from Niklas Lindström on 2011-10-31 (public-linked-json@w3.org from October 2011)

From: Niklas Lindström <lindstream@gmail.com>
Date: Tue, 1 Nov 2011 00:41:46 +0100
To: Gregg Kellogg <gregg@kellogg-assoc.com>
Cc: "public-linked-json@w3.org" <public-linked-json@w3.org>
Message-ID: <CADjV5jczS8b0C+r4kgkzBbYkH6U94KDG=yNiKttwAFUMuJTVsA@mail.gmail.com>
Hi Gregg!

2011/10/31 Gregg Kellogg <gregg@kellogg-assoc.com>:
> On Oct 30, 2011, at 2:51 PM, Niklas Lindström wrote:
>
>> Hi all!
>>
>> In the last telecon we discussed changing @coerce to use the terms as
>> keys. AFAIK this is now agreed upon. This brought on a short
>> discussion about the current form of contexts.
>
> I actually added an example based on the telecon discussion in an issue [1]
>
> {
>  "@context":
>  {
>    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
>    "xsd": "http://www.w3.org/2001/XMLSchema#",
>    "name": "http://xmlns.com/foaf/0.1/name",
>    "age":  {"@iri": "http://xmlns.com/foaf/0.1/age", "@coerce": "xsd:integer"},
>    "homepage": {"@iri": "http://xmlns.com/foaf/0.1/homepage", "@coerce": "@iri"},
>    "currentProject": {"@iri": "http://xmlns.com/foaf/0.1/currentProject", "@coerce": ["@iri", "@list"]}
>  },
>  ...
> }

Cool! I missed that.


>> As Manu explained, originally the current form was made for brevity,
>> but we now believe that contexts can turn out to be fairly large
>> anyway and will commonly be linked to as external documents. Thus, the
>> case for brevity is lessened.
>>
>> I then went on to explore some options we may consider regarding how
>> contexts work right now. I promised to do some tests and raise this on
>> the mailing list. My delay in sending this has been since I've wavered
>> a lot in evaluating the results. It's a bit tricky to pick principles
>> against which to evaluate the role and form of the context definition.
>>
>> This is thus not so much a suggestion for a change as an attempt at
>> discussing what we want the context to be. I believe that making
>> things dead simple and keeping them as flat as possible are very
>> important design goals. I also think that the current form of JSON-LD
>> contexts adheres to these quite well. But I'd like to illuminate what
>> the options are, and articulate the reasons for (and possibilities of)
>> various forms.
>>
>> We may express some statements about the role and scope of contexts:
>>
>> * It's more important to easily read contexts than to write them.
>
> +1
>
>> * A context has the following roles:
>>  - It maps terms to IRIs (including terms used as prefixes)
>>  - It can map terms to special processing keys (@iri, @type etc.)
>>  - It can define a default term base using @vocab
>>  - It can map a term to a coercion rule (defining how to interpret a value)
>>  - It can define a default language (requiring plain string values to
>> be explicitly coerced)
>
> It can also define the document base with @base.

Very true.
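
To make the first and third roles above concrete, here is a minimal
Python sketch of term expansion. It is not taken from the spec: the
function name expand_term and its fallback order are assumptions made
for illustration, and it only handles contexts whose values are plain
IRI strings.

```python
# Illustrative context: one prefix, one term, and a default vocabulary.
context = {
    "@vocab": "http://purl.org/dc/terms/",
    "xsd": "http://www.w3.org/2001/XMLSchema#",
    "label": "http://www.w3.org/2000/01/rdf-schema#label",
}


def expand_term(context, name):
    """Expand a term or prefix:suffix name to a full IRI (hypothetical helper)."""
    # Keywords such as @iri or @type are passed through untouched.
    if name.startswith("@"):
        return name
    # A term mapped directly to an IRI string.
    if name in context:
        return context[name]
    # A prefixed name: the part before the colon is itself a term.
    if ":" in name:
        prefix, suffix = name.split(":", 1)
        if prefix in context:
            return context[prefix] + suffix
        return name  # assumed to be an absolute IRI already
    # Otherwise fall back to the default term base, if one is declared.
    if "@vocab" in context:
        return context["@vocab"] + name
    return name


print(expand_term(context, "xsd:integer"))
print(expand_term(context, "created"))
```

Under these assumptions an undeclared name such as "created" lands in
the @vocab namespace, while "xsd:integer" is built from the xsd prefix.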

>> So let's look at an alternative to the current context. It is about
>> combining the declaration of the IRI for a term and an optional
>> coercion. Consider this example in the current form of a context
>> (using the new @coerce form mentioned above):
>>
>>    "@context": {
>>        "@vocab": "http://purl.org/dc/terms/",
>>        "label": "http://www.w3.org/2000/01/rdf-schema#label",
>>        "Document": "http://xmlns.com/foaf/0.1/Document",
>>        "primaryTopic": "http://xmlns.com/foaf/0.1/primaryTopic",
>>        "@coerce":  {
>>            "created": "dateTime",
>>            "creator": "@iri",
>>            "identifier": "string",
>>            "issued": "date",
>>            "updated": "dateTime",
>>            "primaryTopic": "@iri"
>>        }
>>    }
>>
>> We could instead use a form where values can either be the IRI string
>> for the term, or an object defining both the @iri (or none to resolve
>> it to @vocab) and a @coerce rule. Like:
>>
>>    "@context": {
>>        "@vocab": "http://purl.org/dc/terms/",
>>        "created": {
>>            "@coerce": "dateTime"
>>        },
>>        "creator": {
>>            "@coerce": "@iri"
>>        },
>>        "identifier": {
>>            "@coerce": "string"
>>        },
>>        "issued": {
>>            "@coerce": "date"
>>        },
>>        "updated": {
>>            "@coerce": "dateTime"
>>        },
>>        "label": "http://www.w3.org/2000/01/rdf-schema#label",
>>        "Document": "http://xmlns.com/foaf/0.1/Document",
>>        "primaryTopic": {
>>            "@iri": "http://xmlns.com/foaf/0.1/primaryTopic",
>>            "@coerce": "@iri"
>>        }
>>    }
>
> Other than the fact that "dateTime", "string", and "date" have no defined meanings, I see value in this. It can be inferred that if there's an @vocab, then prefixes without a defined IRI are considered to be within that vocabulary and IRI expansion takes place.

Yes, I was actually fully aware that I used XSD datatypes without
declaring them. I just forgot to add the xsd prefixes (and xsd
definition) before sending the examples. It stemmed from another thing
I've been thinking about -- that we might consider having some of the
common XSD terms defined by default, or at least predefine xsd for
use as a prefix. I've attempted to run without any CURIEs at all though.
But at the same time I didn't want to declare e.g. date as a term,
since I only use it as a datatype and wouldn't want to cause a
conflict with that and e.g. dc:date (albeit I rarely if ever use that
property).

If it were not for the added complexity, I'd argue that @datatype
should have some/all XSD terms predefined *locally*, i.e. only be
available in the @datatype lexical space. Granted that'd also make it
troublesome/impossible to bind those names to anything else (if that'd
be needed).
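
With the xsd prefix declared explicitly, the merged form could be read
roughly as follows. This is only a sketch under stated assumptions: the
helpers resolve and term_definition are hypothetical names, not part of
any JSON-LD API, and the context is a shortened variant of the example
above with "xsd:" added to the datatypes.

```python
XSD = "http://www.w3.org/2001/XMLSchema#"

# Merged form: a value is either an IRI string or an object that may
# carry "@iri" and "@coerce".
context = {
    "@vocab": "http://purl.org/dc/terms/",
    "xsd": XSD,
    "created": {"@coerce": "xsd:dateTime"},
    "creator": {"@coerce": "@iri"},
    "label": "http://www.w3.org/2000/01/rdf-schema#label",
    "primaryTopic": {
        "@iri": "http://xmlns.com/foaf/0.1/primaryTopic",
        "@coerce": "@iri",
    },
}


def resolve(context, name):
    """Expand a prefix:suffix name; keywords and other names pass through."""
    if name.startswith("@"):
        return name
    prefix, sep, suffix = name.partition(":")
    if sep and isinstance(context.get(prefix), str):
        return context[prefix] + suffix
    return name


def term_definition(context, term):
    """Return (iri, coercion); coercion is None when none is declared."""
    value = context.get(term)
    if isinstance(value, str):
        return value, None
    value = value or {}
    # No "@iri" given: resolve the term against @vocab.
    iri = value.get("@iri") or context.get("@vocab", "") + term
    coerce = value.get("@coerce")
    return iri, resolve(context, coerce) if coerce else None


print(term_definition(context, "created"))
print(term_definition(context, "primaryTopic"))
```

Here "created" has no @iri of its own, so it resolves against @vocab,
and its datatype is spelled out through the declared xsd prefix rather
than relying on predefined XSD names.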


>> Now, I see that this is contentious. While it does make both the IRI
>> of and the coercion for a term immediately clear to a reader, it isn't
>> a given that those questions have to be answered at once. The current
>> form (terms used as keys in "@coerce") is possibly superior for a
>> reader only looking for what datatype is used for a term.
>>
>> For the (current) first form above to be better, this can be the principle:
>>
>> * When interpreting a context, users are expected to only look at one
>> aspect at a time.
>>
>> A counter-principle could be:
>>
>> * When reading a context, everything applicable to the term should be
>> visible at once.
>>
>> Considering that a context defines a term by mapping it to an IRI and
>> potentially a coercion rule, it might be more cohesive to merge the
>> declarations. (Consider also e.g. the case where you want to define
>> two terms with the same IRI but different coercion, such as
>> creatorName for dc:creator as string and creator for dc:creator as
>> @iri.)
>>
>> However, if contexts were to use this richer term definition object,
>> are we on a path to defining a schema language? Are we opening the
>> door for more complexities?
>
> I think the expression is consistent with the rest of JSON-LD; we are adding another layer to this, but in the same character that JSON Objects are used with keywords to augment literals or lists.

That's a good point. And I believe that the context can handle a
little more detail (to a certain extent), as long as it is crystal
clear. Composing all relevant aspects of a term may be a good thing,
instead of splitting them into different parts of the context (both
for maintenance, readability and usage).
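
As a rough illustration of composing these aspects per term, the
following Python sketch folds a term-keyed "@coerce" map into per-term
definition objects. The function name merge_coerce and the sample
context are assumptions for the sake of the example; the creator and
creatorName terms follow the dc:creator case mentioned earlier.

```python
DC = "http://purl.org/dc/terms/"

# Current form: IRIs and coercion rules live in separate places.
current = {
    "@vocab": DC,
    "xsd": "http://www.w3.org/2001/XMLSchema#",
    "creator": DC + "creator",
    "creatorName": DC + "creator",
    "@coerce": {
        "created": "xsd:dateTime",
        "creator": "@iri",
        "creatorName": "xsd:string",
    },
}


def merge_coerce(context):
    """Fold a term-keyed "@coerce" map into per-term definition objects."""
    merged = {k: v for k, v in context.items() if k != "@coerce"}
    for term, rule in context.get("@coerce", {}).items():
        definition = {"@coerce": rule}
        if term in merged:
            # Keep the explicit IRI next to its coercion rule.
            definition = {"@iri": merged[term], "@coerce": rule}
        merged[term] = definition
    return merged


merged = merge_coerce(current)
print(merged["creator"])
print(merged["creatorName"])
print(merged["created"])
```

In the merged result both creator and creatorName point at the same
IRI but carry different coercion rules, and each term can be read in
one place.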


>> In support of the change, while not yet
>> addressed, there are also one or two things asked for which do not fit
>> squarely into coerce; like support for inverse terms and declaring if
>> there will be a single value or a set (as a JSON list). For those
>> features, this syntax may be more convenient (albeit not a necessity).
>
> My example (above) accomplishes this by using an array of values after @coerce. I'm not a big fan of inverse terms, but if we were to do this, it could be accomplished in much the same way.

Yes. Coercing things into e.g. @lists of @iri:s (or @dates, etc.) was
another thing we discussed (which I think can be quite important).
That's an issue in itself though. (For one, I was thinking we might
express that with e.g. "@coerce": {"@list": "@iri"}. But let's discuss
that separately.)
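
Just to show that the two shapes floated so far carry the same
information, here is a small Python sketch that reads either one. The
function name parse_coerce and its return shape are assumptions, not a
proposal for the spec.

```python
def parse_coerce(rule):
    """Return (is_list, value_rule) for the coercion shapes in this thread."""
    # Object form: {"@list": "@iri"}
    if isinstance(rule, dict) and "@list" in rule:
        return True, rule["@list"]
    # Array form: ["@iri", "@list"]
    if isinstance(rule, list):
        values = [r for r in rule if r != "@list"]
        return "@list" in rule, values[0] if values else None
    # Plain form: "@iri" or a datatype
    return False, rule


print(parse_coerce(["@iri", "@list"]))
print(parse_coerce({"@list": "@iri"}))
```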

I also believe that inverses are very important. Basically all but the
smallest of data sets I've come across are graphs where many resources
have *important* inbound arcs (in my project some of them are
crucial). I need a way to explicitly express that in JSON(-LD) with
"inbound" keys (e.g. with a "@rev" object, or possibly better with
terms explicitly representing inverses of properties). But we have a
separate thread about that already. :)


>> In any case I still believe that the extent to which contexts resemble
>> schemas must be limited to the minimum needed to map JSON syntax to
>> RDF abstract syntax. From there on RDFS and OWL can give thorough
>> descriptions of properties and classes. (That means JSON-LD contexts
>> should reasonably not support e.g. the cardinality and syntactic
>> constraints of JSON-schema, nor any of the advanced concepts of
>> schemas, like the logical descriptive features of OWL.)
>>
>> (To get a feel for these forms, I've put a gist with variations of the
>> context for the project I work with at:
>> <https://gist.github.com/1326420>. (The most glaring thing to me there
>> is the repetition of vocabulary bases (in both forms). But that's
>> another issue (possibly solved using CURIEs).))
>
> Also note problems in both these examples: for one, "dateTime" is not defined. Within the JSON-LD spec, it is defined as "http://www.w3.org/2001/XMLSchema#dateTime".
>
>> Thoughts?
>
> [1] http://json-ld.org/spec/latest/json-ld-syntax/#type-coercion
>
> Thanks for your thoughts. I can see some real advantages in consolidating this information, and it seems to follow given that we're already inclined to reverse the sense of @coerce relations.

Thank you; I agree. Let's continue the discussion and examine how
these matters affect our various use cases and applications.

Best regards,
Niklas
Received on Monday, 31 October 2011 23:47:59 UTC