- From: Dave Longley <dlongley@digitalbazaar.com>
- Date: Wed, 24 Aug 2011 00:03:05 -0400
- To: public-linked-json@w3.org
During the most recent telecon we briefly discussed changing the framing
API so that it no longer returns NULL. The reason for doing this seemed
to be a general feeling that when NULL is returned it indicates an
"error" and errors should be indicated through exceptions instead. It
also wasn't very clear to those who haven't yet worked directly with
JSON-LD framing what was really being discussed and what the potential
issues were. So I decided that I'd send an email out explaining the
current state of framing in a little more detail and then talk about the
"NULL vs {} issue" from the telecon. Perhaps we can also integrate some
of the language here into the spec's explanation on framing. If you have
already worked extensively with frames, feel free to skip to the bottom
of this email to the telecon issue discussion.
JSON-LD Framing
There is often more than one way to represent the same directed graph in
JSON-LD. The subjects in the graph might be arranged in a flat
structure, much like the output of the JSON-LD normalization algorithm.
Alternatively, the subjects might be expressed in a way that is more
natural to many JSON developers, as leaves in a tree. However, there are
many different trees that could be constructed to represent the same
directed graph. JSON-LD framing allows JSON developers to work more
naturally with directed graphs by structuring them in a way that they
specify.
A JSON-LD frame can be thought of both as a scaffold and as a filtering
mechanism. When a JSON-LD frame is applied to a JSON-LD document, the
resulting output is the content of the JSON-LD document that passed the
frame's filters structured in a way that mirrors the way the filters are
structured in the frame.
A frame can filter content in two ways: strict-typing and duck-typing.
A frame that specifies a strict-type filter will only allow subjects
from the JSON-LD document that have a @type that matches the filter into
the output. A frame that does not specify a strict-type filter will
allow any subject that matches the duck-type specified by the filter
into the output. For instance:
A frame that uses strict-typing:
{"@type": "http://example.com/my-type"}
This frame will match the first subject found in a JSON-LD document that
has the @type "http://example.com/my-type". Note that "the first" is
determined by JSON-LD normalization order. To match all subjects with
that @type, this frame would be used:
[{"@type": "http://example.com/my-type"}]
A frame that uses duck-typing:
{"http://example.com/my-property": {}}
This frame will match the first subject found in a JSON-LD document that
has at least the property "http://example.com/my-property".
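
As a rough illustration of the two kinds of filter, the matching step
might be sketched in Python as follows. This is not the framing
algorithm itself: the function name `matches` and the sample subjects
are made up for this example, the subjects are assumed to already be in
normalization order, and requiring all listed properties for a
duck-type filter with several properties is an assumption.

```python
def matches(subject, frame):
    """Return True if `subject` passes the filter `frame` (simplified sketch)."""
    if "@type" in frame:
        # Strict-typing: the subject's @type must match the frame's @type.
        wanted = frame["@type"]
        wanted = wanted if isinstance(wanted, list) else [wanted]
        have = subject.get("@type", [])
        have = have if isinstance(have, list) else [have]
        return any(t in have for t in wanted)
    # Duck-typing: the subject must have the properties named in the frame
    # (assumption: all of them, when more than one is given).
    return all(key in subject for key in frame if not key.startswith("@"))


# Hypothetical subjects, assumed to already be in normalization order.
subjects = [
    {"@subject": "http://example.com/a", "@type": "http://example.com/my-type"},
    {"@subject": "http://example.com/b", "http://example.com/my-property": "x"},
    {"@subject": "http://example.com/c", "@type": "http://example.com/my-type"},
]

typed = {"@type": "http://example.com/my-type"}
duck = {"http://example.com/my-property": {}}

# An object frame keeps only the first match; an array frame keeps all of them.
first_typed = next(s for s in subjects if matches(s, typed))
all_typed = [s for s in subjects if matches(s, typed)]
first_duck = next(s for s in subjects if matches(s, duck))
```

Here `first_typed` is subject "a", `all_typed` holds subjects "a" and
"c", and `first_duck` is subject "b".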
Frames may also include @contexts:
{
"@context": { "mytype": "http://example.com/my-type" },
"@type": "mytype"
}
When a frame includes a @context, that same @context will be applied to
the output.
Now, which subjects will pass through a filter also depends on where in
the frame structure the filter occurs. For instance, if we look at the
duck-typing example from above, there are actually two filters being
used. The first filter works on the JSON-LD document to find a subject
with the property "http://example.com/my-property". But the second
filter is the empty {}. This filter will cause only the first object for
that property to be present in the output. If that filter were instead
an array [], then all objects for that property would be present in the
output:
{"http://example.com/my-property": []}
Furthermore, each filter, by default, will "embed" subjects in the
output. This is how a tree structure gets specified and built. For
instance, if the JSON-LD document that the above frame was applied to
was this:
[{
"@subject": "http://example.com/subject1",
"http://example.com/my-property": {"@iri": "http://example.com/subject2"}
},
{
"@subject": "http://example.com/subject2",
"http://example.com/foo": "42"
}]
Then the output would be this:
{
"@subject": "http://example.com/subject1",
"http://example.com/my-property": [{
"@subject": "http://example.com/subject2",
"http://example.com/foo": "42"
}]
}
Take note that the value of the "http://example.com/my-property" key is
still an array. If an array is specified in a frame for a property other
than @type, then that property's value will always be an array, even if
the output has 0 or 1 matching values. If an array is specified for the
@type property, then a subject that contains any of the types in the
array will be considered a match for the filter.
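
The embedding example above can be reproduced with a small Python
sketch. It only distinguishes an object filter ({}) from an array
filter ([]) and does not check whether the embedded subject matches any
sub-filter; the helper name `frame_property` and the `by_id` lookup
table are assumptions made for illustration, and Python's None stands
in for JSON null.

```python
def frame_property(subject, prop, prop_filter, by_id):
    """Frame one property of `subject`, embedding referenced subjects."""
    values = subject.get(prop, [])
    values = values if isinstance(values, list) else [values]

    embedded = []
    for value in values:
        if isinstance(value, dict) and value.get("@iri") in by_id:
            # Replace the reference with a copy of the full subject.
            embedded.append(dict(by_id[value["@iri"]]))
        else:
            embedded.append(value)

    if isinstance(prop_filter, list):
        # An array filter always yields an array, even for 0 or 1 values.
        return embedded
    # An object filter yields only the first value (None if there is none).
    return embedded[0] if embedded else None


document = [
    {
        "@subject": "http://example.com/subject1",
        "http://example.com/my-property": {"@iri": "http://example.com/subject2"},
    },
    {
        "@subject": "http://example.com/subject2",
        "http://example.com/foo": "42",
    },
]
by_id = {s["@subject"]: s for s in document}

prop = "http://example.com/my-property"
subject1 = document[0]
as_array = {**subject1, prop: frame_property(subject1, prop, [], by_id)}
as_object = {**subject1, prop: frame_property(subject1, prop, {}, by_id)}
```

With the array filter, `as_array` has the same shape as the output
shown above; with the object filter, the property holds the embedded
subject directly rather than a one-element array.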
Hopefully from these examples, one can extrapolate how complex tree
structures can be specified via framing. There are some more details and
options involved in framing that I'll mention:
From the last example you can see that the "http://example.com/foo"
property was pulled in for the embedded subject even though it wasn't
specified in the frame filter. By default, any properties that are not
explicitly mentioned in the frame are included in the output, so long as
the subject itself matches the strict-type or duck-type specified.
However, this behavior can be modified by using a frame keyword
@explicit. If a frame filter has "@explicit" set to true, then when that
filter is applied, the output will only include those properties that
are explicitly mentioned.
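
A minimal Python sketch of the @explicit behavior, assuming that
keywords such as @subject and @type are always kept in the output (the
text above does not say either way). The function name
`apply_explicit` and the "http://example.com/bar" property are made up
for this example.

```python
def apply_explicit(subject, frame):
    """Drop properties not named in the frame when @explicit is true."""
    if not frame.get("@explicit", False):
        # Default behavior: every property of a matching subject is kept.
        return dict(subject)
    # Assumption: keyword entries (keys starting with "@") are always kept.
    return {
        key: value
        for key, value in subject.items()
        if key.startswith("@") or key in frame
    }


subject = {
    "@subject": "http://example.com/subject2",
    "http://example.com/foo": "42",
    "http://example.com/bar": "baz",
}
default_frame = {"http://example.com/foo": {}}
explicit_frame = {"@explicit": True, "http://example.com/foo": {}}

kept_all = apply_explicit(subject, default_frame)
kept_some = apply_explicit(subject, explicit_frame)
```

`kept_all` still contains "http://example.com/bar", while `kept_some`
contains only @subject and "http://example.com/foo".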
Some related behavior worth noting occurs when a strict-type filter
also specifies other properties. In this case, a subject that matches
the strict-type will be present in the output even if it lacks some of
those properties, and the missing properties will be set to NULL. This
is done so that a developer only needs to check a property for NULL,
which is believed to be fairly natural in JSON, rather than checking it
for existence. This
relates to the issue discussed on the telecon and I will come back to it
later.
If returning NULL for missing properties is not desired behavior, then
the value that is returned for missing properties can be modified using the
frame keywords: @default and @omitDefault. The @default keyword may be
set in a frame filter to a value to return instead of NULL whenever a
property is missing. The @omitDefault keyword, when set to true, will
simply not include the property in the output.
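
The handling of missing properties might be sketched in Python like
this. It assumes that @default and @omitDefault are placed inside the
filter given as the property's value in the frame; the text above does
not pin down the exact placement. The function name `fill_missing` and
the property names are hypothetical, and None stands in for JSON null.

```python
def fill_missing(subject, frame):
    """Fill in properties that the frame names but the subject lacks."""
    out = dict(subject)
    for prop, prop_filter in frame.items():
        if prop.startswith("@") or prop in out:
            continue  # keyword, or the subject already has this property
        if isinstance(prop_filter, list):
            out[prop] = []  # an array filter always yields an array
            continue
        options = prop_filter if isinstance(prop_filter, dict) else {}
        if options.get("@omitDefault", False):
            continue  # leave the property out of the output entirely
        # Use @default when given, otherwise fall back to None (JSON null).
        out[prop] = options.get("@default", None)
    return out


subject = {
    "@subject": "http://example.com/subject1",
    "@type": "http://example.com/my-type",
}
frame = {
    "@type": "http://example.com/my-type",
    "http://example.com/a": {},
    "http://example.com/b": {"@default": "n/a"},
    "http://example.com/c": {"@omitDefault": True},
    "http://example.com/d": [],
}
result = fill_missing(subject, frame)
```

In `result`, property "a" is None, "b" is "n/a", "c" is absent, and "d"
is an empty list.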
The last option in framing involves the keyword @embed. As I mentioned
earlier, by default, subjects will be embedded according to frame filter
structure. To change this behavior on a per-filter basis, you set the
@embed property to false in a frame filter. This will cause only the
@iri of a subject to be used as the object value of a property rather
than the full subject and all of its properties. There is also a
restriction in the current framing algorithm that requires that subjects
only be embedded up to once in an output document, so it is sometimes
necessary to specify @embed for complicated structures that reference
the same subject in multiple places in the tree.
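
A small Python sketch of the choice between embedding a subject and
emitting only its @iri. What the real algorithm does when it meets a
subject that has already been embedded is not described here, so
falling back to a bare reference in that case is purely an assumption
of this sketch, as are the names `embed_or_reference` and
`already_embedded`.

```python
def embed_or_reference(iri, prop_filter, by_id, already_embedded):
    """Return the full subject for `iri`, or just an @iri reference."""
    wants_embed = prop_filter.get("@embed", True)
    if not wants_embed or iri in already_embedded or iri not in by_id:
        # @embed is false, the subject is unknown, or (assumption) it has
        # already been embedded once elsewhere in the output.
        return {"@iri": iri}
    already_embedded.add(iri)
    return dict(by_id[iri])


by_id = {
    "http://example.com/subject2": {
        "@subject": "http://example.com/subject2",
        "http://example.com/foo": "42",
    }
}
iri = "http://example.com/subject2"

seen = set()
first = embed_or_reference(iri, {}, by_id, seen)
second = embed_or_reference(iri, {}, by_id, seen)
reference_only = embed_or_reference(iri, {"@embed": False}, by_id, set())
```

`first` is the full subject, while `second` and `reference_only` are
both just {"@iri": "http://example.com/subject2"}.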
There may be a keyword added in the future called @sort. This would be
used to sort the objects of a property (when it has more than one). It
would specify the property of the objects (if they are subjects) to sort
according to and the sort order (ascending or descending). This relates
to providing JSON developers a consistent sort order for working with
data that isn't a @list.
Hopefully this explanation sheds some light on how framing works and
what to expect when crafting a frame to structure your data.
---
So, getting back to the telecon issue.
As mentioned before, when a property does not exist in a subject that
matches a frame filter, that property, by default, is set to a value of
NULL in the output. Similarly, if a frame filter of {} is specified for
a property, as opposed to [], and no value matches that property, then
it will also be set to NULL in the output. This holds true for the
"top-level" of an output tree as well as any of its branches. This means
that if an object (as opposed to array) frame was applied to a JSON-LD
document, and none of the subjects matched the "top-level" filter in the
frame, the output would be NULL.
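
To make the top-level case concrete, here is a Python sketch that
handles only a strict-type filter with a single @type string. The
function name `frame_top_level` and the sample subject are made up for
this example, a non-empty array frame is assumed, and None stands in
for JSON null.

```python
def frame_top_level(subjects, frame):
    """Apply a top-level strict-type frame (object or array form)."""
    is_array = isinstance(frame, list)
    wanted = (frame[0] if is_array else frame)["@type"]
    found = [s for s in subjects if s.get("@type") == wanted]
    if is_array:
        return found  # an array frame yields [] when nothing matches
    # An object frame yields the first match, or None when nothing matches.
    return found[0] if found else None


subjects = [
    {"@subject": "http://example.com/a", "@type": "http://example.com/other-type"},
]
object_frame = {"@type": "http://example.com/my-type"}
array_frame = [{"@type": "http://example.com/my-type"}]

object_result = frame_top_level(subjects, object_frame)
array_result = frame_top_level(subjects, array_frame)
```

Nothing matches in either case, so `object_result` is None and
`array_result` is an empty list.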
It was suggested on the call that we change the output of a "top-level"
match of none from NULL to {}. Without considering anything other than
top-level matches, I don't think that there's any issue with this.
However, when you consider that NULL is returned for non-top-level
matches (property matches), then it seems to me that we're being
inconsistent (which isn't necessarily a bad thing). Furthermore, if we
wanted to be consistent, we should also set properties with no matches
to {} -- but this is problematic as it would seem to potentially
conflict with properties that have specific ranges. For instance, a
property may be only a string or only an integer, and here we've gone
and set it to an object. Setting it to NULL instead, IMO, seems to avoid
this strangeness.
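
The range concern can be illustrated from the consumer's side with a
short Python sketch. The function `describe_foo` is hypothetical, and
"http://example.com/foo" is assumed to be a property whose range is a
string.

```python
def describe_foo(subject):
    """Consumer code that expects foo to be a string or missing (None)."""
    foo = subject["http://example.com/foo"]
    if foo is None:
        return "no foo"
    return "foo=" + foo  # relies on foo being a string


with_null = {"http://example.com/foo": None}
with_object = {"http://example.com/foo": {}}

null_description = describe_foo(with_null)

try:
    describe_foo(with_object)
    object_failed = False
except TypeError:
    # {} is neither None nor a string, so the string handling breaks.
    object_failed = True
```

With NULL a single check is enough, whereas with {} the consumer would
also have to test the value's type before using it.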
For those who were in support of using {} at the top-level rather than
NULL, do you still have the same opinion now that you (perhaps) have a
more in-depth view of JSON-LD framing? What do you think of the
non-top-level cases?
To be clear, I'm not necessarily opposed to changing the framing API to
return {} rather than NULL, but I want to make sure that we're making an
informed decision about it; I felt that it was more natural to work with
NULL under the circumstances but I may not be in the majority.
-Dave
--
Dave Longley
CTO
Digital Bazaar, Inc.
Received on Wednesday, 24 August 2011 04:03:41 UTC