- From: Dave Longley <dlongley@digitalbazaar.com>
- Date: Wed, 24 Aug 2011 00:03:05 -0400
- To: public-linked-json@w3.org
During the most recent telecon we briefly discussed changing the framing API so that it no longer returns NULL. The reason for doing this seemed to be a general feeling that when NULL is returned it indicates an "error" and errors should be indicated through exceptions instead. It also wasn't very clear to those who haven't yet worked directly with JSON-LD framing what was really being discussed and what the potential issues were. So I decided that I'd send an email out explaining the current state of framing in a little more detail and then talk about the "NULL vs {} issue" from the telecon. Perhaps we can also integrate some of the language here into the spec's explanation on framing.

If you have already worked extensively with frames, feel free to skip to the bottom of this email to the telecon issue discussion.

JSON-LD Framing

There is often more than one way to represent the same directed graph in JSON-LD. The subjects in the graph might be arranged in a flat structure, much like the output of the JSON-LD normalization algorithm. Alternatively, the subjects might be expressed in a way that is more natural to many JSON developers, as leaves in a tree. However, there are many different trees that could be constructed to represent the same directed graph. JSON-LD framing allows JSON developers to work more naturally with directed graphs by structuring them in a way that they specify.

A JSON-LD frame can be thought of both as a scaffold and as a filtering mechanism. When a JSON-LD frame is applied to a JSON-LD document, the resulting output is the content of the JSON-LD document that passed the frame's filters, structured in a way that mirrors the way the filters are structured in the frame.

A frame can filter content in two ways: strict-typing and duck-typing. A frame that specifies a strict-type filter will only allow subjects from the JSON-LD document that have a @type that matches the filter into the output.
A frame that does not specify a strict-type filter will allow any subject that matches the duck-type specified by the filter into the output. For instance:

A frame that uses strict-typing:

  {"@type": "http://example.com/my-type"}

This frame will match the first subject found in a JSON-LD document that has the @type "http://example.com/my-type". Note that "the first" is determined by JSON-LD normalization order. To match all subjects with that @type, this frame would be used:

  [{"@type": "http://example.com/my-type"}]

A frame that uses duck-typing:

  {"http://example.com/my-property": {}}

This frame will match the first subject found in a JSON-LD document that has at least the property "http://example.com/my-property".

Frames may also include @contexts:

  {
    "@context": {
      "mytype": "http://example.com/my-type"
    },
    "@type": "mytype"
  }

When a frame includes a @context, that same @context will be applied to the output.

Now, which subjects will pass through a filter also depends on where in the frame structure the filter occurs. For instance, if we look at the duck-typing example from above, there are actually two filters being used. The first filter works on the JSON-LD document to find a subject with the property "http://example.com/my-property". But the second filter is the empty {}. This filter will cause only the first object for that property to be present in the output. If that filter were instead an array [], then all objects for that property would be present in the output:

  {"http://example.com/my-property": []}

Furthermore, each filter, by default, will "embed" subjects in the output. This is how a tree structure gets specified and built.
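
Setting embedding aside for a moment, the top-level matching rules described so far could be sketched roughly as follows in Python. This is only an illustration, not the actual framing implementation: the helper names (as_list, matches, frame_top_level) are made up for the example, the subjects are assumed to already be in normalization order, and @context handling is ignored, so types must be given as full IRIs.

```python
def as_list(value):
    """Wrap a single value in a list so one-or-many values are handled alike."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def matches(subject, flt):
    """Return True if the subject passes the (hypothetical) filter flt."""
    if "@type" in flt:
        # Strict-typing: the subject must carry the type named in the filter.
        wanted = as_list(flt["@type"])
        return any(t in wanted for t in as_list(subject.get("@type")))
    # Duck-typing: the subject must have every property named in the filter.
    props = [key for key in flt if not key.startswith("@")]
    return all(prop in subject for prop in props)


def frame_top_level(subjects, frame):
    """Object frame: first match, or None. Array frame: list of all matches."""
    if isinstance(frame, list):
        flt = frame[0] if frame else {}
        return [s for s in subjects if matches(s, flt)]
    return next((s for s in subjects if matches(s, frame)), None)


# Assumed to be in normalization order already.
subjects = [
    {"@subject": "http://example.com/a", "@type": "http://example.com/my-type"},
    {"@subject": "http://example.com/b", "http://example.com/my-property": "x"},
    {"@subject": "http://example.com/c", "@type": "http://example.com/my-type"},
]

first = frame_top_level(subjects, {"@type": "http://example.com/my-type"})
every = frame_top_level(subjects, [{"@type": "http://example.com/my-type"}])
duck = frame_top_level(subjects, {"http://example.com/my-property": {}})
```

Here the object frame picks only subject "a", the array frame picks "a" and "c", and the duck-typed frame picks "b".
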
To see embedding in action, suppose the JSON-LD document that the array frame above was applied to was this:

  [{
    "@subject": "http://example.com/subject1",
    "http://example.com/my-property": {"@iri": "http://example.com/subject2"}
  }, {
    "@subject": "http://example.com/subject2",
    "http://example.com/foo": "42"
  }]

Then the output would be this:

  {
    "@subject": "http://example.com/subject1",
    "http://example.com/my-property": [{
      "@subject": "http://example.com/subject2",
      "http://example.com/foo": "42"
    }]
  }

Take note that the value of the "http://example.com/my-property" key is still an array. If an array is specified in a frame for a property other than @type, then that property's value will always be an array, even if the output has 0 or 1 matching values. If an array is specified for the @type property, then a subject that contains any of the types in the array will be considered a match for the filter.

Hopefully from these examples, one can extrapolate how complex tree structures can be specified via framing. There are some more details and options involved in framing that I'll mention:

From the last example you can see that the "http://example.com/foo" property was pulled in for the embedded subject even though it wasn't specified in the frame filter. By default, any properties that are not explicitly mentioned in the frame are included in the output, so long as the subject itself matches the strict-type or duck-type specified. However, this behavior can be modified by using the frame keyword @explicit. If a frame filter has "@explicit" set to true, then when that filter is applied, the output will only include those properties that are explicitly mentioned.

Some related behavior that is worth noting occurs when a strict-type filter is used that also specifies other properties. In this case, a subject that matches the strict-type will be present in the output, but any of those properties that it lacks will appear in the output set to NULL.
This is done so that a developer needs to only check a property for NULL, which is believed to be fairly natural in JSON, rather than checking it for existence. This relates to the issue discussed on the telecon and I will come back to it later. If returning NULL for missing properties is not desired behavior, then the value that is returned for missing properties can be modified using the frame keywords @default and @omitDefault. The @default keyword may be set in a frame filter to a value to return instead of NULL whenever a property is missing. The @omitDefault keyword, when set to true, will simply not include the property in the output.

The last option in framing involves the keyword @embed. As I mentioned earlier, by default, subjects will be embedded according to frame filter structure. To change this behavior on a per-filter basis, you set the @embed property to false in a frame filter. This will cause only the @iri of a subject to be used as the object value of a property rather than the full subject and all of its properties. There is also a restriction in the current framing algorithm that requires that subjects only be embedded up to once in an output document, so it is sometimes necessary to specify @embed for complicated structures that reference the same subject in multiple places in the tree.

There may be a keyword added in the future called @sort. This would be used to sort the objects of a property (when it has more than one). It would specify the property of the objects (if they are subjects) to sort according to and the sort order (ascending or descending). This relates to providing JSON developers a consistent sort order for working with data that isn't a @list.

Hopefully this explanation sheds some light on how framing works and what one's expectations should be when crafting a frame to structure your data.

---

So, getting back to the telecon issue.
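
Since the issue hinges on how missing properties are filled in, here is a rough Python sketch of those rules. It is an illustration under stated assumptions rather than the real algorithm: shape_subject is a made-up name, the sketch assumes @default sits inside the property's own filter and @omitDefault on the enclosing filter, it assumes keywords such as @subject and @type are always kept under @explicit, and it does no embedding at all.

```python
def shape_subject(subject, flt):
    """Copy a matched subject, applying @explicit, @default and @omitDefault.

    Hypothetical helper: keyword placement is an assumption of this sketch.
    """
    explicit = flt.get("@explicit", False)
    omit = flt.get("@omitDefault", False)
    named = [key for key in flt if not key.startswith("@")]

    out = {}
    for key, value in subject.items():
        # Keywords are always kept (assumption); other properties are kept
        # unless @explicit limits the output to those named in the filter.
        if key.startswith("@") or not explicit or key in named:
            out[key] = value

    for prop in named:
        if prop in subject:
            continue
        prop_filter = flt[prop]
        if isinstance(prop_filter, list):
            # An array in the frame always yields an array, even when empty.
            out[prop] = []
        elif omit:
            # @omitDefault: leave the missing property out entirely.
            continue
        else:
            # NULL (None) unless the property's filter supplies @default.
            out[prop] = prop_filter.get("@default")
    return out


T = "http://example.com/my-type"
FOO = "http://example.com/foo"
BAR = "http://example.com/bar"
subject = {"@subject": "http://example.com/subject1", "@type": T, FOO: "42"}

plain = shape_subject(subject, {"@type": T, BAR: {}})
strict = shape_subject(subject, {"@type": T, "@explicit": True, BAR: {}})
filled = shape_subject(subject, {"@type": T, BAR: {"@default": "n/a"}})
omitted = shape_subject(subject, {"@type": T, "@omitDefault": True, BAR: {}})
```

In this sketch, plain keeps FOO and sets BAR to None, strict drops FOO but still sets BAR to None, filled sets BAR to "n/a", and omitted has no BAR key at all.
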
As mentioned before, when a property does not exist in a subject that matches a frame filter, that property, by default, is set to a value of NULL in the output. Similarly, if a frame filter of {} is specified for a property, as opposed to [], and no value matches that property, then it will also be set to NULL in the output. This holds true for the "top-level" of an output tree as well as any of its branches. This means that if an object (as opposed to array) frame was applied to a JSON-LD document, and none of the subjects matched the "top-level" filter in the frame, the output would be NULL.

It was suggested on the call that we change the output of a "top-level" match of none from NULL to {}. Without considering anything other than top-level matches, I don't think that there's any issue with this. However, when you consider that NULL is returned for non-top-level matches (property matches), then it seems to me that we're being inconsistent (which isn't necessarily a bad thing). Furthermore, if we wanted to be consistent, we should also set properties with no matches to {} -- but this is problematic as it would seem to potentially conflict with properties that have specific ranges. For instance, a property may be only a string or only an integer, and here we've gone and set it to an object. Setting it to NULL instead, IMO, seems to avoid this strangeness.

For those who were in support of using {} at the top-level rather than NULL, do you still have the same opinion now that you (perhaps) have a more in-depth view of JSON-LD framing? What do you think of the non-top-level cases?

To be clear, I'm not necessarily opposed to changing the framing API to return {} rather than NULL, but I want to make sure that we're making an informed decision about it; I felt that it was more natural to work with NULL under the circumstances but I may not be in the majority.

-Dave

--
Dave Longley
CTO
Digital Bazaar, Inc.
Received on Wednesday, 24 August 2011 04:03:41 UTC