Re: a question about framing test 14 from Dave Longley on 2016-04-05 (public-linked-json@w3.org from April 2016)

From: Dave Longley <dlongley@digitalbazaar.com>
Date: Tue, 5 Apr 2016 14:51:14 -0400
To: james anderson <james@dydra.com>, Linked JSON <public-linked-json@w3.org>
Message-ID: <57040922.2060207@digitalbazaar.com>

On 03/30/2016 05:23 PM, james anderson wrote:
>
>> On 2016-03-30, at 18:42, Dave Longley <dlongley@digitalbazaar.com> wrote:
>>
>> On 03/30/2016 12:07 PM, james anderson wrote:
>>> good afternoon;
>>>
>>>>
>>>> The framing spec is sorely out-of-date and does not include
>>>> options like those specified in that issue.
>>>
>>> this reads as if one has yet to decide, what the framing
>>> algorithm is intended to do.
>>
>> That is correct. The behavior has not yet been formally
>> standardized, though most (if not all) implementations currently
>> have the same behavior. It is a work in progress.
>
> that may be a step forward, but it does not seem a plausible approach to
> refer users to several implementations and say, well, try to figure
> out what they do.

Sorry -- no one has had any spare time to update the specification, and
there really aren't any other options to point people to. Writing any
other sort of documentation to explain it would take just as much time
as updating the spec. I realize the situation is not optimal, but these
specs are all maintained on a volunteer basis, and many of the people
who work on them are currently engaged in several other related
efforts. :)

The spec is up on github -- PRs are welcome!

>
>>
>>>
>>> is there some way to understand the comments to that issue as an
>>> answer to the question about how nested embedding is to work
>>> other than "by default"? i see the options, but the intended
>>> effect is not clear.
>>
>> You may specify these embed options within your frame or as
>> defaults in the API:
>>
>> "@embed": "@always" - For a node that matches the frame filter,
>> every single one of its related nodes will be embedded (or nested)
>> within that node -- and this nesting behavior will recurse through
>> those related nodes until no other related nodes are found or until
>> a cycle is encountered in the graph.
>
> is this independent of any type constraint which may have been
> expressed in a nested frame?

Hopefully this was explained by my other mail.

>
>
>> Note that this may cause data duplication in the output.
>
> why? does this mean it should appear both at all embedded locations
> and at the top level or at the first embedded location and at the top
> level, or at all embedded locations and not at the top level?

It means that whenever there is a node that matches a frame or a
subframe, that node will be embedded at the corresponding location in
the output tree. This can cause duplications in the output.

For example, with this frame:

{
   "@context": {
     "ex:foo": {"@container": "@list"}
   },
   "ex:always": {"@embed": "@always"}
}

And this input:

{
   "@context": {
     "ex:foo": {"@container": "@list"}
   },
   "@graph": [{
     "@id": "_:b1",
     "ex:always": {"@id": "_:b2"}
   }, {
     "@id": "_:b2",
     "ex:foo": "bar"
   }, {
     "@id": "_:b3",
     "ex:always": {"@id": "_:b2"}
   }]
}

The output is (note that the blank node labels are reassigned):

{
   "@context": {
     "ex:foo": {"@container": "@list"}
   },
   "@graph": [{
     "@id": "_:b0",
     "ex:always": {
       "@id": "_:b1",
       "ex:foo": ["bar"]
     }
   }, {
     "@id": "_:b2",
     "ex:always": {
       "@id": "_:b1",
       "ex:foo": ["bar"]
     }
   }]
}

Which has these quads:

_:b0 <ex:always> _:b1 .
_:b1 <ex:foo> _:b3 .
_:b1 <ex:foo> _:b4 .
_:b2 <ex:always> _:b1 .
_:b3 <http://www.w3.org/1999/02/22-rdf-syntax-ns#first> "bar" .
_:b3 <http://www.w3.org/1999/02/22-rdf-syntax-ns#rest> <http://www.w3.org/1999/02/22-rdf-syntax-ns#nil> .
_:b4 <http://www.w3.org/1999/02/22-rdf-syntax-ns#first> "bar" .
_:b4 <http://www.w3.org/1999/02/22-rdf-syntax-ns#rest> <http://www.w3.org/1999/02/22-rdf-syntax-ns#nil> .

You can see the duplicated list there: _:b3 and _:b4 are two separate
list nodes, each holding "bar".
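
To make the duplication concrete, here is a rough Python sketch of
"@always"-style embedding over an in-memory node map. It is not the
algorithm from the spec or from any implementation: `embed_always` and
the `nodes` map are hypothetical names, the context and blank node
relabeling are left out (so labels stay as in the input), values that
are arrays of references are not handled, and the list is written as a
plain Python list.

```python
import copy

# Hypothetical flattened node map for the input above.
nodes = {
    "_:b1": {"@id": "_:b1", "ex:always": {"@id": "_:b2"}},
    "_:b2": {"@id": "_:b2", "ex:foo": ["bar"]},
    "_:b3": {"@id": "_:b3", "ex:always": {"@id": "_:b2"}},
}


def embed_always(node_id, nodes, path=()):
    """Return a copy of the node with every node reference expanded.

    Recursion stops when a node is already on the current path (a cycle);
    in that case only a reference to its subject is emitted.
    """
    if node_id in path:
        return {"@id": node_id}
    out = {"@id": node_id}
    for prop, value in nodes[node_id].items():
        if prop == "@id":
            continue
        if isinstance(value, dict) and value.get("@id") in nodes:
            # Related node: embed a fresh copy at this location.
            out[prop] = embed_always(value["@id"], nodes, path + (node_id,))
        else:
            out[prop] = copy.deepcopy(value)
    return out


# Frame the two nodes that carry ex:always.
framed = [embed_always(node_id, nodes) for node_id in ("_:b1", "_:b3")]

# Both results carry an equal but separate copy of _:b2 and its list.
first, second = framed[0]["ex:always"], framed[1]["ex:always"]
print(first == second, first is second)  # True False
```

Each embed is built independently, so the list under "ex:foo" exists
twice in the result, which is where the two list nodes in the quads come
from.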


>
>>
>> "@embed": "@last" - This behavior will match "@always" except that
>> a list of references to previously embedded nodes will be
>> maintained. When an embedding/nesting operation occurs, that list
>> will be checked to see if the related node has already been
>> embedded/nested. If it has, that embedded node is moved to the new
>> location and, at its old location, replaced with only a reference
>> to its subject. In this way, no data duplication occurs and the
>> "last" position the algorithm touches will be where the embed
>> occurs. This is the default embed behavior because it does not
>> change the data.
>
> this would appear to restrict the operation to an in-memory
> manipulation and would not be feasible for a large stream.

Yes.
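
A rough Python sketch of the "@last" bookkeeping described above may
help show why. It is only an illustration under simplifying
assumptions, not the spec or any implementation: `embed_last`, `build`
and the `nodes` map are hypothetical names, every root is built fresh,
and arrays of references are not handled.

```python
import copy

# Hypothetical flattened node map: _:b1 and _:b3 both refer to _:b2.
nodes = {
    "_:b1": {"@id": "_:b1", "ex:always": {"@id": "_:b2"}},
    "_:b2": {"@id": "_:b2", "ex:foo": ["bar"]},
    "_:b3": {"@id": "_:b3", "ex:always": {"@id": "_:b2"}},
}


def embed_last(roots, nodes):
    """Embed each related node once, at the last place it is reached."""
    embedded = {}  # subject -> (parent dict, property) holding the embed

    def build(node_id, path):
        here = path + (node_id,)
        out = {"@id": node_id}
        for prop, value in nodes[node_id].items():
            if prop == "@id":
                continue
            ref = value.get("@id") if isinstance(value, dict) else None
            if ref not in nodes or ref in here:
                # Plain value, unknown subject, or a cycle: copy as-is.
                out[prop] = copy.deepcopy(value)
            elif ref in embedded:
                # Already embedded: move it here, leave a reference behind.
                old_parent, old_prop = embedded[ref]
                out[prop] = old_parent[old_prop]
                old_parent[old_prop] = {"@id": ref}
                embedded[ref] = (out, prop)
            else:
                out[prop] = build(ref, here)
                embedded[ref] = (out, prop)
        return out

    return [build(root, ()) for root in roots]


result = embed_last(["_:b1", "_:b3"], nodes)
print(result[0])  # {'@id': '_:b1', 'ex:always': {'@id': '_:b2'}}
```

The entry for _:b1 is rewritten after it was first produced, once _:b3
is reached, so the earlier part of the output cannot be emitted until
the whole result is known.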

>
>>
>> "@embed": "@never" - This behavior will always cause related nodes
>> to be referenced by their subject only, never embedded/nested.
>
> you mean, to appear as top-level nodes only. i thought they are
> always referenced by subject, no matter where they would appear.

Yes. To be clear, this is what I mean:

With this frame:

{
   "ex:never": {"@embed": "@never"}
}

And this input:

{
   "@id": "_:b1",
   "ex:never": {
     "@id": "_:b2",
     "ex:foo": "abc"
   }
}

The output is:

{
   "@graph": [{
     "@id": "_:b0",
     "ex:never": {
       "@id": "_:b1"
     }
   }]
}

You can see the "ex:foo" property has been dropped.
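
In code terms, the effect on this input is roughly the following. This
is a hedged Python sketch, not the framing algorithm itself:
`reference_only` is a hypothetical helper that works directly on the
nested input, and it leaves out the "@graph" wrapper and the blank node
relabeling seen in the real output.

```python
# The input from above, as a Python dict.
doc = {
    "@id": "_:b1",
    "ex:never": {
        "@id": "_:b2",
        "ex:foo": "abc",
    },
}


def reference_only(node):
    """Copy a node, reducing every nested node to a bare subject reference."""
    out = {}
    for prop, value in node.items():
        if isinstance(value, dict) and "@id" in value:
            # Related node: keep only its subject, drop its properties.
            out[prop] = {"@id": value["@id"]}
        else:
            out[prop] = value
    return out


framed = reference_only(doc)
print(framed)  # {'@id': '_:b1', 'ex:never': {'@id': '_:b2'}}
```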


>
>>
>> "@embed": "@link" - This behavior will do the same thing that
>> "@always" does, however, without causing any data duplication.
>> Instead, an in-memory reference will be used whenever an
>> embedded/nested node appears more than once.
>
> what does this mean if the process is to encode the data to a stream
> rather than to reconstruct an in-memory model?

The framing algorithms are presently designed for in-memory processing.
More work needs to be done for streaming.
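
As a rough illustration of why "@link" only makes sense in memory, here
is a Python sketch under stated assumptions: `embed_link` and the
`nodes` map are hypothetical names, and this is not how any particular
implementation does it. Every subject gets exactly one object, and
references are wired to that object instead of to a copy.

```python
import json

# Hypothetical flattened node map: _:b1 and _:b3 both refer to _:b2.
nodes = {
    "_:b1": {"@id": "_:b1", "ex:always": {"@id": "_:b2"}},
    "_:b2": {"@id": "_:b2", "ex:foo": ["bar"]},
    "_:b3": {"@id": "_:b3", "ex:always": {"@id": "_:b2"}},
}


def embed_link(nodes):
    """Build one object per subject and point references at those objects."""
    linked = {node_id: {"@id": node_id} for node_id in nodes}
    for node_id, node in nodes.items():
        for prop, value in node.items():
            if prop == "@id":
                continue
            if isinstance(value, dict) and value.get("@id") in linked:
                linked[node_id][prop] = linked[value["@id"]]  # shared object
            else:
                linked[node_id][prop] = value
    return linked


linked = embed_link(nodes)

# In memory there is a single _:b2 object, reachable from both parents.
shared = linked["_:b1"]["ex:always"] is linked["_:b3"]["ex:always"]
print(shared)  # True

# Serializing loses the sharing: the node is written out once per parent.
text = json.dumps([linked["_:b1"], linked["_:b3"]])
print(text.count('"bar"'))  # 2
```

Once the result is written out as JSON, the shared object turns back
into duplicated data, and a graph with cycles cannot be written this way
at all (json.dumps raises ValueError on a circular reference).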


-- 
Dave Longley
CTO
Digital Bazaar, Inc.
http://digitalbazaar.com
Received on Tuesday, 5 April 2016 18:51:37 UTC