The member extraction algorithm: resolution for issue #71

Hi all,

  * I’ve implemented the Member Extraction Algorithm discussed during 
the last call (slides: 
https://docs.google.com/presentation/d/13TDYqhyoNTGm0kUPSpgXUfDcbsH_zHK301H2rLk1eQY/edit#slide=id.g2494e1ca6ae_0_0) 
in https://github.com/pietercolpaert/extract-cbd-shape

  * I’d like to submit Pull Request 78 
(https://github.com/TREEcg/specification/pull/78) on the spec to the 
group for approval during the call on the 27th of September at 15:00 
CEST (call link: 
https://teams.microsoft.com/l/meetup-join/19%3ameeting_NDkwMWZlMzMtM2FjZi00MjZhLTlhZTMtNjAwMjU5Yjc3YWVi%40thread.v2/0?context=%7b%22Tid%22%3a%22a72d5a72-25ee-40f0-9bd1-067cb5b770d4%22%2c%22Oid%22%3a%22074b6191-940e-49de-964e-f2919f3f8501%22%7d 
→ please put this in your agenda if it isn’t there already)

During the call on the 27th I will present various example cases and 
show how the member extraction algorithm handles them. In the meantime, 
could I ask you to already review the PR? 
https://github.com/TREEcg/specification/pull/78

Kind regards,

Pieter


On 30/08/2023 14:54, Pieter Colpaert wrote:
>
> Hi all,
>
> Today’s meeting gave an ACK on this plan for the Shape Template 
> extraction algorithm (feature 4) and I’ll start implementing and 
> creating a draft spec text for it in PR78. Slides and explanation below.
>
> There was also a request for more examples. While implementing this, I 
> will come up with more test cases that can serve as examples of 
> expected behavior.
>
> The next TREE CG meeting will be dedicated to reviewing the Pull 
> Request and looking through these concrete examples. It will be held 
> on the *27th of September at 15:00 CEST* (same link; please add it to 
> your own calendar now).
>
> Kind regards,
>
> Pieter
>
> On 29/08/2023 21:16, Pieter Colpaert wrote:
>>
>> Hi all,
>>
>> Tomorrow at 14:00 CEST we meet on this link: 
>> https://teams.microsoft.com/l/meetup-join/19%3ameeting_NDkwMWZlMzMtM2FjZi00MjZhLTlhZTMtNjAwMjU5Yjc3YWVi%40thread.v2/0?context=%7b%22Tid%22%3a%22a72d5a72-25ee-40f0-9bd1-067cb5b770d4%22%2c%22Oid%22%3a%22074b6191-940e-49de-964e-f2919f3f8501%22%7d
>>
>> You can find a link to the slides here: 
>> https://docs.google.com/presentation/d/13TDYqhyoNTGm0kUPSpgXUfDcbsH_zHK301H2rLk1eQY/edit#slide=id.g2494e1ca6ae_0_0
>>
>> The proposal provides functionality for:
>>
>>  0) extracting quads with CBD,
>>  i) dereferencing members without quads in the page itself,
>>  ii) dereferencing nodes with quads partially out of the page,
>>  iii) extracting member quads from a named graph,
>>  iv) extracting by taking hints from shape templates.
>>
>> I elaborated most on the last case, as i - iii have not triggered 
>> much controversy and were the clearest. Shape Templates now provide, 
>> I believe, a limited yet powerful set of instructions for more 
>> hierarchical entities.
>>
>> It also provides answers to the questions below:
>>
>>  1. I found a better heuristic to handle sh:or and sh:xone
>>
>>  2. Internal identifier for the members: concat(collection IRI, focus 
>> node IRI)
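
To make point 2 concrete, here is a minimal sketch of such an identifier. The function name member_id and the example IRIs are assumptions made for this illustration; only the plain concatenation comes from the mail above.

```python
def member_id(collection_iri: str, focus_node_iri: str) -> str:
    """Hypothetical helper: build an internal key for an extracted member.

    Follows the proposal literally: concat(collection IRI, focus node IRI).
    """
    return collection_iri + focus_node_iri


# The same focus node in two collections yields two distinct keys.
key_a = member_id("https://example.org/collectionA", "https://example.org/m1")
key_b = member_id("https://example.org/collectionB", "https://example.org/m1")
```

Because the key includes the collection IRI, a client that follows several collections can store members that share a focus node without collisions. Whether a separator between the two IRIs is needed is not addressed here.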
>>
>> I have not yet started working on proposals for state bookmarks for 
>> the purpose of resuming.
>>
>> I have not yet adapted the pull request; I first want to get an ACK 
>> during the meeting!
>>
>> Kind regards,
>>
>> Pieter
>>
>> On 22/08/2023 16:18, Pieter Colpaert wrote:
>>>
>>> Hi all,
>>>
>>> We’re still working on the member extraction algorithm. On Wednesday 
>>> the 30th of August we’re continuing the conversation.
>>>
>>> *Would it be possible to hold this one at 14:00 instead of 15:00?* 
>>> Please send me a note if this doesn’t work for you, and then we 
>>> will leave it at 15:00.
>>>
>>> Train of thought for the member extraction algorithm during the 
>>> previous meeting:
>>>
>>>  1. Include the triples in the named graph whose name equals the 
>>> tree:member object
>>>
>>>  2. Use CBD on the tree:member object (star shape + recursive blank 
>>> nodes)
>>>
>>>  3. Somehow use the SHACL shape to go deeper than just the 
>>> tree:member object.
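
Steps 1 and 2 above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code in extract-cbd-shape: the Quad tuple, the "_:" prefix convention for blank nodes, the empty string for the default graph, and the function names are all invented for this example. Step 3 (using the SHACL shape) is left out.

```python
from typing import List, NamedTuple


class Quad(NamedTuple):
    subject: str
    predicate: str
    object: str
    graph: str = ""  # assumption: "" stands for the default graph


def is_blank(term: str) -> bool:
    # Assumption: blank node labels are written with a "_:" prefix.
    return term.startswith("_:")


def extract_member(quads: List[Quad], member: str) -> List[Quad]:
    picked: List[Quad] = []

    def add(quad: Quad) -> None:
        if quad not in picked:
            picked.append(quad)

    # Step 1: every quad in the named graph whose name is the member IRI.
    for quad in quads:
        if quad.graph == member:
            add(quad)

    # Step 2: CBD, i.e. all quads with the member as subject (star shape),
    # followed recursively through blank node objects only.
    queue, visited = [member], set()
    while queue:
        node = queue.pop()
        if node in visited:
            continue
        visited.add(node)
        for quad in quads:
            if quad.subject == node:
                add(quad)
                if is_blank(quad.object):
                    queue.append(quad.object)
    return picked


page = [
    Quad("ex:c", "tree:member", "ex:m1"),
    Quad("ex:m1", "ex:name", '"Alice"'),
    Quad("ex:m1", "ex:address", "_:b0"),
    Quad("_:b0", "ex:city", '"Ghent"'),
    Quad("ex:m1", "ex:knows", "ex:m2"),
    Quad("ex:m2", "ex:name", '"Bob"'),
    Quad("ex:x", "ex:note", '"in graph"', "ex:m1"),
]
member_quads = extract_member(page, "ex:m1")
```

In this example the quads about ex:m2 are not part of the result, because ex:m2 is a named node and CBD only recurses into blank nodes. That is exactly the gap step 3 is meant to fill.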
>>>
>>> The difficulty with point 3:
>>>
>>> How to deal with SHACL conditionals: do we validate the full SHACL 
>>> conditional in order to know which of the sh:xone alternatives, for 
>>> example, is the one that applies, or do we not validate it and thus 
>>> process it as if it were an AND, potentially leading to too many 
>>> HTTP requests? The trade-off here is performance (we want to avoid 
>>> unnecessary HTTP calls) vs. ease for developers. When choosing the 
>>> latter, we can of course always document that using conditionals 
>>> with TREE collections is not recommended, but then still it would
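
The "process it as if it’s an AND" option can be illustrated with a small sketch. The dictionary encoding of a shape (keys "paths", "xone", "or") and the function name are assumptions made for this example only; this is not a SHACL library API and not the heuristic that was eventually chosen.

```python
def paths_as_and(shape: dict) -> set:
    """Collect every property path a shape mentions, without validating.

    Alternatives under "xone" and "or" are simply merged, as if the
    conditional were an AND. Every collected path may have to be followed,
    which is where the potentially unnecessary HTTP requests come from.
    """
    paths = set(shape.get("paths", []))
    for alternatives in (shape.get("xone", []), shape.get("or", [])):
        for alternative in alternatives:
            paths |= paths_as_and(alternative)
    return paths


shape = {
    "paths": ["ex:name"],
    "xone": [
        {"paths": ["ex:address"]},
        {"paths": ["ex:geometry"]},
    ],
}
to_follow = paths_as_and(shape)
```

Here three paths end up being followed, although a member that satisfies the sh:xone matches exactly one alternative. Validating first would narrow this to two paths, at the cost of running a SHACL validator during extraction.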
>>>
>>> Further issues:
>>>
>>>  * How to create an internal identifier for the set of quads that 
>>> were extracted
>>>
>>>  * Standardizing an iterator to indicate how far you have processed 
>>> a certain tree:Collection or LDES. This is an LDES issue, but Sander 
>>> mentioned this could probably be generalized to TREE.
>>>
>>> Kind regards,
>>>
>>> Pieter
>>>
>>> -- 
>>> https://pietercolpaert.be/
>>> +32486747122

-- 
https://pietercolpaert.be/
+32486747122

Received on Wednesday, 6 September 2023 13:51:29 UTC