- From: Dave Raggett <dsr@w3.org>
- Date: Fri, 1 Nov 2019 16:36:25 +0000
- To: Amirouche Boubekki <amirouche.boubekki@gmail.com>
- Cc: W3C AIKR CG <public-aikr@w3.org>
- Message-Id: <BB3BC815-D87F-4C56-894F-2913A29DFC08@w3.org>
For my work on chunks I plan to use what I am calling contexts [1]. Each chunk cites the context it belongs to. A context is itself a chunk, allowing for chained contexts that can form trees. Contexts are useful for causal reasoning, reasoning from multiple perspectives, models of other people's beliefs, representing stories, etc.

A chunk is a set of named property values along with a chunk type and identifier. The term comes from psychology and ACT-R, a popular cognitive architecture.

[1] https://www.w3.org/Data/demos/chunks/chunks.html#contexts

> On 1 Nov 2019, at 14:45, Amirouche Boubekki <amirouche.boubekki@gmail.com> wrote:
>
> I stumbled upon an interesting problem based on my work on vnstore
> (formerly fstore): how to represent several theories made by an
> algorithm in the context of a versioned, branch-able database (like
> git).
>
> Consider for instance a gazetteer-based entity-resolution system as
> described in the following question:
>
> https://stackoverflow.com/q/52046394/140837
>
> Here is the code:
>
> input = 'new york is the big apple'.split()
>
>
> def spans(lst):
>     if len(lst) == 0:
>         yield None
>     for index in range(1, len(lst)):
>         for span in spans(lst[index:]):
>             if span is not None:
>                 yield [lst[0:index]] + span
>     yield [lst]
>
>
> knowledgebase = [
>     ['new', 'york'],
>     ['big', 'apple'],
> ]
>
> out = []
> scores = []
>
> for span in spans(input):
>     score = 0
>     for candidate in span:
>         for uid, entity in enumerate(knowledgebase):
>             if candidate == entity:
>                 score += 1
>     out.append(span)
>     scores.append(score)
>
> leaderboard = sorted(zip(out, scores), key=lambda x: x[1])
>
> for winner in leaderboard:
>     print(winner[1], ' ~ ', winner[0])
>
> The above (naive?) algorithm will guess multiple probable ways to link
> a sentence to the knowledge base.
> With a deterministic scoring
> heuristic it will filter out many alternatives, leaving for example
> the following alternatives:
>
> [['new', 'york'], ['is'], ['the'], ['big', 'apple']]
> [['new', 'york'], ['is', 'the'], ['big', 'apple']]
>
> Those are two possible ways to link the input sentence "new york is
> the big apple".
>
> What I want to show is an example where a deterministic algorithm
> cannot come up with a single result and must keep "theories" around
> downstream, eliminating zero or more theories with another algorithm
> or with knowledge acquired later.
>
> In the versioned nstore (vnstore), one can represent theories using
> branches (as in git) OR using an abstraction on top of the nstore.
> Representing theories in the vnstore will require access to the
> history and branch information, along with some data to tie together
> a set of theories that are related to a given problem. Whereas
> theories on top of the nstore will require only "some data to tie
> together a set of theories that are related to a given problem", but
> will require extra care to make sure one theory does not leak into
> another theory.
>
> Using the nstore approach will mean that there is yet another
> structure, the structure of alternative theories, on top of the
> nstore that is very similar to the vnstore. It gives more freedom but
> it also leads to a more complex system.
>
> It seems to me that the vnstore already solves the idea of
> "alternative theories": as in git, branches are alternative versions
> of a software. But it seems like re-using the vnstore abstraction for
> theories made by algorithms will lead to more complex code.
>
> What do you think? How do you handle alternative theories in your work?
>
> --
> Amirouche ~ https://hyper.dev

Dave Raggett <dsr@w3.org> http://www.w3.org/People/Raggett
W3C Data Activity Lead & W3C champion for the Web of things
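[Editorial note] The contexts idea described above can be sketched in a few lines of Python. This is purely illustrative and is not the W3C chunks implementation or syntax: a chunk is stored as a type, an identifier, and named property values, and an optional `context` property names the context chunk it belongs to, so contexts chain into trees. All names here (`add_chunk`, `context_path`) are hypothetical.

```python
# Illustrative sketch only, NOT the actual chunks library: chunks as dicts,
# with a "context" property pointing at a parent context chunk.

chunks = {}

def add_chunk(chunk_id, chunk_type, context=None, **properties):
    """Store a chunk; `context` is the id of the context chunk it cites."""
    chunks[chunk_id] = {"type": chunk_type, "context": context, **properties}

def context_path(chunk_id):
    """Walk the chain of contexts from a chunk up to the root context."""
    path = []
    current = chunks[chunk_id].get("context")
    while current is not None:
        path.append(current)
        current = chunks[current].get("context")
    return path

# A story context containing a belief held within that story.
add_chunk("story1", "context")
add_chunk("belief1", "context", context="story1")
add_chunk("fact1", "claim", context="belief1", subject="sky", color="blue")

print(context_path("fact1"))  # ['belief1', 'story1']
```

Because a context is itself a chunk, nesting contexts (a belief inside a story, as above) falls out for free, which is what makes the mechanism usable for models of other people's beliefs and for stories.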
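[Editorial note] The "theories on top of the nstore" option that Amirouche describes can likewise be sketched. This is a hypothetical illustration, not the vnstore API: every tuple is tagged with the theory it belongs to, queries filter on that tag, and a theory can be eliminated wholesale once later knowledge rules it out. The isolation concern he raises ("one theory does not leak into another") is exactly the filtering discipline this forces.

```python
# Hypothetical sketch of alternative theories layered on a tuple store.
# These names are illustrative; they are not the vnstore API.

store = set()  # tuples of (theory, subject, predicate, object)

def assert_in(theory, s, p, o):
    store.add((theory, s, p, o))

def query(theory, p):
    """All (subject, object) pairs for predicate p, visible in one theory."""
    return {(s, o) for (t, s, p_, o) in store if t == theory and p_ == p}

def retract_theory(theory):
    """Eliminate a whole theory without touching the others."""
    store.difference_update({t for t in store if t[0] == theory})

# Two competing linkings of "new york is the big apple", kept side by side.
assert_in("theory-a", "sentence1", "mention", ("new", "york"))
assert_in("theory-a", "sentence1", "mention", ("big", "apple"))
assert_in("theory-b", "sentence1", "mention", ("new",))
assert_in("theory-b", "sentence1", "mention", ("york",))

# Later knowledge eliminates theory-b wholesale.
retract_theory("theory-b")
print(query("theory-a", "mention"))
print(query("theory-b", "mention"))  # empty set: nothing leaked across
```

The branch-based alternative would move the theory tag out of the tuples and into the store's history layer, which is the trade-off the message weighs: less bookkeeping in user code, but the theory structure becomes entangled with version history.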
Received on Friday, 1 November 2019 16:36:30 UTC