- From: Dave Raggett <dsr@w3.org>
- Date: Fri, 1 Nov 2019 16:36:25 +0000
- To: Amirouche Boubekki <amirouche.boubekki@gmail.com>
- Cc: W3C AIKR CG <public-aikr@w3.org>
- Message-Id: <BB3BC815-D87F-4C56-894F-2913A29DFC08@w3.org>
For my work on chunks I plan to use what I am calling contexts [1]. Each chunk cites the context it belongs to. A context is itself a chunk, allowing for chained contexts that can form trees. Contexts are useful for causal reasoning, reasoning from multiple perspectives, models of other people's beliefs, representing stories, etc.

A chunk is a set of named property values along with a chunk type and identifier. The term comes from psychology and ACT-R, a popular cognitive architecture.

[1] https://www.w3.org/Data/demos/chunks/chunks.html#contexts

> On 1 Nov 2019, at 14:45, Amirouche Boubekki <amirouche.boubekki@gmail.com> wrote:
>
> I stumbled upon an interesting problem based on my work on vnstore
> (formerly fstore): how to represent several theories made by an
> algorithm in the context of a versioned, branch-able database (like
> git).
>
> Consider for instance a gazetteer-based entity-resolution system as
> described in the following question:
>
> https://stackoverflow.com/q/52046394/140837
>
> Here is the code:
>
> input = 'new york is the big apple'.split()
>
>
> def spans(lst):
>     if len(lst) == 0:
>         yield None
>     for index in range(1, len(lst)):
>         for span in spans(lst[index:]):
>             if span is not None:
>                 yield [lst[0:index]] + span
>     yield [lst]
>
>
> knowledgebase = [
>     ['new', 'york'],
>     ['big', 'apple'],
> ]
>
> out = []
> scores = []
>
> for span in spans(input):
>     score = 0
>     for candidate in span:
>         for uid, entity in enumerate(knowledgebase):
>             if candidate == entity:
>                 score += 1
>     out.append(span)
>     scores.append(score)
>
> leaderboard = sorted(zip(out, scores), key=lambda x: x[1])
>
> for winner in leaderboard:
>     print(winner[1], ' ~ ', winner[0])
>
> The above (naive?) algorithm will guess multiple probable ways to link
> a sentence to the knowledge base.
> With a deterministic scoring
> heuristic it will filter out many alternatives, leaving for example
> the following alternatives:
>
> [['new', 'york'], ['is'], ['the'], ['big', 'apple']]
> [['new', 'york'], ['is', 'the'], ['big', 'apple']]
>
> Those are two possible ways to link the input sentence "new york is
> the big apple".
>
> What I want to show is an example where a deterministic algorithm
> cannot come up with a single result and must keep "theories" around
> downstream, eliminating zero or more theories with another algorithm
> or with knowledge acquired later.
>
> In the versioned nstore (vnstore), one can represent theories using
> branches (as in git) OR using an abstraction on top of the nstore.
> Representing theories in the vnstore will require access to the
> history and branch information, along with some data to tie together
> a set of theories that are related to a given problem. Whereas
> theories on top of the nstore will require only "some data to tie
> together a set of theories that are related to a given problem", but
> will require extra care to make sure one theory does not leak into
> another theory.
>
> Using the nstore approach will mean that there is yet another
> structure, the structure of alternative theories, on top of the
> nstore that is very similar to the vnstore. It gives more freedom but
> it also leads to a more complex system.
>
> It seems to me that the vnstore already solves the idea of
> "alternative theories": as in git, branches are alternative versions
> of a software. But it seems like re-using the vnstore abstraction for
> theories made by algorithms will lead to more complex code.
>
> What do you think? How do you handle alternative theories in your work?
>
> --
> Amirouche ~ https://hyper.dev

Dave Raggett <dsr@w3.org> http://www.w3.org/People/Raggett
W3C Data Activity Lead & W3C champion for the Web of things
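[Editorial note] The contexts idea described above can be sketched in a few lines of Python. This is purely illustrative and is not the W3C chunks implementation or syntax: a chunk is stored as a type, an identifier, and named property values, and an optional `context` property names the context chunk it belongs to, so contexts chain into trees. All names here (`add_chunk`, `context_path`) are hypothetical.

```python
# Illustrative sketch only, NOT the actual chunks library: chunks as dicts,
# with a "context" property pointing at a parent context chunk.

chunks = {}

def add_chunk(chunk_id, chunk_type, context=None, **properties):
    """Store a chunk; `context` is the id of the context chunk it cites."""
    chunks[chunk_id] = {"type": chunk_type, "context": context, **properties}

def context_path(chunk_id):
    """Walk the chain of contexts from a chunk up to the root context."""
    path = []
    current = chunks[chunk_id].get("context")
    while current is not None:
        path.append(current)
        current = chunks[current].get("context")
    return path

# A story context containing a belief held within that story.
add_chunk("story1", "context")
add_chunk("belief1", "context", context="story1")
add_chunk("fact1", "claim", context="belief1", subject="sky", color="blue")

print(context_path("fact1"))  # ['belief1', 'story1']
```

Because a context is itself a chunk, nesting contexts (a belief inside a story, as above) falls out for free, which is what makes the mechanism usable for models of other people's beliefs and for stories.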
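[Editorial note] The "theories on top of the nstore" option that Amirouche describes can likewise be sketched. This is a hypothetical illustration, not the vnstore API: every tuple is tagged with the theory it belongs to, queries filter on that tag, and a theory can be eliminated wholesale once later knowledge rules it out. The isolation concern he raises ("one theory does not leak into another") is exactly the filtering discipline this forces.

```python
# Hypothetical sketch of alternative theories layered on a tuple store.
# These names are illustrative; they are not the vnstore API.

store = set()  # tuples of (theory, subject, predicate, object)

def assert_in(theory, s, p, o):
    store.add((theory, s, p, o))

def query(theory, p):
    """All (subject, object) pairs for predicate p, visible in one theory."""
    return {(s, o) for (t, s, p_, o) in store if t == theory and p_ == p}

def retract_theory(theory):
    """Eliminate a whole theory without touching the others."""
    store.difference_update({t for t in store if t[0] == theory})

# Two competing linkings of "new york is the big apple", kept side by side.
assert_in("theory-a", "sentence1", "mention", ("new", "york"))
assert_in("theory-a", "sentence1", "mention", ("big", "apple"))
assert_in("theory-b", "sentence1", "mention", ("new",))
assert_in("theory-b", "sentence1", "mention", ("york",))

# Later knowledge eliminates theory-b wholesale.
retract_theory("theory-b")
print(query("theory-a", "mention"))
print(query("theory-b", "mention"))  # empty set: nothing leaked across
```

The branch-based alternative would move the theory tag out of the tuples and into the store's history layer, which is the trade-off the message weighs: less bookkeeping in user code, but the theory structure becomes entangled with version history.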
Received on Friday, 1 November 2019 16:36:30 UTC