Re: How to represent theories?

Amirouche and all,

Thanks for sharing.

From a knowledge/data modelling viewpoint, the essential KR aspect of the
theory should state (isolate) the key elements/factors of the theory
itself (think of an entity-relationship diagram as an example).
I have not had the time to study your problem/example in detail, but it
would be good to use your example as an exercise that we could do and
document.
If you talk about abstraction, then ny should be represented by a
class/category [city] and big apple as [nickname]. This abstraction is
essential for me, and I actually do not see it in your code, unless the
variables in your example are meant as examples of parameters passed into
a structure that is not visible in your example.
Also, I would separate the code (the syntactic structures and their naming
conventions, such as, I guess, span, index etc.) from the
content/knowledge of the entity itself and of the process/reasoning
relating to the entity (procedural vs declarative).
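
A minimal sketch of that separation, assuming Python as in the quoted
example. The names City, Nickname, link and resolve are hypothetical and
do not come from the original code; they only illustrate keeping the
declarative knowledge ([city], [nickname]) apart from the procedural
matching step.

```python
from dataclasses import dataclass

# Declarative part: what is known, independent of any matching code.
# City and Nickname are illustrative assumptions, not part of the
# original example.


@dataclass(frozen=True)
class City:
    name: tuple  # tokens of the canonical name, e.g. ('new', 'york')


@dataclass(frozen=True)
class Nickname:
    name: tuple  # tokens of the nickname, e.g. ('big', 'apple')
    refers_to: City


NEW_YORK = City(('new', 'york'))
BIG_APPLE = Nickname(('big', 'apple'), refers_to=NEW_YORK)
KNOWLEDGE = [NEW_YORK, BIG_APPLE]


# Procedural part: how a chunk of tokens is linked to an entity.
def link(chunk, knowledge=KNOWLEDGE):
    """Return the entity whose name equals the chunk, or None."""
    for entity in knowledge:
        if entity.name == tuple(chunk):
            return entity
    return None


def resolve(entity):
    """Follow a nickname to the city it refers to."""
    return entity.refers_to if isinstance(entity, Nickname) else entity


# One of the segmentations produced by the quoted algorithm.
segmentation = [['new', 'york'], ['is', 'the'], ['big', 'apple']]
linked = [link(chunk) for chunk in segmentation]
```

Here the statement "big apple is a nickname of new york" lives in the
data, so the matching code can change without touching it.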

When it comes to the leaderboard, the latter part of your example, I don't
understand how it relates to the declarative part of "ny is big apple".
But maybe that is because I have not looked up your references.

That's the kind of KR I concern myself with.

Not sure if this is what you are looking for. If it is useful, let's use
your example as a use case.

Paola DM

On Fri, Nov 1, 2019 at 10:47 PM Amirouche Boubekki
<amirouche.boubekki@gmail.com> wrote:

> I stumbled upon an interesting problem based on my work on vnstore
> (formerly fstore) that is how to represent several theories made by an
> algorithm in the context of a versioned branch-able database (like
> git).
>
> Consider for instance a gazetteer based entity-resolution system as
> described in the following question:
>
> https://stackoverflow.com/q/52046394/140837
>
> Here is the code:
>
> tokens = 'new york is the big apple'.split()
>
>
> def spans(lst):
>     """Yield every way to split lst into contiguous chunks."""
>     if len(lst) == 0:
>         yield None
>     for index in range(1, len(lst)):
>         # Fix the first chunk, then segment the rest recursively.
>         for span in spans(lst[index:]):
>             if span is not None:
>                 yield [lst[0:index]] + span
>     # The whole list as a single chunk.
>     yield [lst]
>
> knowledgebase = [
>     ['new', 'york'],
>     ['big', 'apple'],
> ]
>
> out = []
> scores = []
>
> for span in spans(tokens):
>     # Score a segmentation by how many chunks match a known entity.
>     score = 0
>     for candidate in span:
>         for entity in knowledgebase:
>             if candidate == entity:
>                 score += 1
>     out.append(span)
>     scores.append(score)
>
> # Ascending sort: the best-scoring segmentations are printed last.
> leaderboard = sorted(zip(out, scores), key=lambda x: x[1])
>
> for winner in leaderboard:
>     print(winner[1], ' ~ ', winner[0])
>
> The above (naive?) algorithm will guess multiple probable ways to link
> a sentence to the knowledge base. With a deterministic scoring heuristic
> it will filter out many alternatives but keep, for example, the
> following ones:
>
>   [['new', 'york'], ['is'], ['the'], ['big', 'apple']]
>   [['new', 'york'], ['is', 'the'], ['big', 'apple']]
>
> Those are two possible ways to link the input sentence "new york is the
> big apple".
>
> What I want to show is an example where a deterministic algorithm
> cannot come up with a single result and must keep "theories" around
> downstream, then eliminate zero or more theories with another algorithm
> or with knowledge acquired later.
>
> In the versioned nstore (vnstore), one can represent theories using
> branches (as in git) OR using an abstraction on top of the nstore.
> Representing theories in the vnstore will require access to the history
> and branch information along with some data to tie together a set of
> theories that are related to a given problem. In contrast, theories on
> top of the nstore will require only "some data to tie together a set of
> theories that are related to a given problem" but will require extra
> care to make sure one theory does not leak into another theory.
>
> Using the nstore approach will mean that there is yet another
> structure, the structure of alternative theories, on top of the nstore
> that is very similar to the vnstore. It gives more freedom but it also
> leads to a more complex system.
>
> It seems to me that the vnstore already addresses the idea of
> "alternative theories", since, as in git, branches are alternative
> versions of a piece of software, but it seems that re-using the vnstore
> abstraction for theories made by algorithms will lead to more complex
> code.
>
> What do you think? How do you handle alternative theories in your work?
>
>
> --
> Amirouche ~ https://hyper.dev
>
>

Received on Saturday, 2 November 2019 02:19:11 UTC