Re: Knowledge graphs from StratML from Paola Di Maio on 2020-03-16 (public-aikr@w3.org from March 2020)

From: Paola Di Maio <paola.dimaio@gmail.com>
Date: Mon, 16 Mar 2020 09:11:18 +0800
To: Paul Alagna <pjalagna@gmail.com>
Cc: W3C AIKR CG <public-aikr@w3.org>
Message-ID: <CAMXe=Sqap2=gcb-sdhF4TqqZPTdcHxQy5vFKDN=HV7-oBooKag@mail.gmail.com>
Thanks Paul,
I scrolled briefly through your long email. Before I attempt to study it,
please state why you think this is your task, what the expected outcomes
of what you propose to do are, what its benefits and purpose are, and how
it fits into the overall plan/mission of the group. Which one of your
proposed goals does this activity fit into?
(Or, if compelling, we could add an additional goal.)
Thanks!
PDM

On Mon, Mar 16, 2020 at 8:45 AM Paul Alagna <pjalagna@gmail.com> wrote:

> All;
>     My task as I see it is to create AI entries from a StratML XML report.
>     So you know, I started playing. My attack is as follows:
> 1- convert the hierarchical XML into an RDF pile.
>
> 2- convert the RDF pile into a strongly serialized fact-reaction data set
>
> 3- feed the ANN with the data set
>
> currently I'm stuck here at 1...
> 1- convert the hierarchical XML into an RDF pile:
>     I am running on the assumption that a strongly serialized
> fact-reaction data set, like one derived from an RDF graph, is the most
> concise way to feed an AI neural net. I wrote a Python program to create an
> RDF pile. Some problems there. The structure seems to be as predicted: the
> meta names are focused on the needs of the schema, and the meta items are
> related according to the schema's intention. But there is a lot of
> ambiguity amongst the element names. My first attempt at disambiguation was
> to use an address marker. The hierarchy implies an address marker (where in
> the descending parse did we pick this element up). Using that address along
> with the name disambiguates it.
>     This is part of the report:
>
> ..StrategicPlan@1..Name@2=>((Use Cases for the StratML Standard))
>
> ..StrategicPlan@1..Description@3=>((None))
>
> ..StrategicPlan@1..OtherInformation@4=>((None))
>
>         Description@3 is different from Description@10 in
> ..StrategicPlan@1..StrategicPlanCore@5..Organization@6..Description@10=>((None))
>         In order to mesh graphs or transfer graphs via a PI network the
> addresses need to remain constant, and they will change whenever fields
> are added to the XML. Some sort of GUID is needed, and I wonder if that
> consistency was the intent of the identifier element. But it's not used
> uniformly.
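
A minimal sketch of the address-marker idea described above, assuming the address is a counter incremented in document (preorder) order, which is consistent with the report excerpt but is not confirmed by Paul's program. The function name `address_trails`, the sample XML, and the `Added` element are hypothetical. The second call shows why the addresses are unstable: adding one field shifts every later address.

```python
import xml.etree.ElementTree as ET
from itertools import count


def address_trails(xml_text):
    """Return one 'trail' line per leaf element, tagging each element
    name with its position in a preorder walk (assumed numbering)."""
    root = ET.fromstring(xml_text)
    counter = count(1)
    trails = []

    def walk(elem, prefix):
        tag = elem.tag.split('}')[-1]          # drop any XML namespace
        path = f"{prefix}..{tag}@{next(counter)}"
        children = list(elem)
        if not children:                        # leaf: record its value
            text = (elem.text or "").strip() or None
            trails.append(f"{path}=>(({text}))")
        for child in children:
            walk(child, path)

    walk(root, "")
    return trails


# Hypothetical, heavily trimmed StratML-like sample.
SAMPLE = (
    "<StrategicPlan><Name>Use Cases</Name><Description/>"
    "<OtherInformation/><StrategicPlanCore><Organization>"
    "<Name>X</Name><Description/></Organization>"
    "</StrategicPlanCore></StrategicPlan>"
)

original = address_trails(SAMPLE)
# Adding a single field ahead of Description renumbers everything after it.
shifted = address_trails(SAMPLE.replace("<Description/>", "<Added/><Description/>", 1))
```

In this sketch the two `Description` leaves get different addresses, so the name plus address is unambiguous within one parse, but the same element gets a different address once the document changes.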
>
> In dissecting a StratML example, to replicate it into an RDF graph, I
> found that the chain of meta tags that appeared to have the same
> syntactical provenance did not reflect the logical relationships (peerage)
> of the elements.
> Some elements were peers and some were not. Neither the hierarchy nor the
> provenance provided enough information to discern which items were peer
> related (i.e. of the same chain) and which items were not.
> for instance:
> [StrategicPlan=>StrategicPlanCore=>Goal] appears more than 700 times.
>
> As a data purist this upsets me to no end.
>
> my discussion is as follows.
>
> key
> leads to  is tokenized as =>
>
> equality set of chains:
> given 2 chains
> 1=>2=>3 ; 1=>2=>3
> OR these 2:
> customer=>order; customer=>order
> these chains are considered equal if and only if each named node in each
> chain is [equal in every way]*1 to its counterpart on the other chain.
>
> [equal in every way]*1: given a profile of an element (attributes, values,
> formats, position in syntax, etc.), two elements are equal if they have
> the same profile. In data science we call this profile the element's
> signature.
>
> peer chain:
> given 2 chains
> 1=>2=>3=>4 ; 1=>2=>3=>5
> 4 and 5 are peers if and only if nodes 1,2,3 form an equality set of
> chains.
>
> there are exceptions
> for example:
> customer=>order=>lineItem=>part=corn;
> customer=>order=>lineItem=>part=rice;
> the parts(corn and rice) are peers if and only if
> the customer in both chains is the same customer AND
> the order in both chains is the same.
> Oddly the lineItem does not have to be the same (and would not, in the
> same order, be the same).
> In data science, elements like "lineItem" are considered constructs to
> sequence or differentiate elements under a common "key". They offer no
> further business intelligence. For instance, that corn is at lineItem 1
> offers no additional business knowledge about the order or the customer.
>
> Homonym chains:
> given 2 chains
> 1=>2=>3=>4 ; 1=>2=>3=>5
> if any of the preceding nodes (1 or 2 or 3) does not equal its counterpart
> (its equally named node) then nodes 4 and 5 reside in different chains
> confused by the homonym.
>
> there are NO exceptions
> customer=>order=>lineItem=>part=corn;
> customer=>order=>lineItem=>part=rice;
> if the customer is different then both chains are different.
> if the order is different then both chains are different.
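
The peer and homonym definitions above can be sketched as a small classifier. This is only an illustration under stated assumptions: a node's "signature" is reduced to a single string, constructs such as lineItem are flagged by hand, and the names `Node` and `classify` as well as the extra "unrelated" outcome (for trails whose names do not even match) are hypothetical additions, not part of the original discussion.

```python
from typing import NamedTuple, Optional


class Node(NamedTuple):
    name: str                  # meta name, e.g. "customer"
    signature: Optional[str]   # stand-in for the element's full profile
    construct: bool = False    # True for sequencing constructs like lineItem


def classify(chain_a, chain_b):
    """Classify the final nodes of two chains as 'peer' or 'homonym'.

    Returns 'unrelated' when the meta trails (names) differ, a case the
    definitions above do not cover.
    """
    prefix_a, prefix_b = chain_a[:-1], chain_b[:-1]
    if [n.name for n in prefix_a] != [n.name for n in prefix_b]:
        return "unrelated"
    for a, b in zip(prefix_a, prefix_b):
        if a.construct and b.construct:
            continue           # constructs need not match (the exception)
        if a.signature != b.signature:
            return "homonym"   # same names, different underlying element
    return "peer"


corn = [Node("customer", "John344"), Node("order", "12/12/2020"),
        Node("lineItem", "1", construct=True), Node("part", "corn")]
rice = [Node("customer", "John344"), Node("order", "12/12/2020"),
        Node("lineItem", "2", construct=True), Node("part", "rice")]
other = [Node("customer", "Bill"), Node("order", "12/12/2020"),
         Node("lineItem", "1", construct=True), Node("part", "rice")]

same_order = classify(corn, rice)       # same customer and order
different_customer = classify(corn, other)
```

Note that the classifier can only work because each node carries a signature; with the names alone, all three chains above look identical.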
>
>
> meta-blocks:
> Fragments of chains can be peers or independent.
> Take for example the fragment [lineItem=>part].
> Under an equivalent chain fragment [customer=>order], the succeeding
> fragments (in accordance with the syntax) are peers.
>
> So I conclude that,
> given 2 sequences by name alone (meta trails), one cannot differentiate
> peer chains from homonym chains. Because it is the meta trails alone that
> provide the syntax [the grammatical format], that format (in this case
> XML) has to provide a means to disambiguate peer chains from homonym
> chains. I will further state that these are business decisions to be made.
>
> My solution is to have, in all cases, the element's KRI* follow the
> element's meta name during parsing*2.
> customer-John344-order-12/12/2020-*-lineItem-1-part-part1=corn;
> customer-Bill-order-12/12/2020-*-lineItem-1-part-part10=rice;
>
> the "-*-" signifies that element lineItem is a construct.
>
> parsing*2 - this could be accomplished in several ways:
> 1) adding an attribute to the element <customer KRI='John344'> OR
> 2) following the element name with its KRI
> <customer>
>     <KRI>John344</KRI>
>     <order><KRI>12/12/2020</KRI>
>
> KRI*: the Knowledge Reference Identifier should be a unique identifier
> that points to a profile of this node.
> The data purist in me has always thought that the attributes of an element
> ARE its signature. So I prefer solution 1.
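
A rough sketch of solution 1, assuming each element carries its identifier in a `KRI` attribute as proposed above. How a construct would be marked in the XML is not stated in the message, so the `construct='true'` attribute used here is purely hypothetical, as are the function name `kri_trails` and the `?` placeholder for a missing KRI.

```python
import xml.etree.ElementTree as ET


def kri_trails(xml_text):
    """Build 'name-KRI-name-KRI...=value' trails from KRI attributes."""
    root = ET.fromstring(xml_text)
    trails = []

    def walk(elem, parts):
        # Hypothetical marker: construct='true' yields the "-*-" separator.
        marker = ["*"] if elem.get("construct") == "true" else []
        parts = parts + marker + [elem.tag, elem.get("KRI", "?")]
        children = list(elem)
        if not children:
            trails.append("-".join(parts) + "=" + (elem.text or "").strip())
        for child in children:
            walk(child, parts)

    walk(root, [])
    return trails


ORDER = (
    "<customer KRI='John344'><order KRI='12/12/2020'>"
    "<lineItem KRI='1' construct='true'>"
    "<part KRI='part1'>corn</part></lineItem>"
    "</order></customer>"
)

trails = kri_trails(ORDER)
```

With this input the sketch reproduces the first trail shown earlier, customer-John344-order-12/12/2020-*-lineItem-1-part-part1=corn.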
>
>         Somewhere is a <profile ID='John344'> … that defines this element
> like no other.
>         What I believe that means is that goal[1] is not enough. It would
> be enough for a single RDF graph but not for a transfer or meshing of
> graphs.
>
>                 If we added actions to RDF like “equate” then we can
> combine elements from one graph to another “name@3” equates to
> “organization@9000”.
>         One could then create a super language (Or as Carl Mattocks calls
> it an Owl-light like language) for meshing or transfer.  (where StratML
> names become the super language)
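
One way to picture the "equate" action, as a sketch only: graphs are modelled here as plain sets of (subject, predicate, object) tuples rather than with an RDF library, and the function name `mesh`, the predicates, and the sample values are all made up for illustration.

```python
def mesh(graph_a, graph_b, equate):
    """Merge two triple sets, renaming graph_b nodes via the equate map.

    equate maps a node in graph_b to the node in graph_a it stands for.
    """
    def rename(term):
        return equate.get(term, term)

    merged = set(graph_a)
    for subject, predicate, obj in graph_b:
        merged.add((rename(subject), predicate, rename(obj)))
    return merged


# Illustrative data only; neither graph comes from a real StratML report.
graph_a = {("name@3", "hasValue", "ACME")}
graph_b = {("organization@9000", "locatedIn", "Boston")}

combined = mesh(graph_a, graph_b, {"organization@9000": "name@3"})
```

After the merge, both facts hang off the single node name@3, which is the effect the "equate" action is meant to achieve when meshing graphs.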
>
> So - I need help here, guys. Is this an XML question with an XML solution?
> I think an XML solution alone would not be complete, though. The KRI and
> the signature profile need a business solution (i.e. StratML extensions)
> and perhaps even AIKR extensions.
>
> thoughts?
>
> Paul Alagna
>
>
>
Received on Monday, 16 March 2020 01:12:09 UTC