KRID from Paul Alagna on 2020-05-23 (public-aikr@w3.org from May 2020)

From: Paul Alagna <pjalagna@gmail.com>
Date: Sat, 23 May 2020 16:17:29 -0400
To: W3C AIKR CG <public-aikr@w3.org>
Message-Id: <76DBE2AA-8575-41B9-9A35-CB7A6E6CAD9D@gmail.com>

KRID
A Knowledge Resource IDentifier (KRID) is the unique name for a particular instance of discourse.
It thus becomes a reference point for any discussion of that instance. If this KRID is stable, then it becomes THE reference point for ALL discussions of that instance.

In an XML report, knowledge is gathered into a repository (roughly a storage array). It holds information like:
- Operational information: its position in the report.
- Metadata: tag name, attributes and their values, the value after the tag, peer resolution, etc.
All of this information needs to be collected under an instance identifier. The KRID developed for an XML report is by nature unstable: it exists only for the moment of that XML.

BUT we can find a stable KRID IF we add more information to our repository by including the XSD’s knowledge, such as a definition identifier and its text. It is this definition identifier, used as a KRID, that is stable across ALL XML reports in this XSD’s format.
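
As a rough illustration of the idea above, here is a minimal Python sketch of such a repository. It assumes the XSD knowledge has already been reduced to a lookup table; the table XSD_DEFINITIONS, its identifiers, the record field names and the sample reports are all hypothetical, not part of any AIKR standard.

```python
import xml.etree.ElementTree as ET

# Hypothetical XSD-derived table: tag name -> (definition identifier, definition text).
# In practice this would be extracted from the XSD; these identifiers are made up.
XSD_DEFINITIONS = {
    "goal": ("def-goal-001", "A stated objective of the plan"),
    "date": ("def-date-001", "Publication date of the report"),
}

def build_repository(xml_text):
    """Collect one record per element, keyed by its position (the unstable KRID)."""
    root = ET.fromstring(xml_text)
    repository = {}

    def walk(element, path):
        definition_id, definition_text = XSD_DEFINITIONS.get(element.tag, (None, None))
        repository[path] = {
            "tag": element.tag,
            "attributes": dict(element.attrib),
            "value": (element.text or "").strip(),
            "stable_krid": definition_id,   # same for every report in this XSD format
            "definition": definition_text,
        }
        for index, child in enumerate(element):
            walk(child, f"{path}/{child.tag}[{index}]")

    walk(root, f"/{root.tag}[0]")
    return repository

# Two reports in the same (assumed) format, with the elements in a different order.
report_a = "<report><goal owner='x'>Reduce cost</goal><date>2020-05-23</date></report>"
report_b = "<report><date>2020-06-01</date><goal>Grow</goal></report>"
repo_a = build_repository(report_a)
repo_b = build_repository(report_b)
```

The positional key of the goal element differs between the two reports, while its stable_krid is the same in both, which is the property the definition identifier is meant to provide.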

One of the goals of the AIKR group is to:
Discover, name and define the usage of XSD information items that can be added to XML repositories. In AI (machine learning or neural networks) this KRID names the input/information feed CONSISTENTLY for ALL XML reports in this XSD format.

Other information from the XSD can also be applied to the XML report, such as:
- a format identifier and its text,
- the provenance (history or lineage) of this “Tag”,
- the parsing workflow.
Format: is this date month/day/year or year/month/day? Is this 24-hour time or AM/PM?
Provenance: that a “Goal” has a string format is useful, but answering “Whose goal, under WHAT circumstances?” is important too (see also below for more). NOTE: To provide provenance, AIKR processes MUST follow the rules laid out by our standards.
The parsing workflow: with the advent of “wizards”, validation has been “worked out” (well, until the next version of the XSD comes out). “Worked out” also means that validation is reduced to a developer's afterthought. But with KRID-based repositories we can perform extra-session analysis across XML reports, like:
Rooting: reducing words to their roots by removing the prefix and suffix. “Precasts” becomes “cast”. Capitalization is also reduced to lower case. This produces a set of search contenders.
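
Rooting as described above could be sketched roughly as follows. This is a naive affix stripper, not a real stemmer; the PREFIXES and SUFFIXES lists and the min_root guard are assumptions chosen only to make the “Precasts” example work.

```python
# Hypothetical affix lists; a real system would use a fuller, language-specific set.
PREFIXES = ("pre", "un", "re")
SUFFIXES = ("ing", "ed", "s")

def root_word(word, min_root=3):
    """Lower-case a word, then strip at most one known suffix and one known prefix."""
    root = word.lower()
    for suffix in SUFFIXES:
        # Only strip if enough letters remain to still be a plausible root.
        if root.endswith(suffix) and len(root) - len(suffix) >= min_root:
            root = root[: -len(suffix)]
            break
    for prefix in PREFIXES:
        if root.startswith(prefix) and len(root) - len(prefix) >= min_root:
            root = root[len(prefix):]
            break
    return root

def search_contenders(words):
    """Reduce a collection of words to the set of their roots."""
    return {root_word(word) for word in words}
```

With these lists, root_word("Precasts") gives "cast", and "Precasts", "Casting" and "Recast" all collapse to the single search contender "cast".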

Acronymization:
Coring: cores like the Dublin Core have definitions for known phrases like:
“Author of” or
“Date of publication”
Industrial cores: all industries have their jargon (a tennis court is very different from a municipal court; “playing ball” means different things).
Categorization: by using different cores for acronymization, industrial categories can be registered.

Patterns: phone numbers have a unique pattern, as do zip codes, URIs and filenames. All of these add to our knowledge.
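
A pattern check of this kind might look like the sketch below. The regular expressions are illustrative assumptions only (US-style phone numbers and ZIP codes, a loose URI shape, simple filenames with an extension), and classify_value is a hypothetical helper name.

```python
import re

# Illustrative patterns only; each one is a simplifying assumption.
PATTERNS = {
    "phone": re.compile(r"\(?\d{3}\)?[-. ]?\d{3}[-. ]?\d{4}"),
    "zip": re.compile(r"\d{5}(?:-\d{4})?"),
    "uri": re.compile(r"[A-Za-z][A-Za-z0-9+.-]*://\S+"),
    "filename": re.compile(r"[\w-]+\.[A-Za-z0-9]{1,5}"),
}

def classify_value(value):
    """Return the name of the first pattern matching the whole value, or None."""
    text = value.strip()
    for name, pattern in PATTERNS.items():
        if pattern.fullmatch(text):
            return name
    return None
```

The label returned for a value can then be stored in the repository next to the value itself, as one more piece of knowledge about it.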

Framing and frame completion:
Framing is simply the discovery of the largest domain under a reduced tag.
A “reduced tag” is an endpoint in a provenance chain.
This point can have several value choices, the collection of which is called its domain.
From one XML report to the next, the largest set can be collected. Thereafter this set can be offered to the XML report creator as a possible set of options, or as an option for expansion or improvement.
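
Collecting the largest set across reports could be sketched as below. As a simplifying assumption, the plain tag name of a leaf element stands in for the “reduced tag”; collect_domains and suggest_options are hypothetical names.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

def collect_domains(xml_reports):
    """Union the values seen under each leaf tag across several XML reports."""
    domains = defaultdict(set)
    for xml_text in xml_reports:
        for element in ET.fromstring(xml_text).iter():
            value = (element.text or "").strip()
            if value and len(element) == 0:  # leaf elements with a value only
                domains[element.tag].add(value)
    return domains

def suggest_options(domains, tag):
    """Offer the collected domain of a tag as a sorted list of options."""
    return sorted(domains.get(tag, set()))

# Made-up sample reports, for illustration only.
reports = [
    "<r><status>open</status></r>",
    "<r><status>closed</status><status>open</status></r>",
    "<r><status>pending</status></r>",
]
domains = collect_domains(reports)
options = suggest_options(domains, "status")
```

Each new report can only grow the set, so after enough reports the collected domain approaches the largest one in use.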

Definitions:
In data science we acknowledge:
Dominion: the “table” in an SQL database is the dominion of discourse. In an object class definition, the class is the dominion.

Attribute: column names in an SQL database.

Peers: a set of attributes associated with a dominion.

Attribute;value pair: an attribute AND its value (one of its domain values).

Dominion instance or instance: a named or keyed set of peer attributes in a dominion.

Instance key: the unique name, or the set of attribute;value pairs, that uniquely identifies this instance.

Domain: for each attribute there can exist a set of values called the “domain” of that attribute.
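
To make the vocabulary concrete, here is a small sketch that maps these terms onto an in-memory SQLite table. The table name, column names and rows are invented for illustration; only the mapping of terms to SQL concepts follows the definitions above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Dominion: the table "court". Attributes: its columns. Instance key: "name".
conn.execute("CREATE TABLE court (name TEXT PRIMARY KEY, kind TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO court VALUES (?, ?, ?)",
    [
        ("Centre Court", "tennis", "London"),
        ("Night Court", "municipal", "New York"),
        ("Court 1", "tennis", "London"),
    ],
)

# Peers: the set of attributes associated with the dominion.
peers = [row[1] for row in conn.execute("PRAGMA table_info(court)")]

# Instance: one row, looked up by its instance key.
instance = conn.execute(
    "SELECT * FROM court WHERE name = ?", ("Night Court",)
).fetchone()

# Attribute;value pairs of that instance.
pairs = dict(zip(peers, instance))

# Domain: the set of values that exist for one attribute.
kind_domain = sorted(row[0] for row in conn.execute("SELECT DISTINCT kind FROM court"))
```

Here pairs holds the attribute;value pairs of one instance, and kind_domain is the domain of the attribute "kind" as observed in the data.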

Provenance:
https://www.w3.org/TR/prov-aq/
Provenance records for dynamic and context-dependent resources are possible through a notion of constrained resources. A constrained resource <https://www.w3.org/TR/prov-aq/#dfn-constrained-resource> is simply a resource (in the sense defined by [WEBARCH <https://www.w3.org/TR/prov-aq/#bib-WEBARCH>], section 2.2 <http://www.w3.org/TR/webarch/#id-resources>) that is a specialization or instance of some other resource. For example, a W3C specification typically undergoes several public revisions before it is finalized. A URI that refers to the "current" revision might be thought of as denoting the specification throughout its lifetime. Each individual revision would also have its own target-URI <https://www.w3.org/TR/prov-aq/#dfn-target-uri> denoting the specification at that particular stage in its development. Using these, we can make provenance assertions that a particular revision was published on a particular date, and was last modified by a particular editor.

Paul
Thoughts? Comments?

Thanks
PAUL ALAGNA
PJAlagna@Gmail.com <mailto:PJAlagna@gmail.com>
732-322-5641

Received on Saturday, 23 May 2020 20:17:47 UTC