
Re: New avenues for research - integrating CT, HoTT, RDF, SQL, SPARQL, Scala, ... was: The Joy of NULLs

From: Sebastian Samaruga <ssamarug@gmail.com>
Date: Sun, 1 Sep 2019 10:51:43 -0300
Message-ID: <CAOLUXBuH33vWRajfNHfdZNu_52yo7iHyRh2_qTbnUD-Wd+4hgg@mail.gmail.com>
To: Henry Story <henry.story@bblfish.net>
Cc: W3C Semantic Web IG <semantic-web@w3.org>
Henry, thanks. I really appreciate your time and patience. There seems to
be a lot to learn, so let me ask a few things first to narrow my search,
mostly because I last saw math in high school.

My "scope" is to enable databases and services to interact with each other
in an ESB-like manner; the scope of the meta model is to enable this by
means of translation between ontologies and ontology matching. Later I'll
try to find out how to build "Adapters" for whichever plugged-in backend
"Contexts" (models) will be necessary, using (I think) a model of events
for "reactive" updates.

My first and only requirement was to model all of this in a single meta
model encoding data, schema and behavior constructs inferred from the
integrated metadata of databases, backends, services, etc. Then I came to a
bare understanding of "monads", which seem to allow me to state, for
example, that what I call a "Flow" is a "Behavior" of some sort (inferred
from a series of previously aggregated layers) of which one can assert
transformations in the domain of its kind. For example, someone buying
something (Flow) aggregates to the category of Buy / Transaction (Behavior).

Something I'm interested in is how categories can "activate" types
in their actions when they are "available" (schema contexts and state).
The aim is to build a REST HATEOAS (HAL-like) integration /
uniform protocol for different backends.

My question is whether my "ontology" of layers and contexts resembles
something that categories / monads may leverage. The intention is to
aggregate from "raw" types (URIResource plain SPO statements) to augmented
(super) types in a hierarchical wrapping from more primitive types to
(super) concepts wrapping occurrences of the previous layers' concepts, as
in the buying example.

Let me try to state the "ontology" in this kind of encoding / notation
showing the "shape" of layers / contexts statements (aggregated RDF Quads /
model layers):

(C, S, P, O); Types of quad components.

Aggregation:

(Context, Occurrence, Attribute, Value);

"Roles" of quad components. When aggregating a Subject (Occurrence),
Predicate (Attribute), Object (Value). When aggregating a Predicate,
Subject (Attribute), Object (Value). When aggregating an Object, Subject
(Attribute), Predicate (Value).
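
To make the role assignment concrete, here is a minimal Python sketch of how
I read it. It is only an illustration under my own assumptions: the names
Quad, ROLES and aggregate, and the ex: identifiers, are made up for this
example and do not come from any existing implementation.

```python
from collections import defaultdict
from typing import NamedTuple


class Quad(NamedTuple):
    context: str
    subject: str
    predicate: str
    obj: str


# For each aggregated position: which quad field plays the
# Occurrence, Attribute and Value role (hypothetical table).
ROLES = {
    "subject": ("subject", "predicate", "obj"),
    "predicate": ("predicate", "subject", "obj"),
    "object": ("obj", "subject", "predicate"),
}


def aggregate(quads, position):
    """Group quads by (context, occurrence) and collect (attribute, value) pairs."""
    occ_field, attr_field, val_field = ROLES[position]
    grouped = defaultdict(list)
    for q in quads:
        key = (q.context, getattr(q, occ_field))
        grouped[key].append((getattr(q, attr_field), getattr(q, val_field)))
    return dict(grouped)


quads = [
    Quad("ex:shop", "ex:alice", "ex:buys", "ex:book"),
    Quad("ex:shop", "ex:bob", "ex:buys", "ex:pen"),
]

by_subject = aggregate(quads, "subject")
by_predicate = aggregate(quads, "predicate")
by_object = aggregate(quads, "object")
```

Aggregating on the predicate, for instance, collects every (subject, object)
pair that shares ex:buys within the same context.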

This is meant to enable some sort of "semiotic" inference by parsing the
quads in the "shape":

(Context, Sign, Concept, Object);


Meta Model Quads layers:

(URIResource, URIResource, URIResource, URIResource);

CSPO form. RESTful / HAL monad: HTTP category functors.

(OntResource, URIResource, URIResource, URIResource);

Aggregated URIResource OntResource attributes / values (recursion to
attributes / values OntResource). Aligned (matched) URIResource(s).

(Transform, OntResource, URIResource, URIResource);

(Mapping, Transform, OntResource, URIResource);

(Template, Mapping, Transform, OntResource);

(Augmentation, Template, Mapping, Transform);

Functor "applications".

(Message, Augmentation, Template, Mapping);

Functor "declarations".

(Context, Message, Augmentation, Template);

Model.

(Resource, Context, Message, Augmentation);

OntResource (aligned / matched URIResources) occurrences in reified Role in
Statement.

(Role, Resource, Context, Message);

Reified CSPO / Resource, Occurrence, Attribute, Value Resource role types
in Resource occurrence / context.

(Statement, Role, Resource, Context);

(Entity, Statement, Role, Resource);

Aggregated "subject" occurrences of Resource in Role in Statement(s).

(Class, Entity, Statement, Role);

Aggregated Entity Class occurrences type (attributes).

(Kind, Class, Entity, Statement);

Aggregated kinds / roles ("interfaces") of Class occurrences.

(Flow, Kind, Class, Entity);

Action "instance". Entity of Class performs role (Kind) of Behavior Flow.

(Behavior, Flow, Class, Kind);

Action "class". Statements: propositions, prescriptions, rules,
productions. DCI / Link Grammar. Context satisfaction (rules).

(Measure, Behavior, Flow, Class);

(Unit, Measure, Behavior, Flow);

(Dimension, Unit, Measure, Behavior);

Order.
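
Most of the shapes above look like a shift by one position: each layer puts
its own type in the context slot and pushes the previous layer's components
one place to the right. A small Python sketch of that rule follows, assuming
a strict shift (LAYERS and layer_shape are names invented for this example).
Note that the Behavior and Measure shapes as written above do not follow the
strict shift exactly, so treat this as an approximation of the pattern rather
than a definition.

```python
# Layer names in the order they first appear in the context (first) slot.
LAYERS = [
    "URIResource", "OntResource", "Transform", "Mapping", "Template",
    "Augmentation", "Message", "Context", "Resource", "Role", "Statement",
    "Entity", "Class", "Kind", "Flow", "Behavior", "Measure", "Unit",
    "Dimension",
]


def layer_shape(i):
    """Shape of layer i under the assumed shift rule.

    Indexes below 0 clamp to the base layer (URIResource).
    """
    return tuple(LAYERS[max(i - k, 0)] for k in range(4))


shapes = [layer_shape(i) for i in range(len(LAYERS))]

for shape in shapes:
    print("(" + ", ".join(shape) + ");")
```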

Best regards,
Sebastián.
http://snxama.blogspot.com


On Sun, Sep 1, 2019, 4:36 AM Henry Story <henry.story@bblfish.net> wrote:

>
>
> > On 31 Aug 2019, at 20:05, Sebastian Samaruga <ssamarug@gmail.com> wrote:
> >
> > Hi. Not being a mathematician, nor even having the logic and category
> theory background that the level of these discussions deserves (I'm just a
> developer), I dare to ask here whether a problem, ontology matching and
> systems integration, could be addressed using RDF and (my bare notion of)
> Monads. My "intuition" tells me yes.
>
> One does not need that much mathematics to understand Category Theory
> nowadays.
> An A level understanding of algebraic equations may be enough to get going.
>
> Indeed CT is now quite common at programming conferences.
> Here is a keynote by Bartosz Milewski ”The Maths Behind the Types”
> at Scala eXchange 2017
> https://skillsmatter.com/skillscasts/10179-the-maths-behind-types
> which is a very good introduction that shows how CT can allow one to
> work with programming types algebraically.
>
> One used to have to understand all of mathematics to understand CT,
> because Mathematicians described their concepts with examples taken from
> mathematics.
> Yet, the key notion of compositionality applies very well to programming,
> and so the concepts can also be re-explained in those terms.
>
> After a career programming in C++ and then discovering how much easier
> it was to do the equivalent of Template Programming in Haskell, Bartosz
> delved into CT and wrote some introductory blog posts which he
> transformed  into a book
> ”Category Theory for Programmers”
> https://twitter.com/hmemcpy/status/1160870623943561216
>
> There are also recorded lectures of courses he gave which I also found
> very helpful. https://www.youtube.com/user/DrBartosz/playlists
>
> The examples in Haskell have been translated in that book into Scala.
>
> It does help a lot to learn one of those languages, to get a practical
> feel for those concepts. There are many libraries there that use
> those concepts, and many blog posts on the subject.
> For example the cats library for Scala
>   https://typelevel.org/cats/
>
> The above shows how important CT is to contemporary programming.
>
> For those coming from the Semantic Web and interested in how
> CT maps to RDF the Thesis by Braatz that Ryan links to below is
> very helpful, and at least the first part is quite easy to follow.
> It is striking how close the key concept of a Category is to the
> notion of a graph used in RDF, and I think that needs some
> careful elaboration. Why are these concepts so close? What is
> the difference, and why is it needed?
>
> Also what needs elaboration is how one can extend the work of
> Braatz to quads, to SPARQL and indeed to the web, that is
> the linked data part of the semantic web.
>
> Ryan and the group at MIT have been working on tying CT to
> SQL Databases, and have written papers on the relation of
> RDF to CT. See
> ”Functorial Data Migration”
> https://www.sciencedirect.com/science/article/pii/S0890540112001010
>
> The paper by Steffen Staab is also very interesting as it shows
> how  RDF can be integrated into a programming language like Scala
> https://arxiv.org/abs/1902.00545
> The idea of using T-Boxes at the compiler level, and A-boxes
> as the data (object layer) seems to get the levels right.
> It addresses one of the main questions as to how the semantic
> web relates to programming, which is what got me to explore
> the whole subject myself.
>
> What we see here is slowly these parts coming together,
> the relation between programming and CT is well established,
> the relation between CT and RDF has been developed in a number
> of places, and the parallels are striking,
> the relation between Databases and CT that I am just discovering
> is having practical applications and is also a good way to get
> into the subject,
> (see http://math.mit.edu/~dspivak/CT4S.pdf )
>
> But there is still work to be done in bringing all these strands
> together in a way that is easy to understand and can have
> direct practical applications for developers.
>
> [snip content from https://github.com/snxama/scrapbook ]
>
> > > On 29 Aug 2019, at 18:47, Ryan Wisnesky <ryan@conexus.ai> wrote:
> > >
> > > Hi all,
> > >
> > > QINL is simply the name we gave to a common phenomenon in dependent
> type theories, which is that you can usefully represent sets as dependent
> types, instead of as terms.  That has both positive and negative
> implications.  It's possible that Dotty's (upcoming Scala's) type system
> may support QINL, and similarly for Dependent Haskell.  Once you represent
> sets as dependent types, you can e.g. manipulate them using Coq tactics:
> https://www.wisnesky.net/dbpl15.pdf ("Using Dependent Types and Tactics
> to Enable Semantic Optimization of Language-Integrated Queries") and
> https://homes.cs.washington.edu/~chushumo/files/cosette_pldi17.pdf
> ("Homotopy Type Theory SQL: Proving Query Rewrites with Univalent SQL
> Semantics").  Although both QINL and LINQ are most naturally described
> using category theory, conceptually they are about how collections are
> represented in type theory.
> > >
> > > In our work on the categorical query language CQL (
> http://categoricaldata.net), our notion of schema includes
> equationally-defined constraints, sufficient to encode arbitrary behavior
> as functional programs (e.g., there is a CQL schema for SK combinatory
> logic).  This is enabled by CQL's underlying categorical semantics and can
> be implemented in QINL style, although there's no need to do so.
> > >
> > > Henry alluded to the status of blank nodes in RDF, a question answered
> using the language of category theory in the Ph.D. thesis of Braatz :
> https://pdfs.semanticscholar.org/b8c8/5a3e7a04020259ec9a58c7e5563033f52844.pdf
> , presumably in a way equivalent to their intended set-theoretic
> semantics.  That thesis also contains a variety of constructions on RDF
> graphs such as "pushouts" that may or may not be known or useful to the RDF
> community, but whose analogs in other data models are known to be useful
> for data integration.  So I wanted to take this opportunity to ask around
> to see if anyone was interested in further investigating categorical
> constructions for RDF.
> >
> > >
> > > Ryan
> > >
> > >> On Aug 29, 2019, at 6:37 AM, Henry Story <henry.story@bblfish.net>
> wrote:
> > >>
> > >>
> > >>
> > >>> On 26 Aug 2019, at 16:26, Steffen Staab <staab@uni-koblenz.de>
> wrote:
> > >>>
> > >>> Dear Henry,
> > >>>
> > >>> the pointers below seem to be really useful to us.
> > >>> The work on CQL and QINL seems to be very related to our papers
> > >>>
> > >>> ISWC2019: https://arxiv.org/abs/1907.00855
> > >>> Programming 2019: https://arxiv.org/abs/1902.00545
> > >>>
> > >>> where we use ontology concepts as well as queries as types in
> > >>> programming languages.
> > >>
> > >> That last one is a very interesting article linking Scala
> > >> and SPARQL. I completely agree with the described
> > >> limitations of banana-rdf.
> > >>
> > >> This problem of how RDF and Scala fit together has been
> > >> something that has bugged me for a while. Because of the
> > >> strong presence of Functional Programmers in the Scala
> > >> community I have been led to look at Category Theory
> > >> to look for an answer.
> > >>
> > >> Your work is also very enlightening. I feel we are at the
> > >> cusp of an interesting answer here.
> > >>
> > >>>
> > >>> QINL seems to go one step in this direction
> > >>> taking schemata (not so different from ontology concepts / ER
> Entities)
> > >>> and extending them with behavior.
> > >>>
> > >>> Still, I do not quite understand where the two approaches should
> meet.
> > >>> Any idea?
> > >>
> > >> That is a very good question. I do get the feeling that by answering
> > >> this question we can make some very good progress. Perhaps
> > >> Ryan Wisnesky, can point to an answer here. I will try, but
> > >> it may take me some time to integrate both sides :-)
> > >>
> > >> Ryan pointed me to an article from 2001 ”A Model Theory for Generic
> > >> Schema Management” [1] that is actually an application of Institution
> Theory (IT)
> > >> to Schema management with some very simple Java examples, that make
> > >> IT accessible. The advantage of looking towards IT - the logic of the
> structure
> > >> of all logics [2] - is that it can help one integrate many different
> points of views.
> > >> The advantage of moving up the abstraction layer, is that some
> questions
> > >> that within a domain seem arbitrary - eg the status of blank nodes in
> > >> RDF - can be answered at the higher level by showing how it ties in
> > >> to many other areas of mathematics and engineering in a structured
> > >> way - eg. blank nodes appear as NULLs in a coherent formalization of
> > >> database theory. In mathematics one can ground a problem by
> > >> moving up the abstraction layers, it seems.
> > >>
> > >> Henry
> > >>
> > >> [1]
> http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.481.7519&rep=rep1&type=pdf
> > >> [2] https://www.iep.utm.edu/insti-th/
> > >> [3] a page with many links to Categorical Query Language
> > >>   https://www.categoricaldata.net/papers
> > >>
> > >> PS. Sorry for taking so long to answer. 1. It is taking time to
> integrate
> > >> all these papers, and 2. I keep having to do Scala programming tests
> for
> > >> job interviews to prove I can code!
> > >> If anyone has a need for a Scala dev who understands RDF and
> > >> some CT, please let me know :-)
> > >>
> > >>>
> > >>> Cheers
> > >>> Steffen
> > >>>
> > >>>
> > >>>> Am 25.08.2019 um 07:19 schrieb Henry Story <henry.story@bblfish.net
> >:
> > >>>>
> > >>>> Continuing this thread that started with the funny story on the
> NULL
> > >>>> vanity licence plate reported here:
> > >>>>
> https://mashable.com/article/dmv-vanity-license-plate-def-con-backfire/
> > >>>>
> > >>>> I just came across work by Ryan Wisnesky on Algebraic Databases,
> where
> > >>>> the authors formalize DBs in terms of Category Theory, in order to
> build provably
> > >>>> correct ways to transform data.
> > >>>>
> > >>>> In that formalization, for which they have software tools, they
> give a clear
> > >>>> explanation of NULLs in SQL databases that make each
> > >>>> NULL different.  In the talk linked to below Ryan Wisnesky actually
> gives them
> > >>>> different  subscripts.
> > >>>>
> > >>>> In that way nulls  in DBs are very different from nulls in
> > >>>> Java - which can be compared for equality  and for which there
> exists only one
> > >>>> instance -  and very similar to blank nodes on the semantic web.
> > >>>>
> > >>>> See the presentation ”Algebraic Databases” on his web site
> > >>>>    https://www.wisnesky.net/
> > >>>> Or other content I found on this work
> > >>>>    https://twitter.com/bblfish/status/1165195822625153024
> > >>>>
> > >>>> Henry Story
> > >>>>
> > >>>>
> > >>>>> On 13 Aug 2019, at 15:53, Daniel Hernandez <daniel@degu.cl> wrote:
> > >>>>>
> > >>>>> SQL nulls are similar in some aspects to Codd nulls. A difference
> is that SQL nulls do not guarantee that the value exists. Blank
> nodes, on the other hand, are similar to marked nulls. We study the
> application to SPARQL of SQL techniques to approximate certain answers in:
> "Certain Answers for SPARQL with Blank Nodes." However, we found only one
> dataset using blank nodes as unknown values (Wikidata). I am curious if you
> know another.
> > >>>>>
> > >>>>> On Tue, Aug 13, 2019 at 3:53 AM, Franconi Enrico <
> franconi@inf.unibz.it> wrote:
> > >>>>>> The situation is slightly more complex than that.
> > >>>>>> NULL values in standard SQL are exactly defined as letting any
> equality involving a NULL value fail.
> > >>>>>> Note that the string 'NULL' represents a NULL value, namely if
> you type the string NULL into a cell of type STRING then it is understood
> to be a NULL value.
> > >>>>>> This is where the implementors failed: a NULL value is never
> equal to itself.
> > >>>>>> This can be understood with the following standard SQL example
> (try it!).
> > >>>>>>
> > >>>>>> With the database:
> > >>>>>>
> > >>>>>> TABLE: col1 | col2
> > >>>>>>      -----+-----
> > >>>>>>        a  |  b
> > >>>>>>        b  | NULL
> > >>>>>>
> > >>>>>> the query (meant to be the identity query, namely returning the
> input table itself):
> > >>>>>>
> > >>>>>> SELECT * FROM TABLE
> > >>>>>> WHERE TABLE.col1 = TABLE.col1 AND TABLE.col2 = TABLE.col2 ;
> > >>>>>>
> > >>>>>> gives the result:
> > >>>>>>
> > >>>>>> col1 | col2
> > >>>>>> -----+-----
> > >>>>>> a  |  b
> > >>>>>>
> > >>>>>> In SQL, the query above returns the table TABLE if and only if
> the table TABLE does not have any NULL value, otherwise it returns just the
> tuples not containing a NULL value, i.e., in this case only the first tuple
> <a,b>. Informally this is due to the fact that a SQL NULL value is never
> equal (or not equal) to anything, including itself. This is because a SQL
> NULL value represents the absence of a value.
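
The behavior described in the quoted example can be reproduced with a short
Python sketch using the standard library's sqlite3 module. This is only an
illustration: SQLite is my choice of engine, and the table name t stands in
for TABLE, which is a reserved word.

```python
import sqlite3

# In-memory database holding the two rows from the example.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (col1 TEXT, col2 TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [("a", "b"), ("b", None)],  # Python None is stored as SQL NULL
)

# The intended "identity" query: every column compared with itself.
rows = conn.execute(
    "SELECT * FROM t WHERE t.col1 = t.col1 AND t.col2 = t.col2"
).fetchall()

# Total number of rows actually stored in the table.
total = conn.execute("SELECT COUNT(*) FROM t").fetchone()[0]

print(rows)
print(total)
conn.close()
```

The table holds two rows, but the query returns only the row without a NULL,
because NULL = NULL does not evaluate to true.
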
> > >>>>>>
> > >>>>>> Note that this is where SQL NULL values are radically different
> from RDF bnodes. Indeed a bnode is EQUAL to itself but different from any
> other bnode. This is because a RDF bnode represents the existence of an
> unknown value.
> > >>>>>>
> > >>>>>> --e.
> > >>>>>>
> > >>>>>>> Il giorno 12 ago 2019, alle ore 16:41, Diogo FC Patrao <
> djogopatrao@gmail.com> ha scritto:
> > >>>>>>>
> > >>>>>>>
> > >>>>>>> Vanity license plates in USA are strings, right? Then this
> problem would only happen if NULL='NULL', which is not the case.
> > >>>>>>>
> > >>>>>>> It could be that the private company stored 'NULL' instead of
> NULL to the unassigned tickets, but that's really bad coding/design (and
> easy to fix, I guess).
> > >>>>>>>
> > >>>>>>> Or maybe the DAO wrongly translates NULL to 'NULL' at some point.
> > >>>>>>>
> > >>>>>>> Cheers
> > >>>>>>>
> > >>>>>>> dfcp
> > >>>>>>>
> > >>>>>>> --
> > >>>>>>> diogo patrão
> > >>>>>>>
> > >>>>>>>
> > >>>>>>>
> > >>>>>>>
> > >>>>>>> On Mon, Aug 12, 2019 at 11:11 AM Young,Jeff (OR) <
> jyoung@oclc.org> wrote:
> > >>>>>>> Here’s an example showing blank nodes being used to declare the
> place of birth is unknown in Wikidata:
> > >>>>>>>
> > >>>>>>>
> > >>>>>>>
> > >>>>>>> https://w.wiki/6$y
> > >>>>>>>
> > >>>>>>>
> > >>>>>>>
> > >>>>>>> In the UI, it is rendered like this:
> > >>>>>>>
> > >>>>>>>
> > >>>>>>>
> > >>>>>>> <image001.png>
> > >>>>>>>
> > >>>>>>>
> > >>>>>>>
> > >>>>>>> Jeff
> > >>>>>>>
> > >>>>>>>
> > >>>>>>>
> > >>>>>>> From: Daniel Hernandez <daniel@degu.cl>
> > >>>>>>> Date: Monday, August 12, 2019 at 9:42 AM
> > >>>>>>> To: "semantic-web@w3.org" <semantic-web@w3.org>
> > >>>>>>> Subject: [External] Re: The Joy of NULLs (not)
> > >>>>>>> Resent-From: <semantic-web@w3.org>
> > >>>>>>> Resent-Date: Monday, August 12, 2019 at 9:37 AM
> > >>>>>>>
> > >>>>>>>
> > >>>>>>>
> > >>>>>>> As Enrico pointed out, blank nodes can be used to represent unknown
> values.
> > >>>>>>> An example of this use is Wikidata. I don't know another example.
> > >>>>>>>
> > >>>>>>> --
> > >>>>>>> Daniel
> > >>>>>>>
> > >>>>>>> On Mon, 12 Aug 2019 07:36:41 +0000
> > >>>>>>> Franconi Enrico <franconi@inf.unibz.it> wrote:
> > >>>>>>>
> > >>>>>>>> Mike, this could easily happen in an RDF world if you register a
> > >>>>>>>> vanity licence plate with anything starting with "_". Indeed,
> bnodes
> > >>>>>>>> would be the right way to represent unknown but existing
> plates. --e.
> > >>>>>>>>
> > >>>>>>>> Il giorno 11 ago 2019, alle ore 23:10, Michael F Uschold
> > >>>>>>>> <uschold@gmail.com<mailto:uschold@gmail.com>> ha scritto:
> > >>>>>>>>
> > >>>>>>>>> This is hilarious. It could never happen in an RDF world! No
> value,
> > >>>>>>>>> no triple.
> > >>>>>>>>>
> > >>>>>>>>> He tried to prank the DMV. Then his vanity license plate
> backfired
> > >>>>>>>>> big time.
> > >>>>>>>>>
> https://mashable.com/article/dmv-vanity-license-plate-def-con-backfire/<
> http://flip.it/NIk7FD>
> > >>>>>>>
> > >>>>>>>
> > >>>>>>>
> > >>>>>>>
> > >>>>>>
> > >>>>
> > >>>
> > >>
> > >
> >
> >
>
>
>
>
Received on Sunday, 1 September 2019 13:53:39 UTC
