
Typed RDF, OWL and Scala was: The Joy of NULLs (not)

From: Henry Story <henry.story@bblfish.net>
Date: Mon, 10 Feb 2020 11:12:55 +0100
Cc: Daniel Hernandez <daniel@degu.cl>, "semantic-web@w3.org" <semantic-web@w3.org>, Ryan Wisnesky <ryan@conexus.com>
Message-Id: <850D7828-FF60-49BC-8904-99AAF3FE9589@bblfish.net>
To: "Prof Dr. Steffen Staab" <staab@uni-koblenz.de>

Dear Prof Steffen Staab,

> On 26 Aug 2019, at 16:26, Steffen Staab <staab@uni-koblenz.de> wrote:
> 
> Dear Henry,
> 
> the pointers below seem to be really useful to us.
> The work on CQL and QINL seems to be very related to our papers
> 
> ISWC2019: https://arxiv.org/abs/1907.00855
> Programming 2019: https://arxiv.org/abs/1902.00545
> 
> where we use ontology concepts as well as queries as types in 
> programming languages.

Over the past few months I have been studying David Spivak and Ryan
Wisnesky’s work on functorial ontology logs (ologs). Doing that, I
came across Evan Patterson’s article “Knowledge Representation in
Bicategories of Relations” [1], which adapts that work just slightly
by moving sideways to bicategories of relations. The basic example is
a 2-category where objects are sets and arrows are relations
between those sets. It is a 2-category because there are also
arrows between the arrows (inclusions of one relation in another),
which we know as rdfs:subPropertyOf.
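
To make that concrete, here is a minimal sketch in Python, assuming
finite relations modelled as sets of (subject, object) pairs. The names
compose and is_sub_property_of and the example data are hypothetical
and are not taken from Patterson's paper. It only illustrates arrows as
relations, composition of arrows, and inclusion as the arrow between
arrows.

```python
# Sketch only: finite relations as sets of (subject, object) pairs.
# All names and data are hypothetical, chosen for illustration.

def compose(r, s):
    """Composition of arrows: x (r;s) z iff x r y and y s z for some y."""
    return frozenset((x, z) for (x, y) in r for (y2, z) in s if y == y2)

def is_sub_property_of(r, s):
    """An arrow between arrows r => s exists exactly when r is included in s."""
    return r <= s

has_mother = frozenset({("alice", "carol"), ("bob", "carol")})
has_parent = has_mother | {("alice", "dave")}
lives_in = frozenset({("carol", "paris"), ("dave", "rome")})

mother_city = compose(has_mother, lives_in)
parent_city = compose(has_parent, lives_in)

print(is_sub_property_of(has_mother, has_parent))
print(is_sub_property_of(mother_city, parent_city))
```

Both checks print True: has_mother is included in has_parent, and
composing each with lives_in preserves the inclusion, which is the
kind of inference one expects from a sub-property declaration.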

I mention it now in this old thread, because it results in a
typed RDF which also covers OWL inferencing, and so it should
actually fit very nicely with your work on integrating RDF into
a typed language such as Scala.

I’ll see if I can develop this in the issue on typed RDF that
I opened recently:
  https://gitlab.com/web-cats/CG/issues/9

Given that it is also very close to the ologs that Ryan is working
on, it should also be possible to see how it ties in with query
languages such as CQL, and/or to adapt that work to bicategories of
relations, which are closer to the relational nature of RDF and
so would be more intuitive for this community.

Of note is that the structure of RDF - the Grothendieck construction 
of a database instance - plays a key role in defining queries in the
olog view of databases.
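
As a rough illustration, here is a sketch in Python of the Grothendieck
construction (category of elements) for a tiny database instance. The
schema, the instance data, the function name elements, and the choice
to render each element's table as an rdf:type triple are all my
assumptions for the example, not something taken from the papers.

```python
# Sketch only: a schema is a set of nodes with named arrows between them,
# and an instance assigns a set of rows to each node and a function to
# each arrow. All names and data below are hypothetical.

schema_arrows = {
    "livesIn": ("Person", "City"),
    "mayor": ("City", "Person"),
}

rows = {"Person": {"p1", "p2"}, "City": {"c1"}}
functions = {
    "livesIn": {"p1": "c1", "p2": "c1"},
    "mayor": {"c1": "p2"},
}

def elements(rows, schema_arrows, functions):
    """Return the elements of the instance as a set of triples.

    Each row x of node N gives (x, "rdf:type", N); each schema arrow
    f: N -> M gives one edge (x, f, f(x)) per row x of N.
    """
    typing = {(x, "rdf:type", node) for node, xs in rows.items() for x in xs}
    edges = {
        (x, name, functions[name][x])
        for name, (source, _target) in schema_arrows.items()
        for x in rows[source]
    }
    return typing | edges

triples = elements(rows, schema_arrows, functions)
for triple in sorted(triples):
    print(triple)
```

The result is a graph of subject, predicate, object triples, which is
why the construction looks so much like RDF.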

Henry

[1] https://arxiv.org/abs/1706.00526


> 
> QINL seems to go one step in this direction
> taking schemata (not so different from ontology concepts / ER Entities) 
> and extending them with behavior.
> 
> Still, I do not quite understand where the two approaches should meet.
> Any idea?
> 
> Cheers
> Steffen
> 
> 
>> On 25 Aug 2019, at 07:19, Henry Story <henry.story@bblfish.net> wrote:
>> 
>> Continuing this thread that started with the funny story on the NULL 
>> vanity licence plate reported here:
>>    https://mashable.com/article/dmv-vanity-license-plate-def-con-backfire/
>> 
>> I just came across work by Ryan Wisnesky on Algebraic Databases, where
>> the authors formalize DBs in terms of Category Theory, in order to build provably
>> correct ways to transform data.
>> 
>> In that formalization, for which they have software tools, they give a clear
>> explanation of NULLs in SQL databases that makes each
>> NULL different. In the talk linked to below, Ryan Wisnesky actually gives them
>> different subscripts.
>> 
>> In that way nulls in DBs are very different from nulls in
>> Java - which can be compared for equality and for which there exists only one
>> instance - and very similar to blank nodes on the semantic web.
>> 
>> See the presentation “Algebraic Databases” on his web site
>>      https://www.wisnesky.net/
>> Or other content I found on this work
>>      https://twitter.com/bblfish/status/1165195822625153024
>> 
>> Henry Story
>> 
>> 
>>> On 13 Aug 2019, at 15:53, Daniel Hernandez <daniel@degu.cl> wrote:
>>> 
>>> SQL nulls are similar in some respects to Codd nulls. A difference is that SQL nulls do not guarantee that the value exists. Blank nodes, on the other hand, are similar to marked nulls. We study the application to SPARQL of SQL techniques for approximating certain answers in "Certain Answers for SPARQL with Blank Nodes." However, we found only one dataset that uses blank nodes as unknown values (Wikidata). I am curious whether you know of another.
>>> 
>>> On Tue, Aug 13, 2019 at 3:53 AM, Franconi Enrico <franconi@inf.unibz.it> wrote:
>>>> The situation is slightly more complex than that. 
>>>> NULL values in standard SQL are exactly defined as letting any equality involving a NULL value fail.
>>>> Note that the string 'NULL' represents a NULL value, namely if you type the string NULL into a cell of type STRING then it is understood to be a NULL value. 
>>>> This is where the implementors failed: a NULL value is never equal to itself.
>>>> This can be understood with the following standard SQL example (try it!).
>>>> 
>>>> With the database:
>>>> 
>>>> TABLE: col1 | col2
>>>>        -----+-----
>>>>          a  |  b
>>>>          b  | NULL
>>>> 
>>>> the query (meant to be the identity query, namely returning the input table itself):
>>>> 
>>>> SELECT * FROM TABLE 
>>>> WHERE TABLE.col1 = TABLE.col1 AND TABLE.col2 = TABLE.col2 ;
>>>> 
>>>> gives the result:
>>>> 
>>>> col1 | col2
>>>> -----+-----
>>>>   a  |  b
>>>> 
>>>> In SQL, the query above returns the table TABLE if and only if the table TABLE does not have any NULL value, otherwise it returns just the tuples not containing a NULL value, i.e., in this case only the first tuple <a,b>. Informally this is due to the fact that a SQL NULL value is never equal (or not equal) to anything, including itself. This is because a SQL NULL value represents the absence of a value.
>>>> 
>>>> Note that this is where SQL NULL values are radically different from RDF bnodes. Indeed a bnode is EQUAL to itself but different from any other bnode. This is because an RDF bnode represents the existence of an unknown value.
>>>> 
>>>> --e.
>>>> 
>>>>> On 12 Aug 2019, at 16:41, Diogo FC Patrao <djogopatrao@gmail.com> wrote:
>>>>> 
>>>>> 
>>>>> Vanity license plates in the USA are strings, right? Then this problem would only happen if NULL='NULL', which is not the case.
>>>>> 
>>>>> It could be that the private company stored 'NULL' instead of NULL for the unassigned tickets, but that's really bad coding/design (and easy to fix, I guess).
>>>>> 
>>>>> Or maybe the DAO wrongly translates NULL to 'NULL' at some point.
>>>>> 
>>>>> Cheers
>>>>> 
>>>>> dfcp
>>>>> 
>>>>> --
>>>>> diogo patrão
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> On Mon, Aug 12, 2019 at 11:11 AM Young,Jeff (OR) <jyoung@oclc.org> wrote:
>>>>> Here’s an example showing blank nodes being used to declare that the place of birth is unknown in Wikidata:
>>>>> 
>>>>> https://w.wiki/6$y
>>>>> 
>>>>> In the UI, it is rendered like this:
>>>>> 
>>>>> <image001.png>
>>>>> 
>>>>> Jeff
>>>>> 
>>>>> From: Daniel Hernandez <daniel@degu.cl>
>>>>> Date: Monday, August 12, 2019 at 9:42 AM
>>>>> To: "semantic-web@w3.org" <semantic-web@w3.org>
>>>>> Subject: [External] Re: The Joy of NULLs (not)
>>>>> Resent-From: <semantic-web@w3.org>
>>>>> Resent-Date: Monday, August 12, 2019 at 9:37 AM
>>>>> 
>>>>>  
>>>>> 
>>>>> As Enrico pointed out, blank nodes can be used to represent unknown values.
>>>>> An example of this use is Wikidata. I don't know another example.
>>>>> 
>>>>> --
>>>>> Daniel
>>>>> 
>>>>> On Mon, 12 Aug 2019 07:36:41 +0000
>>>>> Franconi Enrico <franconi@inf.unibz.it> wrote:
>>>>> 
>>>>> > Mike, this could easily happen in an RDF world if you register a
>>>>> > vanity licence plate with anything starting with "_". Indeed, bnodes
>>>>> > would be the right way to represent unknown but existing plates. --e.
>>>>> > 
>>>>> > On 11 Aug 2019, at 23:10, Michael F Uschold
>>>>> > <uschold@gmail.com> wrote:
>>>>> > 
>>>>> >> This is hilarious. It could never happen in an RDF world! No value,
>>>>> >> no triple.
>>>>> >> 
>>>>> >> He tried to prank the DMV. Then his vanity license plate backfired
>>>>> >> big time.
>>>>> >> https://mashable.com/article/dmv-vanity-license-plate-def-con-backfire/<http://flip.it/NIk7FD>
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>> 
>> 
> 
Received on Monday, 10 February 2020 10:13:06 UTC
