Re: [External] : Future-proof modelling from Pierre-Antoine Champin on 2023-01-23 (public-rdf-star-wg@w3.org from January 2023)

From: Pierre-Antoine Champin <pierre-antoine@w3.org>
Date: Mon, 23 Jan 2023 17:08:32 +0100
To: Franconi Enrico <franconi@inf.unibz.it>, Souripriya Das <souripriya.das@oracle.com>
Cc: RDF-star WG <public-rdf-star-wg@w3.org>
Message-ID: <33c560fd-3cb6-1f27-0af2-ef330489beff@w3.org>


On 23/01/2023 16:08, Franconi Enrico wrote:
> Hi Souri,
>
>> Contrast the above with the type of situation I was trying to 
>> illustrate in my slides. There, the new data that is coming in is 
>> using the same properties with the same respective domains and ranges 
>> as before, but its arrival has caused the occurrence counts to go 
>> from one to greater-than-one for some of the properties (e.g., 
>> suppose that we just found out that :Taylor :married :Burton a second 
>> time -- something that never happened to the :married property before 
>> this). This addition became a reality in the world being modeled -- 
>> the data architect has no control over this. Such changes can and 
>> should be handled as seamlessly as possible -- pre-existing queries 
>> should retain their validity despite those changes. This is where 
>> named triples -- with support for both implicit and explicit names -- 
>> would come in handy.
>
> I guess that the Example 1 by Pierre-Antoine is exactly about this.
> In the relational model, you would have a table Person with a column 
> Name (PK), a column Married (with FK to Person.Name), and, say, a 
> column Address.
> You start by stating that the tuples <“Richard”, “Liz”, “Addr1”> and 
> <“Liz”, “Richard”, “Addr2”> are in the table.
> You then realise that there are two distinct occurrences of the 
> marriage, and therefore you have to change the schema by deleting the 
> column Married from Person, and adding a table Marriage with two 
> FK attributes to Person /and/ additional attributes identifying 
> each distinct marriage (e.g., the date of the wedding, and/or the 
> marriage period, etc).
> Of course, you could have had a Marriage table from the very beginning, 
> and used a “surrogate key” of this table to identify the distinct 
> marriages, but this is possible in the relational model because it 
> allows n-ary relations with n>2 to be modelled, not because it allows 
> multisets (bags). Bags are meaningful in the relational model ONLY if 
> the origin of the multiplicity is known (e.g., the bag of salaries of 
> each person in the original table, obtained by a projection from a 
> bag-free relation of persons and their salaries). A table T with 
> repeating tuples /given a-priori/ is /semantically/ indistinguishable 
> from (SELECT DISTINCT * FROM T) in the relational model: you can't 
> understand which “hidden” attribute would disambiguate the identity of 
> the tuples (multiple marriages? multiple addresses? etc). In general, 
> it is well known that bags destroy the basic principle of the 
> relational model as a modelling language, which associates an identity 
> to tuples in 5th normal form, like the above.
> And I don’t believe we should assume in RDF that a graph can be 
> a-priori a bag of triples.
> —e.

what Enrico said! :)
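
For readers who want to see Enrico's relational example in action, here is a minimal sketch using Python's standard sqlite3 module. It is only an illustration under assumptions: the column names Spouse1, Spouse2 and WeddingDate, and the placeholder values 'date-1' and 'date-2', are hypothetical and do not come from the thread. Only Person, Name, Married, Address, Marriage and T appear in Enrico's message.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Original design: one Married column per person, Name is the primary key.
cur.execute("""CREATE TABLE Person (
    Name    TEXT PRIMARY KEY,
    Married TEXT REFERENCES Person(Name),
    Address TEXT)""")
cur.executemany("INSERT INTO Person VALUES (?, ?, ?)",
                [("Richard", "Liz", "Addr1"), ("Liz", "Richard", "Addr2")])

# A second marriage cannot be recorded as a repeated row: the key rejects it.
try:
    cur.execute("INSERT INTO Person VALUES ('Liz', 'Richard', 'Addr2')")
    second_row_accepted = True
except sqlite3.IntegrityError:
    second_row_accepted = False

# Revised design: a Marriage relation whose extra attribute (hypothetical
# WeddingDate, with placeholder values) tells the two occurrences apart.
cur.execute("""CREATE TABLE Marriage (
    Spouse1     TEXT REFERENCES Person(Name),
    Spouse2     TEXT REFERENCES Person(Name),
    WeddingDate TEXT,
    PRIMARY KEY (Spouse1, Spouse2, WeddingDate))""")
cur.executemany("INSERT INTO Marriage VALUES (?, ?, ?)",
                [("Liz", "Richard", "date-1"), ("Liz", "Richard", "date-2")])
marriages = cur.execute(
    "SELECT * FROM Marriage ORDER BY WeddingDate").fetchall()

# A keyless table T given a priori as a bag: the two rows are identical,
# so nothing in T says what the repetition is supposed to mean.
cur.execute("CREATE TABLE T (Spouse1 TEXT, Spouse2 TEXT)")
cur.executemany("INSERT INTO T VALUES (?, ?)",
                [("Liz", "Richard"), ("Liz", "Richard")])
bag_rows = cur.execute("SELECT * FROM T").fetchall()
distinct_rows = cur.execute("SELECT DISTINCT * FROM T").fetchall()
```

In this sketch the two Marriage rows differ in an explicit attribute, whereas the two rows of T carry no information beyond what SELECT DISTINCT * FROM T already returns, which is the point Enrico makes about bags given a priori.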

Received on Monday, 23 January 2023 16:08:37 UTC