Re: Blank Nodes Re: Toward easier RDF: a proposal from Anthony Moretti on 2018-11-30 (semantic-web@w3.org from November 2018)

From: Anthony Moretti <anthony.moretti@gmail.com>
Date: Fri, 30 Nov 2018 02:06:12 -0800
To: hugh@glasers.org
Cc: david@dbooth.org, Semantic Web <semantic-web@w3.org>
Message-ID: <CACusdfSus0kqiVCPazcHd+v7CiO0ze0jgZWCezrBBte2mPkOBg@mail.gmail.com>
I think there is an analogy to this in Swift and C# (and maybe other
languages; these are just the ones I'm familiar with):

   - An IRI is represented by a *class* instance
   - A blank node is represented by a *struct* instance

Swift:

class SomeClass {
    // class definition goes here
}

struct SomeStructure {
    // structure definition goes here
}


Classes are reference types. You check whether one variable is *identical
to* another using three equals signs, ===. This checks whether they refer to
exactly the same class instance. To me this is the same as checking whether
they both refer to the same IRI.

Structs are value types. You check whether one variable is *equal to* another
using two equals signs, ==. This checks whether they are considered
equal or equivalent in value, and what qualifies two instances as equal
must be defined for each type. To me this is the same as having two blank
nodes of type Address and an agreed-upon method of checking whether
two Address instances are equal.
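
To make the distinction concrete, here is a minimal sketch. The names Node and
Point are hypothetical and exist only for this illustration. Point gets its ==
from the compiler by declaring Equatable conformance, which is one way (not the
only way) to define equality for a struct.

```swift
// Hypothetical reference type: two variables can refer to the same instance.
final class Node {
    var label: String
    init(label: String) { self.label = label }
}

// Hypothetical value type: the compiler synthesizes == from the stored
// properties because the struct declares Equatable conformance.
struct Point: Equatable {
    var x: Int
    var y: Int
}

let n1 = Node(label: "alice")
let n2 = n1                      // second reference to the same instance
let n3 = Node(label: "alice")    // separate instance with the same contents

let sameInstance = (n1 === n2)        // identity check: same instance
let differentInstance = (n1 === n3)   // identity check: different instances

let p1 = Point(x: 1, y: 2)
let p2 = Point(x: 1, y: 2)
let equalValues = (p1 == p2)          // value check: compared field by field
```

Here sameInstance is true, differentInstance is false even though the labels
match, and equalValues is true because the two points have the same
composition.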

If you have:

struct Address {
    var streetAddress: String
}

var a = Address(streetAddress: "123 Sesame St")
var b = Address(streetAddress: "123 Sesame Street")


Then it would be up to the implementation of the *equal to* operator,
==, within Address whether a == b returns true or false.
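
As one possible illustration, here is a sketch of an Address whose == treats
"St" and "Street" as the same thing. The normalization rule (lowercase the
text and expand a trailing "st" or "st." to "street") is purely an assumption
made for this example; a real comparison would need a rule agreed for the type.

```swift
struct Address: Equatable {
    var streetAddress: String

    // Hypothetical normalization, assumed for this sketch only:
    // lowercase everything and expand a trailing "st" / "st." to "street".
    private var normalized: String {
        var words = streetAddress.lowercased()
            .split(separator: " ")
            .map { String($0) }
        if let last = words.last, last == "st" || last == "st." {
            words[words.count - 1] = "street"
        }
        return words.joined(separator: " ")
    }

    // Custom equality: two addresses are equal when their normalized
    // forms match, not when the raw strings match.
    static func == (lhs: Address, rhs: Address) -> Bool {
        return lhs.normalized == rhs.normalized
    }
}

let a = Address(streetAddress: "123 Sesame St")
let b = Address(streetAddress: "123 Sesame Street")
let c = Address(streetAddress: "124 Sesame Street")

let abEqual = (a == b)   // true under this normalization
let acEqual = (a == c)   // false: different house number
```

With a plain field-by-field comparison instead, a == b would be false, which is
exactly why the comparison has to be defined per type.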

In summary, the way this is addressed is by explicitly stating *as part of
the language* that blank nodes can be compared by their composition, but
that the process of comparison must be defined for each type.
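
A rough sketch of what "defined for each type" could look like when it is used
to collapse duplicates. PostalAddress and its fields are hypothetical, and the
choice of streetAddress plus postalCode as the key is an assumption made only
for this example. Equality and hashing both look at the key fields and ignore
the non-key field, so a Set keeps one value per key.

```swift
// Hypothetical type: streetAddress and postalCode are assumed to be the key;
// label is a non-key field and plays no part in comparison.
struct PostalAddress: Hashable {
    var streetAddress: String
    var postalCode: String
    var label: String

    // Equality is defined on the key fields only.
    static func == (lhs: PostalAddress, rhs: PostalAddress) -> Bool {
        return lhs.streetAddress == rhs.streetAddress
            && lhs.postalCode == rhs.postalCode
    }

    // Hashing must agree with ==, so it uses the same key fields.
    func hash(into hasher: inout Hasher) {
        hasher.combine(streetAddress)
        hasher.combine(postalCode)
    }
}

let addresses = [
    PostalAddress(streetAddress: "123 Sesame Street", postalCode: "10001", label: "home"),
    PostalAddress(streetAddress: "123 Sesame Street", postalCode: "10001", label: "billing"),
    PostalAddress(streetAddress: "9 Elm Road", postalCode: "20002", label: "office"),
]

// Values with the same key collapse into a single element of the set.
let distinct = Set(addresses)
let distinctCount = distinct.count
```

The first two values differ only in label, so they collapse and distinctCount
is 2.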

Anthony

Some relevant links:

   - https://docs.swift.org/swift-book/LanguageGuide/ClassesAndStructures.html
   - https://docs.microsoft.com/en-us/dotnet/csharp/programming-guide/classes-and-structs/



On Fri, Nov 30, 2018 at 1:54 AM Hugh Glaser <hugh@glasers.org> wrote:

> Yes.
> All very much agreed.
>
> Especially
> > But we *can*
> > give users better support for collapsing duplicate nodes if
> > they follow good practices such as indicating keys -- provided
> > that we devise a *convenient* notation for them to do so.
>
> And with guidance for different types of things, and best practice.
> And maybe even collect together libraries that capture that best practice
> in code,
> which cause collisions across wider datasets.
>
> Essentially pushing out what many developers must be doing internally to
> the external world.
>
> Best
> Hugh
>
> > On 30 Nov 2018, at 02:29, David Booth <david@dbooth.org> wrote:
> >
> > On 11/29/18 6:12 PM, Hugh Glaser wrote:
> > >> David Booth wrote:
> > >> In my own experience, objects composed of literal attributes
> > >> like this generally *do* form a composite key, though
> > >> perhaps other RDF developers have had different experience.
> > >
> > > Since you ask :-)
> > > I'm sorry to report that my experience is that they often don't.
> >
> > Fair enough.  I guess it depends a lot on the data origins.
> >
> > > . . .  So it would be folly to try to automagically generate URIs
> > > for such bNodes in general - generated unique URIs or sufficiently
> > > large random ones is the best that you can do.
> >
> > Agreed.  But I think my point still holds if a (composite) key
> > can be conveniently indicated.
> >
> > > I think that if you consider *all* the properties, essentially
> > > the SCBD, you might get away with it, almost always.
> > > (As someone else pointed out.)
> >
> > I think that would be too risky to rely on.
> >
> > > . . . postal address
> > > gets more difficult.  For a start, it is very unusual from
> > > disparate datasets to get such well and uniformly formatted
> > > addresses.  . . .
> >
> > Agreed.  Data cleaning and normalizing issues will always exist.
> > But I think that's kind of a separate issue.  We cannot expect
> > to automagically clean up people's dirty data.  But we *can*
> > give users better support for collapsing duplicate nodes if
> > they follow good practices such as indicating keys -- provided
> > that we devise a *convenient* notation for them to do so.
> >
> > > And you need to decide what to do about missing fields.
> >
> > Missing key fields would prevent a standardized URI from
> > being generated.  Missing non-key fields would have no impact.
> >
> > David Booth
> >
>
>
>
Received on Friday, 30 November 2018 10:06:48 UTC