Re: Blank Nodes Re: Toward easier RDF: a proposal

Yes.
All very much agreed.

Especially:
> But we *can*
> give users better support for collapsing duplicate nodes if
> they follow good practices such as indicating keys -- provided
> that we devise a *convenient* notation for them to do so.

And with guidance and best practice for different types of things.
And maybe even collect together libraries that capture that best practice in code,
so that duplicate nodes collide (and collapse) across wider datasets.

Essentially pushing out to the external world what many developers must already be doing internally.
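
To make that a bit more concrete, here is a rough sketch of the kind of helper such a library might offer. It is only an illustration under assumptions of mine, not anything proposed in the thread: the function name key_uri, the example.org namespace and the choice of SHA-256 over a JSON serialisation of the key are all hypothetical. It follows the rule quoted below: a missing key field means no standardized URI is generated, and non-key fields are ignored.

```python
import hashlib
import json
from typing import Mapping, Optional, Sequence

# Hypothetical namespace for minted URIs (an assumption, not from the thread).
BASE = "http://example.org/id/"


def key_uri(
    node: Mapping[str, str],
    key_fields: Sequence[str],
    base: str = BASE,
) -> Optional[str]:
    """Mint a deterministic URI from a node's declared composite key.

    Returns None when any key field is missing, so the caller can keep
    the blank node (or a random URI) instead. Non-key fields are ignored.
    """
    if not key_fields:
        raise ValueError("at least one key field must be declared")

    pairs = []
    for field in sorted(key_fields):  # order of declaration must not matter
        value = node.get(field)
        if value is None:
            return None  # missing key field: no standardized URI
        # Only trims surrounding whitespace; real data cleaning is a
        # separate problem and is deliberately not attempted here.
        pairs.append([field, value.strip()])

    # JSON gives an unambiguous serialisation of the (field, value) pairs.
    serialised = json.dumps(pairs, ensure_ascii=False)
    digest = hashlib.sha256(serialised.encode("utf-8")).hexdigest()
    return base + digest[:32]
```

Two address nodes from different datasets that agree on, say, street, city and postcode would then get the same URI and collapse when merged, whatever other properties they carry. A node lacking the postcode would simply stay a blank node.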

Best
Hugh

> On 30 Nov 2018, at 02:29, David Booth <david@dbooth.org> wrote:
> 
> On 11/29/18 6:12 PM, Hugh Glaser wrote:
> >> David Booth wrote:
> >> In my own experience, objects composed of literal attributes
> >> like this generally *do* form a composite key, though
> >> perhaps other RDF developers have had different experience.
> >
> > Since you ask :-)
> > I'm sorry to report that my experience is that they often don't.
> 
> Fair enough.  I guess it depends a lot on the data origins.
> 
> > . . .  So it would be folly to try to automagically generate URIs
> > for such bNodes in general - generated unique URIs or sufficiently
> > large random ones is the best that you can do.
> 
> Agreed.  But I think my point still holds if a (composite) key
> can be conveniently indicated.
> 
> > I think that if you consider *all* the properties, essentially
> > the SCBD, you might get away with it, almost always.
> > (As someone else pointed out.)
> 
> I think that would be too risky to rely on.
> 
> > . . . postal address
> > gets more difficult.  For a start, it is very unusual from
> > disparate datasets to get such well and uniformly formatted
> > addresses.  . . .
> 
> Agreed.  Data cleaning and normalizing issues will always exist.
> But I think that's kind of a separate issue.  We cannot expect
> to automagically clean up people's dirty data.  But we *can*
> give users better support for collapsing duplicate nodes if
> they follow good practices such as indicating keys -- provided
> that we devise a *convenient* notation for them to do so.
> 
> > And you need to decide what to do about missing fields.
> 
> Missing key fields would prevent a standardized URI from
> being generated.  Missing non-key fields would have no impact.
> 
> David Booth
> 

Received on Friday, 30 November 2018 09:47:37 UTC