RE: Profiles in Linked Data from Svensson, Lars on 2015-05-18 (public-lod@w3.org from May 2015)

From: Svensson, Lars <L.Svensson@dnb.de>
Date: Mon, 18 May 2015 16:41:14 +0000
To: Martynas Jusevičius <martynas@graphity.org>
Cc: "public-lod@w3.org" <public-lod@w3.org>
Message-ID: <24637769D123E644A105A0AF0E1F92EF010CF6B9BA@dnbf-ex1.AD.DDB.DE>

Martynas,

On Monday, May 18, 2015 5:33 PM, Martynas Jusevičius wrote:

> yes, SPIN is a machine-readable way to describe RDF constraints.

OK, but as far as I understand, it's not the only one, and the Data Shapes WG still has to decide which should be the canonical way of doing that.

> What I still don't understand is why the client gets to choose the
> constraint profile. Isn't it the responsibility of the data receiver,
> in this case, the Linked Data server?

The Linked Data server can serve different communities that prefer different ways of describing the same entity. One case could be the description of constraints for RDF: do you want the constraints expressed in the SPIN SPARQL syntax, in OWL or in Shape Expressions 1.0? The server wants to be interoperable, so it serves the profile information in three different documents (one for each syntax). All three are RDF documents that can be serialised as RDF/XML, Turtle or N-Triples, so you cannot pick your vocabulary by negotiating on content type.

Another case is the current landscape of bibliographic data. There is currently no "best" model for describing this kind of information. A major data provider (MDP) might not want to commit to just one of those models, since that would mean that a large portion of the relevant information would be available in just one "profile", which would make it more difficult for other organisations to evaluate the pros and cons of the different models. Instead, the MDP will want to supply data about the same entities using different "profiles" (e.g. BibFrame, RDA and BiBo), thus giving other players who are investigating the potential of those models a real, large dataset to experiment with.

That's why the client needs a way to specify how it wants its data served. Look at it as a service to the customers, giving them more options to choose from.
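
To make the point concrete, here is a rough Python sketch (not a proposal for how the negotiation should actually work on the wire). It only shows that the profile and the media type are two independent axes: the same profile is available in several serialisations, so the media type alone cannot tell the server which vocabulary the client wants. The profile URIs are the example ones used further down in this mail; the file names and the function are made up for illustration.

```python
# Hypothetical sketch: a server-side lookup keyed on BOTH profile and media type.
# File names and DEFAULT_PROFILE are assumptions made for illustration only.

AVAILABLE = {
    ("http://example.org/profiles/A", "text/turtle"): "person-a.ttl",
    ("http://example.org/profiles/A", "application/rdf+xml"): "person-a.rdf",
    ("http://example.org/profiles/B", "text/turtle"): "person-b.ttl",
    ("http://example.org/profiles/B", "application/rdf+xml"): "person-b.rdf",
}

DEFAULT_PROFILE = "http://example.org/profiles/A"


def select_representation(media_type, profile=None):
    """Return the document for the requested (profile, media type) pair.

    Falls back to the server's default profile when the client names none,
    and returns None when the combination is not available.
    """
    return AVAILABLE.get((profile or DEFAULT_PROFILE, media_type))
```

A client asking for Turtle gets a different document depending on whether it names profile A or profile B; how the client transmits that choice is deliberately left open here.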

> Using your previous FOAF/GNDO example, could you illustrate what
> constraints would go into profile A and what into profile B?

In natural language (I'm not that well versed in SPIN...)

http://example.org/profiles/A

A URI representing a person will be of rdf:type foaf:Person;
it will have exactly one foaf:birthday;
it will have zero or one foaf:dnaChecksum;
it will have exactly one foaf:name .
That is all the data there will be about persons in this profile.

http://example.org/profiles/B

A URI representing a person will be of rdf:type gndo:DifferentiatedPerson;
it will have one or more gndo:dateOfBirth;
it will have exactly one gndo:preferredNameForThePerson;
it will have exactly one gndo:gndIdentifier.
That is all the data there will be about persons in this profile.
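
Since I can't write this in SPIN, here is the same thing as a small Python sketch instead. It is only meant to show that the two profiles can be stated as cardinality constraints and checked mechanically; the data structure, the function name and the use of prefixed names as plain strings are my own assumptions, not any existing constraint language.

```python
# Hypothetical sketch of the two profiles as closed sets of cardinality constraints.
# Cardinalities are (min, max); max=None means "unbounded".

PROFILES = {
    "http://example.org/profiles/A": {
        "type": "foaf:Person",
        "properties": {
            "foaf:birthday": (1, 1),
            "foaf:dnaChecksum": (0, 1),
            "foaf:name": (1, 1),
        },
    },
    "http://example.org/profiles/B": {
        "type": "gndo:DifferentiatedPerson",
        "properties": {
            "gndo:dateOfBirth": (1, None),
            "gndo:preferredNameForThePerson": (1, 1),
            "gndo:gndIdentifier": (1, 1),
        },
    },
}


def violations(profile_uri, rdf_type, description):
    """List the ways a person description deviates from a profile.

    `description` maps a property name to the list of its values.
    """
    profile = PROFILES[profile_uri]
    problems = []
    if rdf_type != profile["type"]:
        problems.append(f"expected rdf:type {profile['type']}, got {rdf_type}")
    for prop, (low, high) in profile["properties"].items():
        count = len(description.get(prop, []))
        if count < low or (high is not None and count > high):
            problems.append(f"{prop}: {count} value(s), allowed {low}..{high}")
    # The profiles are closed: nothing beyond the listed properties is served.
    for prop in description:
        if prop not in profile["properties"]:
            problems.append(f"{prop}: not part of this profile")
    return problems
```

A description that satisfies profile A yields an empty list for A and a list of problems for B, which is the point: the same entity, described with two different vocabularies.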

> If for example profile A says that foaf:Person instances must have
> mandatory foaf:familyName and foaf:givenName while profile B does not
> include this constraint, then you have a potentially conflicting model
> of your data.

It is really about which vocabularies you use to describe the data. There might be intersections (e.g. several profiles can use dc:title). A server can of course also commit to serving data according to a third-party profile, e.g. a (fictitious) DCAT profile defined by W3C [1] or the existing one defined by the EU [2].

[1] http://www.w3.org/TR/vocab-dcat/#conformance has text about DCAT profiles. I see a possibility that individual organisations might want to formalise their profiles.
[2] https://joinup.ec.europa.eu/asset/dcat_application_profile/description


I hope this helps to explain what I'm aiming at.

Best,

Lars
Received on Monday, 18 May 2015 16:41:47 UTC