Comments on Linked Data Profile 1.0 Submission from Leigh Dodds on 2012-06-14 (public-ldp-wg@w3.org from June 2012)

From: Leigh Dodds <ld@talis.com>
Date: Thu, 14 Jun 2012 22:37:31 +0100
To: public-ldp-wg@w3.org
Message-ID: <CAJgK0KGFWzvT4Ln84B7+SX=byrNSiLPxosM5gHR7nmsts+x6_Q@mail.gmail.com>
Hi,

I wanted to provide some feedback on the Linked Data Profile 1.0
submission to the Working Group. I'm not a member of the WG but by way
of introduction: I've been working with semweb technologies for over
10 years building RDF and Linked Data applications and APIs. This work
has been done as part of data integration behind the firewall. Most
recently I've been working at Talis helping to define our
platforms/products for Linked Data publishing and hosting.

What follows are some comments and questions that relate to the Linked
Data Profile 1.0 Submission. As I understand it, this document is a
starting point for the group's discussions, so I thought I would send
them to this list rather than to the authors. Apologies if I've
addressed comments to the wrong location, or misunderstood the initial
goals of the group.

I'm also not expecting any kind of official acknowledgement or
response, but offer these comments as (hopefully!) constructive
feedback and to help kick off wider discussion of the issues.

I've included section references from the Profile to aid cross-referencing.

4.1.7 and 4.1.10

These rules both require a minimum amount of data that must be
provided for a BPR, specifically at least one rdf:type and at least
one relationship to another resource. Both of these are expressed as a
MUST, which I think is far too strong. I might wish to initially
capture some information about a resource, e.g. some simple literal
values, and then progressively enrich [1] that description to add type
and relationship triples. For example one common Linked Data
publishing approach is to RDFize some data and then enrich it with
additional links. These MUST requirements preclude that kind of usage.

I understand the utility of having some minimal information to allow
easier processing and navigation but think this is a SHOULD, not a
MUST.
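
As a rough illustration of the SHOULD reading, here is a minimal
Python sketch of a server-side check that reports missing type or link
triples as warnings instead of rejecting the description. The tuple
representation of triples, the function name and the example URIs are
all assumptions made for this sketch, not anything defined by the
Profile.

```python
# Hypothetical representation: a triple is (subject, predicate, object),
# where object is a ("uri", value) or ("literal", value) pair.
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def check_description(subject, triples):
    """Return warnings (not errors) for recommended data that is missing."""
    own = [(p, o) for s, p, o in triples if s == subject]
    warnings = []
    if not any(p == RDF_TYPE for p, _ in own):
        warnings.append("no rdf:type yet")
    # A "relationship" here means a non-type triple whose object is a URI.
    if not any(kind == "uri" and p != RDF_TYPE for p, (kind, _) in own):
        warnings.append("no link to another resource yet")
    return warnings


# A freshly RDFized resource with only a literal value is still accepted;
# the warnings simply record what later enrichment should add.
initial = [
    ("http://example.org/thing/1",
     "http://purl.org/dc/terms/title",
     ("literal", "Thing one")),
]
print(check_description("http://example.org/thing/1", initial))
```

Under a MUST, the initial description above would have to be refused;
under a SHOULD it is stored and can be enriched later.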

I'd also like to suggest that every resource SHOULD have a label property [2].

4.1.9.

I don't understand why servers MUST only use these datatypes. Granted,
SPARQL only supports a subset of the XSD datatypes, but I don't think
that necessarily impacts our use of datatypes in general. There are
lots of examples of alternate XSD datatypes and custom datatypes in
use in published Linked Data & RDF, so I'm curious as to why this
extensibility should be removed.

Personally I'd prefer to see a recommended "working set" of data
types, but with guidance on the trade-offs of using additional and/or
custom types. It would be useful to survey actual usage to see if
there is already convergence on a common set.
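
A survey of that kind could start from something as small as the
following Python sketch, which counts datatype usage and lists the
datatypes that fall outside a recommended set. The contents of
WORKING_SET and the sample literals are invented for illustration and
are not taken from the Profile.

```python
from collections import Counter

XSD = "http://www.w3.org/2001/XMLSchema#"

# Assumed "working set", for illustration only.
WORKING_SET = {
    XSD + name
    for name in ("string", "integer", "boolean", "dateTime", "double")
}


def survey_datatypes(literals):
    """Count datatype usage over (lexical_form, datatype_uri) pairs."""
    return Counter(datatype for _, datatype in literals)


def outside_working_set(literals):
    """Datatypes in use that are not in the recommended set."""
    return sorted({datatype for _, datatype in literals} - WORKING_SET)


# Made-up sample data: two recommended literals, one other XSD type,
# and one custom datatype.
sample = [
    ("42", XSD + "integer"),
    ("7", XSD + "integer"),
    ("2012-06-14", XSD + "date"),
    ("POINT(0 0)", "http://example.org/types#wkt"),
]
print(survey_datatypes(sample).most_common())
print(outside_working_set(sample))
```

The second function flags rather than rejects, which matches the idea
of guidance on trade-offs instead of a closed list.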

4.1.13.

Again, I'm hesitant about the use of MUST here. I think proper use of
ETags is essential and a server should go to lengths to provide ETags.
But creating valid "deep ETags" could place additional burden on a
server.

For example I notice that the profile says nothing about what
information is provided when one de-references a BPR. If I were to
provide a Symmetric Bounded Description, e.g. to facilitate browsing,
then I will need to generate an ETag based on the state of a number of
resources.

Clearly an implementation can use coarse-grained ETags (e.g. based on
dataset modification), but that's less useful. Perhaps the Working
Group should consider some guidance on ETag generation, and the
trade-offs of not supporting them (e.g. inability to do conditional
PUT).
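
To make the trade-off concrete, here is a minimal Python sketch of the
two strategies. It assumes triples are plain tuples of strings and that
the description contains no blank nodes (which would need
canonicalising first); both function names are hypothetical.

```python
import hashlib


def description_etag(triples):
    """Fine-grained ETag: a hash over every triple in the response.

    For a description that spans several resources, all of their
    triples have to be gathered before the tag can be computed.
    """
    digest = hashlib.sha256()
    for triple in sorted(triples):  # sort so triple order does not matter
        digest.update(" ".join(triple).encode("utf-8"))
        digest.update(b"\n")
    return '"' + digest.hexdigest()[:16] + '"'


def dataset_etag(last_modified_epoch):
    """Coarse-grained ETag: changes whenever anything in the dataset does."""
    return '"ds-%d"' % last_modified_epoch
```

The first function is precise but costs a pass over the full
description on every request; the second is cheap but invalidates
cached copies of every resource after any write.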

4.4.1

The rule doesn't note which media types must be supported for a PUT.
I'd suggest that the rule should be that a server MUST support a PUT
using any of the RDF serialisations it supports via a GET. So
application/rdf+xml and possibly text/turtle.

It might also be useful to reference use of OPTIONS requests to
advertise PUT, PATCH support; use of Accept-Patch, etc. A section on
OPTIONS might be useful to add in general.
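
A sketch of what such an OPTIONS response and the suggested PUT rule
might look like, in Python. Allow and Accept-Patch are standard HTTP
headers; the function names, the READABLE set and the choice of
application/sparql-update as a patch format are assumptions made for
this example.

```python
# Serialisations this hypothetical server offers via GET.
READABLE = {"application/rdf+xml", "text/turtle"}


def options_headers(supports_put=True, patch_types=()):
    """Build the headers for an OPTIONS response on a resource."""
    allow = ["GET", "HEAD", "OPTIONS"]
    headers = {}
    if supports_put:
        allow.append("PUT")
    if patch_types:
        allow.append("PATCH")
        # Accept-Patch lists the patch document formats the server takes.
        headers["Accept-Patch"] = ", ".join(patch_types)
    headers["Allow"] = ", ".join(allow)
    return headers


def accepts_put(content_type, readable_types=READABLE):
    """Suggested rule: accept a PUT in any serialisation served via GET."""
    media_type = content_type.split(";")[0].strip().lower()
    return media_type in readable_types


print(options_headers(patch_types=["application/sparql-update"]))
```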

4.4.7.

I wonder if it would be useful to consider how these additional
constraints could be advertised or discovered by clients?

4.8.

I am uneasy to see recommendations about ranges of properties that are
at odds with their official definition. Encouraging re-use is good
practice, but redefining existing properties is not.

Also, suggesting that dct:title & dct:description should only refer to
an XML Literal flies in the face of common usage AFAICT.

4.8.3.

I'm confused by the recommendation for use of rdfs:label only in
vocabulary documents. rdfs:label is used very, very commonly as a
generic labelling property so already has well-deployed use outside of
vocabularies. Clearly there are alternates (e.g. skos:prefLabel) but
rdfs:label is a useful fall-back that is already understood by many
Linked Data clients.

Secondly, I don't understand the recommendation for the range of
rdfs:label to be a Resource? Is that a typo?

5.1

I don't follow this rationale for having separate container resources:

"You might wonder why we didn’t just make
http://example.org/netWorth/nw1 a container and POST the new asset
directly there. That would be a fine design if
http://example.org/netWorth/nw1 had only assets, but if it has
separate predicates for assets and liabilities, that design will not
work because it is unspecified to which predicate the POST should add
a membership triple."

Wouldn't a POST of:

<http://example.org/netWorth/nw1> o:asset
<http://example.org/netWorth/nw1/assetContainer/a3>;

....communicate the necessary information? In the case of server side
URI assignment a blank node could be used for the new asset. What am I
missing?
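
The idea can be sketched in a few lines of Python: the posted triples
themselves name the predicate, and the server swaps the blank node for
a URI it mints. The tuple representation, the "_:new" label, the
o:value property and the function names are all hypothetical.

```python
NW1 = "http://example.org/netWorth/nw1"


def membership_predicates(posted, target, blank="_:new"):
    """Predicates that link the POST target to the new (blank) resource."""
    return sorted({p for s, p, o in posted if s == target and o == blank})


def assign_uri(posted, minted_uri, blank="_:new"):
    """Replace the blank node with the URI chosen by the server."""
    return [
        tuple(minted_uri if term == blank else term for term in triple)
        for triple in posted
    ]


# Body of a POST sent directly to nw1; o:value is made up for the example.
posted = [
    (NW1, "o:asset", "_:new"),
    ("_:new", "o:value", "100.00"),
]
print(membership_predicates(posted, NW1))
print(assign_uri(posted, NW1 + "/assetContainer/a3"))
```

Nothing here needs a separate container resource to disambiguate
assets from liabilities, since the predicate travels with the request.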

I'm a little confused about how to go about updating resources that
are associated with a BPC. To test my understanding and taking an
example from the specification:

<http://example.org/netWorth/nw1/assetContainer>
   a bp:Container;
   bp:membershipSubject <http://example.org/netWorth/nw1>;
   bp:membershipPredicate o:asset.

<http://example.org/netWorth/nw1>
   a o:NetWorth;
   o:asset
      <http://example.org/netWorth/nw1/assetContainer/a1>,
      <http://example.org/netWorth/nw1/assetContainer/a2>.

a. If I want to update the rdf:type of
<http://example.org/netWorth/nw1>, do I just PUT an updated
description to its URI?

b. If I want to add a new o:asset relationship for that resource,
then I must POST to <http://example.org/netWorth/nw1/assetContainer>?
If so, how do I determine that, via a SPARQL query?

c. What if I just PUT an updated description to
<http://example.org/netWorth/nw1> that adds or removes o:asset
relationships?

d. What if I first DELETE
<http://example.org/netWorth/nw1/assetContainer> and then want to
update its o:asset relationships?

e. If I want to add a dct:description property to
<http://example.org/netWorth/nw1/assetContainer> then do I PUT to
<http://example.org/netWorth/nw1/assetContainer?non-member-properties>
or can I just PUT to <http://example.org/netWorth/nw1/assetContainer>
?
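
For question (b), the lookup a client would need amounts to matching on
bp:membershipSubject and bp:membershipPredicate. A minimal Python
sketch, assuming the client somehow already holds the relevant triples
(how it obtains them is exactly the open question) and using prefixed
names as plain strings:

```python
NW1 = "http://example.org/netWorth/nw1"
CONTAINER = NW1 + "/assetContainer"

# The example data from above, as (subject, predicate, object) tuples.
triples = [
    (CONTAINER, "rdf:type", "bp:Container"),
    (CONTAINER, "bp:membershipSubject", NW1),
    (CONTAINER, "bp:membershipPredicate", "o:asset"),
    (NW1, "rdf:type", "o:NetWorth"),
    (NW1, "o:asset", CONTAINER + "/a1"),
    (NW1, "o:asset", CONTAINER + "/a2"),
]


def find_containers(triples, subject, predicate):
    """Containers that manage `predicate` triples for `subject`."""
    by_subject = {
        s for s, p, o in triples
        if p == "bp:membershipSubject" and o == subject
    }
    by_predicate = {
        s for s, p, o in triples
        if p == "bp:membershipPredicate" and o == predicate
    }
    return sorted(by_subject & by_predicate)


print(find_containers(triples, NW1, "o:asset"))
```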

5.1.2

Retrieving non-member properties might be better described as another
example of a Bounded Description [3]. In this case it is a subset of a
Concise Bounded Description of a resource. A Linked Data server might
want to support clients in requesting different bounded descriptions,
beyond that of containers.
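
Viewed that way, the non-member view is just a filter over a simple
description of the container. The Python sketch below assumes the
simplest case, in which the container is its own membership subject and
rdfs:member is the membership predicate, and it ignores blank node
closure; the URIs and function names are made up for the example.

```python
def description(triples, resource):
    """Simplified bounded description: triples with `resource` as subject."""
    return [t for t in triples if t[0] == resource]


def non_member_properties(triples, container, member_predicate="rdfs:member"):
    """The same description with the membership triples left out."""
    return [
        t for t in description(triples, container)
        if t[1] != member_predicate
    ]


C = "http://example.org/container1"
data = [
    (C, "rdf:type", "bp:Container"),
    (C, "dct:title", "A very simple container"),
    (C, "rdfs:member", C + "/member1"),
    (C, "rdfs:member", C + "/member2"),
    (C + "/member1", "dct:title", "Member one"),
]
print(non_member_properties(data, C))
```

Other bounded descriptions could be offered by swapping in a different
filter, without special-casing containers.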

[1]. http://patterns.dataincubator.org/book/progressive-enrichment.html
[2]. http://patterns.dataincubator.org/book/label-everything.html
[3]. http://patterns.dataincubator.org/book/bounded-description.html

Cheers,

L.

-- 
Leigh Dodds
CTO, Talis Systems Ltd
Mobile: 07850 928381
http://talis.com
http://kasabi.com

Talis Systems Ltd
The Exchange
19 Newhall Street
Birmingham
B3 3PJ
Received on Friday, 15 June 2012 08:24:51 UTC