Re: Comments on Linked Data Profile 1.0 Submission from Steve K Speicher on 2012-07-06 (public-ldp-wg@w3.org from July 2012)

From: Steve K Speicher <sspeiche@us.ibm.com>
Date: Fri, 6 Jul 2012 14:24:27 -0400
To: Leigh Dodds <ld@talis.com>
Cc: public-ldp-wg@w3.org
Message-ID: <OFB1E74156.8E1FE569-ON85257A33.0061463F-85257A33.00651DB2@us.ibm.com>
Leigh,

Greatly appreciate the thorough review; let me try to address the points 
one at a time.  Some of the items will require me to come back later, and 
others we may have to split into separate threads to work through what 
needs clarifying. It is the intent of the LDP WG to use the member 
submission as a starting point, as defined by the charter. 

Leigh Dodds <ld@talis.com> wrote on 06/14/2012 05:37:31 PM:

> From: Leigh Dodds <ld@talis.com>
> To: public-ldp-wg@w3.org, 
> Date: 06/15/2012 04:26 AM
> Subject: Comments on Linked Data Profile 1.0 Submission
> 
> Hi,
> 
> I wanted to provide some feedback on the Linked Data Profile 1.0
> submission to the Working Group. I'm not a member of the WG but by way
> of introduction: I've been working with semweb technologies for over
> 10 years building RDF and Linked Data applications and APIs. This work
> has been done as part of data integration behind the firewall. Most
> recently I've been working at Talis helping to define our
> platforms/products for Linked Data publishing and hosting.
> 
> What follows are some comments and questions that relate to the Linked
> Data Profile 1.0 Submission. As I understand this document is a
> starting point for the groups discussions I thought I would send to
> this list rather than the authors. Apologies if I've addressed
> comments to the wrong location, or misunderstood the initial goals of
> the group.
> 
> I'm also not expecting any kind of official acknowledgement or
> response, but offer these comments as (hopefully!) constructive
> feedback and to help kick off wider discussion of the issues.
> 
> I've included section references from the Profile to aid
> cross-referencing.
> 
> 4.1.7 and 4.1.10
> 
> These rules both require a minimum amount of data that must be
> provided for a BPR, specifically at least one rdf:type and at least
> one relationship to another resource. Both of these are expressed as a
> MUST which I think is far too strong. I might wish to initially
> capture some information about a resource, e.g. some simple literal
> values, and then progressively enrich [1] that description to add type
> and relationship triples. For example one common Linked Data
> publishing approach is to RDFize some data and then enrich it with
> additional links. These MUST requirements preclude that kind of usage.
> 
> I understand the utility of having some minimal information to allow
> easier processing and navigation but think this is a SHOULD, not a
> MUST.
> 

I believe that 4.1.10 is not written clearly.  It is not meant to say that 
at least one relationship MUST be present; it is meant to say that when a 
relationship is present, it MUST be modeled as a simple triple.
I can see the point on 4.1.7 about relaxing the rdf:type requirement, but 
I feel that doing so weakens the profile a bit.

> I'd also like to suggest that every resource SHOULD have a label
> property [2].
> 

Seems like a good suggestion.

> 4.1.9.
> 
> I don't understand why servers MUST only use these datatypes. Granted,
> SPARQL only supports a subset of the XSD datatypes, but I don't think
> that necessarily impacts our use of datatypes in general. There are
> lots of examples of alternate XSD datatypes and custom datatypes in
> use in published Linked Data & RDF, so I'm curious as to why this
> extensibility should be removed.
> 
> Personally I'd prefer to see a recommended "working set" of data
> types, but with guidance on the trade-offs of using additional and/or
> custom types. It would be useful to survey actual usage to see if
> there is already convergence on a common set.
> 

I think the primary concern is that custom datatypes limit what clients 
can do with the values.  Restricting the set of datatypes enables a 
broader class of client applications that can read, update, compare, etc. 
these values.

> 4.1.13.
> 
> Again, I'm hesitant about the use of MUST here. I think proper use of
> ETags is essential and a server should go to lengths to provide ETags.
> But creating valid "deep ETags" could place additional burden on a
> server.
> 
> For example I notice that the profile says nothing about what
> information is provided when one de-references a BPR. If I were to
> provide a Symmetric Bounded Description, e.g. to facilitate browsing,
> then I will need to generate an ETag based on the state of a number of
> resources.
> 
> Clearly an implementation can use coarse-grained ETags (e.g. based on
> dataset modification), but that's less useful. Perhaps the Working
> Group should consider some guidance on ETag generation, and the
> trade-offs of not supporting them (e.g. inability to do conditional
> PUT).
> 

Good points.  We don't want to overburden servers, but we want them to do 
the right thing.  This seems like a good item to spend a little more time 
investigating, getting feedback from others in the WG, and crafting the 
right set of guidance around ETags.
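
As a rough illustration of the trade-off discussed above (a sketch only, 
not something the submission specifies): a fine-grained ETag can be 
derived by hashing the triples in the returned description, and a 
conditional PUT then compares the client's If-Match value against it. The 
function names and the tuple-based triple representation are assumptions 
made for this example.

```python
import hashlib


def compute_etag(triples):
    """Derive a strong ETag from the triples of a description.

    Triples are (subject, predicate, object) string tuples; they are sorted
    first so the ETag does not depend on serialisation order.
    """
    digest = hashlib.sha256()
    for s, p, o in sorted(triples):
        digest.update(f"{s} {p} {o} .\n".encode("utf-8"))
    return '"' + digest.hexdigest() + '"'


def conditional_put(current_triples, if_match, new_triples):
    """Apply a PUT only if the client's If-Match value is still current.

    Returns (status, stored_triples): 412 Precondition Failed when the
    stored state has changed, 204 No Content when the update is applied.
    """
    if if_match != compute_etag(current_triples):
        return 412, current_triples
    return 204, new_triples
```

If the description spans several resources (for example a Symmetric 
Bounded Description), every triple in that description would have to be 
fed into the hash, which is where the extra server cost comes from.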

> 4.4.1
> 
> The rule doesn't note which media types must be supported for a PUT.
> I'd suggest that the rule should be that a server MUST support a PUT
> using any of the RDF serialisations it supports via a GET. So
> application/rdf+xml and possibly text/turtle.
> 
> It might also be useful to reference use of OPTIONS requests to
> advertise PUT, PATCH support; use of Accept-Patch, etc. A section on
> OPTIONS might be useful to add in general.
> 

Good suggestions as well.  I wonder why OPTIONS is needed if we already 
have HEAD + Allow.  What additional data would be needed in a specialized 
OPTIONS response?
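
For illustration only: whichever request a client uses (HEAD or OPTIONS), 
the information it needs comes from the Allow header and, for PATCH, the 
Accept-Patch header (RFC 5789). A minimal sketch of reading both from a 
response follows; the function name and the dictionary-based header 
representation are assumptions for the example, and the list splitting is 
deliberately simplistic (it ignores quoted parameters).

```python
def advertised_capabilities(headers):
    """Extract allowed methods and accepted PATCH media types.

    `headers` is a plain dict of response header names to values.
    """
    # HTTP header names are case-insensitive, so normalise them first.
    lowered = {name.lower(): value for name, value in headers.items()}

    def split_list(value):
        # Split a comma-separated header value into trimmed items.
        return [item.strip() for item in value.split(",") if item.strip()]

    return {
        "methods": split_list(lowered.get("allow", "")),
        "patch_types": split_list(lowered.get("accept-patch", "")),
    }
```

A client could call this on the headers of either a HEAD or an OPTIONS 
response and check whether "PUT" or "PATCH" appears in "methods".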

> 4.4.7.
> 
> I wonder if it would be useful to consider how these additional
> constraints could be advertised or discovered by clients?
> 

You fell for our trap.  That is currently left out of the member 
submission but covered by the charter.  We started specifying something 
but decided to defer it until we got a basic set of rules in place. We 
have a high-level description of how these constraints are specified 
[1].  This reminds me that I need to update that article to match the 
submission.

> 4.8.
> 
> I am uneasy to see recommendations about ranges of properties that are
> at odds with their official definition. Encouraging re-use is good
> practice, but redefining existing properties is not.
> 
> Also, suggesting that dct:title & dct:description should only refer to
> an XML Literal flies in face of common usage AFAICT.
> 
> 4.8.3.
> 
> I'm confused by the recommendation for use of rdfs:label only in
> vocabulary documents. rdfs:label is used very, very commonly as a
> generic labelling property so already has well-deployed use outside of
> vocabularies. Clearly there are alternates (e.g. skos:prefLabel) but
> rdfs:label is a useful fall-back that is already understood by many
> Linked Data clients.
> 

We were trying to stay true to the intended and defined usage [2], which 
states:
"rdfs:label is an instance of rdf:Property that may be used to provide a 
human-readable version of a resource's name."
and initially it states:
"This specification describes how to use RDF to describe RDF vocabularies. 
This specification defines a vocabulary for this purpose..."

To us, this reads as saying that if rdfs:label is used, it gives the name 
of a resource that is to be used in a vocabulary.

Though if rdfs:label is used in practice as just "a label on the 
resource", perhaps the specification should be updated to match usage.

> Secondly, I don't understand the recommendation for the range of
> rdfs:label to be a Resource? Is that a typo?
> 

It is a typo.

> 5.1
> 
> I don't follow this rationale for having separate container resources:
> 
> "You might wonder why we didn’t just make
> http://example.org/netWorth/nw1 a container and POST the new asset
> directly there. That would be a fine design if
> http://example.org/netWorth/nw1 had only assets, but if it has
> separate predicates for assets and liabilities, that design will not
> work because it is unspecified to which predicate the POST should add
> a membership triple."
> 
> Wouldn't a POST of:
> 
> <http://example.org/netWorth/nw1> o:asset
> <http://example.org/netWorth/nw1/assetContainer/a3>;
> 
> ....communicate the necessary information? In the case of server side
> URI assignment a blank node could be used for the new asset. What am I
> missing?
> 

A POST to what URL?  I also don't follow the usage of a blank node for 
this.  Perhaps we need a separate thread on this.

> I'm a little confused about how to go about updating resources that
> are associated with a BPC. To test my understanding and taking an
> example from the specification:
> 
> <http://example.org/netWorth/nw1/assetContainer>
>    a bp:Container;
>    bp:membershipSubject <http://example.org/netWorth/nw1>;
>    bp:membershipPredicate o:asset.
> 
> <http://example.org/netWorth/nw1>
>    a o:NetWorth;
>    o:asset
>       <http://example.org/netWorth/nw1/assetContainer/a1>,
>       <http://example.org/netWorth/nw1/assetContainer/a2>.
> 
> a. If I want to update the rdf:type of
> <http://example.org/netWorth/nw1>, do I just PUT an updated
> description to its URI?
> 

Yes

> b. If I want to add a new o:asset relationship for that resource,
> then I must POST to <http://example.org/netWorth/nw1/assetContainer>?
> If so, how do I determine that, via a SPARQL query?
>
Determine the container URL?  From the fact that the rdf:type of the 
subject URL is bp:Container.
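
As a sketch of that lookup (an illustration under stated assumptions, not 
a defined part of the profile): given the triples from the example above, 
a client could pick out the resources typed bp:Container whose 
bp:membershipSubject and bp:membershipPredicate match the relationship it 
wants to add. The function name and the use of prefixed names as plain 
strings are assumptions for the example.

```python
def find_containers(triples, subject, predicate):
    """Return the containers that manage `predicate` links for `subject`.

    `triples` is a set of (subject, predicate, object) string tuples.
    """
    # First collect every resource explicitly typed as a container.
    containers = {s for s, p, o in triples
                  if p == "rdf:type" and o == "bp:Container"}
    # Keep only those whose membership subject and predicate both match.
    return sorted(
        c for c in containers
        if (c, "bp:membershipSubject", subject) in triples
        and (c, "bp:membershipPredicate", predicate) in triples
    )


NW1 = "http://example.org/netWorth/nw1"
example = {
    (NW1 + "/assetContainer", "rdf:type", "bp:Container"),
    (NW1 + "/assetContainer", "bp:membershipSubject", NW1),
    (NW1 + "/assetContainer", "bp:membershipPredicate", "o:asset"),
    (NW1, "rdf:type", "o:NetWorth"),
}
asset_containers = find_containers(example, NW1, "o:asset")
```

Here asset_containers holds the single container URL to POST to. How the 
client obtains the triples in the first place (for example by a SPARQL 
query or by following links) is left open.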
 
> c. What if I just PUT an updated description to
> <http://example.org/netWorth/nw1> that adds or removes o:asset
> relationships?
> 

A PUT would replace the resource identified by that Request-URI with the 
representation in the request.  So if the representation has added/removed 
relationships and the server supports PUT for updates, then I believe it 
would update the resource as you describe.

> d. What if I first DELETE
> <http://example.org/netWorth/nw1/assetContainer> and then want to
> update its o:asset relationships?
> 

Perhaps we'd want to say more about what happens when deleting containers. 
I would expect that most implementations that processed this request 
would also remove the relationships.

> e. If I want to add a dct:description property to
> <http://example.org/netWorth/nw1/assetContainer> then do I PUT to
> <http://example.org/netWorth/nw1/assetContainer?non-member-properties>
> or can I just PUT to <http://example.org/netWorth/nw1/assetContainer>
> ?
> 

Using 
<http://example.org/netWorth/nw1/assetContainer?non-member-properties> as 
the Request-URI to add the dct:description allows you to do it without 
having to preserve the members on update.
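
To make the difference concrete, here is a minimal sketch of how a server 
might handle a PUT to the ?non-member-properties URL: the membership 
triples are kept as stored and only the remaining triples are replaced. 
This is an assumed behaviour for illustration, not text from the 
submission; the function name and the tuple representation of triples are 
hypothetical.

```python
def put_non_member_properties(stored, request_triples,
                              membership_subject, membership_predicate):
    """Replace a container's non-member triples, preserving its members.

    `stored` and `request_triples` are sets of (s, p, o) string tuples.
    """
    def is_membership(triple):
        s, p, _ = triple
        return s == membership_subject and p == membership_predicate

    # Membership triples survive untouched, whatever the request contains.
    members = {t for t in stored if is_membership(t)}
    # Everything else is replaced by the request body; any membership
    # triples the client happened to send are ignored.
    replacement = {t for t in request_triples if not is_membership(t)}
    return members | replacement
```

With a plain PUT to the container URL, by contrast, the client would have 
to send the full representation, members included, to avoid losing them.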

> 5.1.2
> 
> Retrieving non member properties might be better described as another
> example of a Bounded Description [3]. In this case it is a subset of a
> Concise Bounded Description of a resource. A Linked Data server might
> want to support clients in requesting different bounded descriptions,
> beyond that of containers.
> 

Been a while since I looked at this.  We looked at a number of approaches 
to do this and I should probably surface this in the discussion.  We can 
include the Bounded Description with that evaluation.

> [1]. http://patterns.dataincubator.org/book/progressive-enrichment.html

> [2]. http://patterns.dataincubator.org/book/label-everything.html

> [3]. http://patterns.dataincubator.org/book/bounded-description.html

> 
> -- 
> Leigh Dodds
> CTO, Talis Systems Ltd


[1] - 
http://www.ibm.com/developerworks/rational/library/basic-profile-linked-data/#6.Basic%20Profile%20validation%20and%20constraints|outline

[2] - http://www.w3.org/TR/rdf-schema/#ch_label


Thanks,
Steve Speicher | IBM Rational Software | (919) 254-0645
Received on Friday, 6 July 2012 18:25:30 UTC