Re: Comments on Linked Data Profile 1.0 Submission from Leigh Dodds on 2012-07-11 (public-ldp-wg@w3.org from July 2012)

From: Leigh Dodds <ld@talis.com>
Date: Wed, 11 Jul 2012 13:22:35 +0100
To: Steve K Speicher <sspeiche@us.ibm.com>
Cc: public-ldp-wg@w3.org
Message-ID: <CAJgK0KG1-2rBCa5KzAudqk-uCHs5JnD_kVVZrBB73Z9-zqLPyA@mail.gmail.com>
Hi Steve,

Sorry for slow follow-up to this. Glad the feedback is useful. A few
comments in-line:

On 6 July 2012 19:24, Steve K Speicher <sspeiche@us.ibm.com> wrote:
> ...
> Leigh Dodds <ld@talis.com> wrote on 06/14/2012 05:37:31 PM:
> ...
>> 4.1.7 and 4.1.10
>>
>> These rules both require a minimum amount of data that must be
>> provided for a BPR, specifically at least one rdf:type and at least
>> one relationship to another resource. Both of these are expressed as a
>> MUST which I think is far too strong. I might wish to initially
>> capture some information about a resource, e.g. some simple literal
>> values, and then progressively enrich [1] that description to add type
>> and relationship triples. For example one common Linked Data
>> publishing approach is to RDFize some data and then enrich it with
>> additional links. These MUST requirements preclude that kind of usage.
>>
>> I understand the utility of having some minimal information to allow
>> easier processing and navigation but think this is a SHOULD, not a
>> MUST.
>>
>
> I believe that 4.1.10 is written in a way that is not that clear.  It is
> not to say that at least one relationship MUST be present, it is meant to
> say that when it is present, it MUST be modeled as a simple triple.
> I can see the point on 4.1.7, relaxing the rdf:type requirement but feel
> that it weakens the profile a bit.

It'd be good to get that wording cleared up. It was confusing for me at least :)

I take your point about the rdf:type requirement. I'm thinking about
scenarios where I'm steadily building up a dataset and may not have
all of the data to hand initially. But this may not be a common
scenario.

>> I'd also like to suggest that every resource SHOULD have a label
>> property [2].
>>
>
> Seems like a good suggestion.

Thanks.

>> 4.1.9.
>>
>> I don't understand why servers MUST only use these datatypes. Granted,
>> SPARQL only supports a subset of the XSD datatypes, but I don't think
>> that necessarily impacts our use of datatypes in general. There are
>> lots of examples of alternate XSD datatypes and custom datatypes in
>> use in published Linked Data & RDF, so I'm curious as to why this
>> extensibility should be removed.
>>
>> Personally I'd prefer to see a recommended "working set" of data
>> types, but with guidance on the trade-offs of using additional and/or
>> custom types. It would be useful to survey actual usage to see if
>> there is already convergence on a common set.
>>
>
> I think the primary scenario is that custom datatypes just limit what
> clients can do with the values.  By limiting them, it enables a broader
> class of client applications that can read, update, compare, etc these
> values.

I understand the general aim, as clients do have more chance of
working with data if they can understand it. One might argue that this
applies to schema terms just as well as to datatypes. For schema
you've encouraged some best practices and convergence on standard
terms. The same approach could be applied to datatypes.

There's a matter of degree here too. Truly custom datatypes are
unlikely to be interoperable: there still isn't a well defined recipe
for defining them. However the XML schema datatypes are all
well-defined, if not always widely supported. As I pointed out, I
think more of them are in common use than the subset recommended in
the profile.

In this kind of standardisation effort I think it's worth surveying
usage to determine current practice before deciding on the best route
forward.

The profile currently starts from a very prescriptive position, but I'm
not convinced that's necessary. An alternative approach would be to
identify a common set of datatypes to encourage further convergence,
with explanations of trade-offs.

>
>> 4.1.13.
>>
>> Again, I'm hesitant about the use of MUST here. I think proper use of
>> ETags is essential and a server should go to lengths to provide ETags.
>> But creating valid "deep ETags" could place additional burden on a
>> server.
>>
>> For example I notice that the profile says nothing about what
>> information is provided when one de-references a BPR. If I were to
>> provide a Symmetric Bounded Description, e.g. to facilitate browsing,
>> then I will need to generate an ETag based on the state of a number of
>> resources.
>>
>> Clearly an implementation can use coarse-grained ETags (e.g. based on
>> dataset modification), but that's less useful. Perhaps the Working
>> Group should consider some guidance on ETag generation, and the
>> trade-offs of not supporting them (e.g. inability to do conditional
>> PUT).
>>
>
> Good points, we don't want to overburden servers but we want them to do
> the right thing.  Seems like a good one to spend a little more time
> investigating, getting feedback from others in the WG, and crafting
> the right set of guidance around ETags.

OK, great.
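
To make the trade-off concrete, here is a rough sketch in Python of the
two options I had in mind. It is illustrative only and not taken from the
submission: the function names, the version counter, and the
representation of triples as plain tuples of strings are all my
assumptions.

```python
import hashlib

def coarse_etag(dataset_version):
    # Hypothetical coarse-grained tag: a single counter for the whole
    # dataset, so it changes on every write, relevant or not.
    return '"ds-%d"' % dataset_version

def deep_etag(triples):
    # Hypothetical "deep" tag: hash every triple that appears in the
    # returned description, in a stable order, so the tag changes only
    # when the description itself changes.
    h = hashlib.sha256()
    for s, p, o in sorted(triples):
        h.update(("%s %s %s .\n" % (s, p, o)).encode("utf-8"))
    return '"%s"' % h.hexdigest()[:16]
```

For a Symmetric Bounded Description the list passed to deep_etag would
include triples whose subject is some other resource, which is where the
extra burden on the server comes from.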

>> 4.4.1
>>
>> The rule doesn't note which media types must be supported for a PUT.
>> I'd suggest that the rule should be that a server MUST support a PUT
>> using any of the RDF serialisations it supports via a GET. So
>> application/rdf+xml and possibly text/turtle.
>>
>> It might also be useful to reference use of OPTIONS requests to
>> advertise PUT, PATCH support; use of Accept-Patch, etc. A section on
>> OPTIONS might be useful to add in general.
>>
>
> Good suggestions as well.  I wonder why OPTIONS is needed if we already
> have HEAD + Allow, what additional data would be needed in a specialized
> OPTIONS response?

Yes, I think support for OPTIONS is needed even if HEAD + Allow is
possible. A generic HTTP client may well issue an OPTIONS request, so
ensuring that is implemented by servers is a good thing, IMO.

One deployed use of OPTIONS is in CORS. User agents may issue
pre-flight requests using OPTIONS. This would require servers to
return additional CORS specific headers.
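
As a rough illustration of what that might look like on the server side,
here is a small Python sketch. The function name, the dict-based header
handling, the choice of application/sparql-update for Accept-Patch, and
the policy of echoing back any Origin are all assumptions on my part, not
anything the profile specifies.

```python
def options_headers(allowed_methods, request_headers):
    # Plain OPTIONS: advertise the methods this resource supports.
    headers = {"Allow": ", ".join(allowed_methods)}
    if "PATCH" in allowed_methods:
        # Accept-Patch comes from RFC 5789; the media type is an assumption.
        headers["Accept-Patch"] = "application/sparql-update"
    origin = request_headers.get("Origin")
    requested = request_headers.get("Access-Control-Request-Method")
    if origin and requested:
        # CORS pre-flight: the browser asks before sending the real request.
        # Echoing the Origin is a deliberately permissive, assumed policy.
        headers["Access-Control-Allow-Origin"] = origin
        headers["Access-Control-Allow-Methods"] = ", ".join(allowed_methods)
    return headers
```

A generic client gets Allow (and Accept-Patch where relevant), while a
browser pre-flight additionally gets the CORS headers it needs.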

>> 4.4.7.
>>
>> I wonder if it would be useful to consider how these additional
>> constraints could be advertised or discovered by clients?
>>
>
> You fell for our trap.  That is currently left out of the member
> submission but covered by the charter.  We started specifying something
> but decided to defer it until we got a basic set of rules in place. We
> have some high-level definition for how these constraints are specified
> [1].  It reminds me I need to update this article to make the submission.

Heh. Will look forward to seeing that.

>> 4.8.
>>
>> I am uneasy to see recommendations about ranges of properties that are
>> at odds with their official definition. Encouraging re-use is good
>> practice, but redefining existing properties is not.
>>
>> Also, suggesting that dct:title & dct:description should only refer to
>> an XML Literal flies in the face of common usage AFAICT.
>>
>> 4.8.3.
>>
>> I'm confused by the recommendation for use of rdfs:label only in
>> vocabulary documents. rdfs:label is used very, very commonly as a
>> generic labelling property so already has well-deployed use outside of
>> vocabularies. Clearly there are alternates (e.g. skos:prefLabel) but
>> rdfs:label is a useful fall-back that is already understood by many
>> Linked Data clients.
>>
>
> We were trying to stay true to the intended and defined usage [2], which
> states:
> "rdfs:label is an instance of rdf:Property that may be used to provide a
> human-readable version of a resource's name."
> and initially it states:
> "This specification describes how to use RDF to describe RDF vocabularies.
> This specification defines a vocabulary for this purpose..."
>
> Which reads to us that if rdfs:label is used, then it is describing the
> name of a resource that is to be used in a vocabulary.
>
> Though if rdfs:label is used in practice as just "a label on the
> resource", perhaps the specification should be updated to match
> usage.

Yes, rdfs:label is widely used as "just a label". See:

http://www.aifb.kit.edu/images/c/c0/LabelsInTheWebOfData.pdf

Usage has evolved from the definition. I agree that the specification
ought to be updated. One for the RDF WG, I think.

However the re-definition of dct:title & dct:description still doesn't
seem correct.

>> Secondly, I don't understand the recommendation for the range of
>> rdfs:label to be a Resource? Is that a typo?
>>
>
> It is a typo.
>
>> 5.1
>>
>> I don't follow this rationale for having separate container resources:
>>
>> "You might wonder why we didn’t just make
>> http://example.org/netWorth/nw1 a container and POST the new asset
>> directly there. That would be a fine design if
>> http://example.org/netWorth/nw1 had only assets, but if it has
>> separate predicates for assets and liabilities, that design will not
>> work because it is unspecified to which predicate the POST should add
>> a membership triple."
>>
>> Wouldn't a POST of:
>>
>> <http://example.org/netWorth/nw1> o:asset
>> <http://example.org/netWorth/nw1/assetContainer/a3>;
>>
>> ....communicate the necessary information? In the case of server side
>> URI assignment a blank node could be used for the new asset. What am I
>> missing?
>>
>
> A POST to what URL?  I also don't follow the usage of a blank node for
> this.  Perhaps we need a separate thread on this.

OK. I meant a POST to: http://example.org/netWorth/nw1

>> I'm a little confused about how to go about updating resources that
>> are associated with a BPC. To test my understanding and taking an
>> example from the specification:
>>
>> <http://example.org/netWorth/nw1/assetContainer>
>>    a bp:Container;
>>    bp:membershipSubject <http://example.org/netWorth/nw1>;
>>    bp:membershipPredicate o:asset.
>>
>> <http://example.org/netWorth/nw1>
>>    a o:NetWorth;
>>    o:asset
>>       <http://example.org/netWorth/nw1/assetContainer/a1>,
>>       <http://example.org/netWorth/nw1/assetContainer/a2>.
>>
>> a. If I want to update the rdf:type of
>> <http://example.org/netWorth/nw1>, do I just PUT an updated
>> description to its URI?
>>
>
> Yes

OK.

>> b. If I want to add a new o:asset relationship for that resource,
>> then I must POST to <http://example.org/netWorth/nw1/assetContainer>?
>> If so, how do I determine that, via a SPARQL query?
>>
> Determine the container URL?  From the fact that the subject URL type is
> of bp:Container.

Yes, how do I determine the container URL. For example, if I have a
description of http://example.org/netWorth/nw1, e.g. by de-referencing
its URI, then I don't know what the container is, as there is no
relationship from the resource to its container.

I would need some other method to determine that relationship and find
the container URL. A POST to the resource URL would be simpler.
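
A small Python sketch of the lookup problem, using the example triples
from the specification. Triples are plain tuples of prefixed names here,
and the function name and the ex: prefix are my own shorthand, so treat
this as an illustration rather than anything the profile defines.

```python
def containers_for(triples, resource, predicate):
    # Subjects that name `resource` as their membership subject...
    subjects = {s for s, p, o in triples
                if p == "bp:membershipSubject" and o == resource}
    # ...and that also name `predicate` as their membership predicate.
    return sorted(s for s, p, o in triples
                  if s in subjects
                  and p == "bp:membershipPredicate" and o == predicate)

# What I get by de-referencing nw1: nothing points at the container.
resource_description = [
    ("ex:nw1", "rdf:type", "o:NetWorth"),
    ("ex:nw1", "o:asset", "ex:nw1/assetContainer/a1"),
    ("ex:nw1", "o:asset", "ex:nw1/assetContainer/a2"),
]
# The container's own description, which I would have to obtain some other way.
container_description = [
    ("ex:nw1/assetContainer", "rdf:type", "bp:Container"),
    ("ex:nw1/assetContainer", "bp:membershipSubject", "ex:nw1"),
    ("ex:nw1/assetContainer", "bp:membershipPredicate", "o:asset"),
]
```

Running containers_for over resource_description alone finds nothing; the
container only turns up once container_description is available as well,
e.g. from a query over the whole dataset.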

>> c. What if I just PUT an updated description to
>> <http://example.org/netWorth/nw1> that adds or removes o:asset
>> relationships?
>>
>
> PUTing would replace the resource defined at that Request-URI with the
> representation in the request.  So if the representation has added/removed
> relationships and the server supported PUT to update, then I believe it
> would update it as you describe.

OK. If I can do that, then I'm really not understanding the role of
the container resources :/

>> d. What if I first DELETE
>> <http://example.org/netWorth/nw1/assetContainer> and then want to
>> update its o:asset relationships?
>>
>
> Perhaps we'd want to say more about what happens when deleting containers.
> I would expect that most implementations that processed this request
> would also remove the relationships.

OK.

>> e. If I want to add a dct:description property to
>> <http://example.org/netWorth/nw1/assetContainer> then do I PUT to
>> <http://example.org/netWorth/nw1/assetContainer?non-member-properties>
>> or can I just PUT to <http://example.org/netWorth/nw1/assetContainer>
>> ?
>>
>
> Using
> <http://example.org/netWorth/nw1/assetContainer?non-member-properties>
> as the request-URI to add the dct:description allows you to do it
> without having to preserve the members on update.
>
>> 5.1.2
>>
>> Retrieving non member properties might be better described as another
>> example of a Bounded Description [3]. In this case it is a subset of a
>> Concise Bounded Description of a resource. A Linked Data server might
>> want to support clients in requesting different bounded descriptions,
>> beyond that of containers.
>>
>
> Been a while since I looked at this.  We looked at a number of approaches
> to do this and I should probably surface this in the discussion.  We can
> include the Bounded Description with that evaluation.

That would be useful, thanks.

L.

-- 
Leigh Dodds
CTO, Talis Systems Ltd
Mobile: 07850 928381
http://talis.com
http://kasabi.com

Talis Systems Ltd
The Exchange
19 Newhall Street
Birmingham
B3 3PJ
Received on Wednesday, 11 July 2012 12:23:08 UTC