Fwd: W3C Library Linked Data XG - request for review from Antoine Isaac on 2011-07-30 (public-lld@w3.org from July 2011)

From: Antoine Isaac <aisaac@few.vu.nl>
Date: Sat, 30 Jul 2011 10:11:36 +0200
To: public-lld <public-lld@w3.org>, jar@creativecommons.org
Message-ID: <4E33BCB8.3070406@few.vu.nl>

Some additional comments on the report.

Thanks a lot for this, Jonathan, that's much appreciated. We'll come back to you on this in the coming weeks!

Antoine


---------- Forwarded message ----------
From: Jonathan Rees <jar@creativecommons.org>
Date: Fri, Jul 29, 2011 at 4:10 PM
Subject: Re: W3C Library Linked Data XG - request for review
To: Thomas Baker <tom@tombaker.org>

Some thoughts on the report (which you can relay)

- It's not clear who the audience is (I realize this was a TBD). It
reads a bit as if it's the XG talking to those who are already sold on
the idea of LD for libraries. The report *I* would like to see is more
ambitious - it would be written for conservative library IT directors
who may be skeptical of linked data.

- The 'scope' section is not related to benefits, so it should be a
separate section.

- Every benefit of *any* technology comes with some cost. So the
'benefits' section IMO ought to be a 'value proposition' section -
what benefits do you get, at what cost. I know this is covered later,
so maybe 'benefits' should be shortened and turned into an
introduction, with a promise of more depth later on.

- The benefits section is too gung-ho for my taste, e.g. the use of
the word 'significant' with no justification. It also does not provide
a comparison or null model. If you do LD you're not doing something
else. Are the alternatives (including doing nothing, using XML or JSON
or SOAP, etc.) just as good? Why not?

- You could do some of the cheerleading by reference to other
documents that promote LD, thus saving space here

- The whole social issue is ignored in the benefits section. Using
someone else's linked data is a liability (cost): they might go
offline, or they might decide to change the format. Unless you have a
contract of some kind there is no protection against these. Similarly,
if you make a promise to the community around uptime, updates, or
stability, that is a liability to your organization. So LD in the long
run, as part of infrastructure design, can have a huge cost due to its
social fragility. To recover this cost requires fame or thank-you
notes or citations that you can bring to your trustees, or fee for
service.

Cost of coordination is also ignored. Sure, one can unilaterally do
RDF design and publish, but this can lead to lost opportunities if
it's not exactly what a partner needs, or if it's incompatible with
information coming from a similar but independent source. Without
coordination, RDF really gives little benefit over XML.

These are things that library administrators need to know. By being
skeptical and transparent yourself, you'll gain credibility.

On the other hand, LD for one-off projects is a great thing.

- what does 'a.o.' stand for?

- Typo: 'an sign' should be 'a sign'

- Section on bulk access should mention the value of doing joins,
which only work well when you've done a bulk load into some kind of
query engine (triple store, etc.).
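
To illustrate the point about joins, here is a minimal sketch in plain
Python, not tied to any particular triple store. The triples, the
prefixes, and the function names (bulk_load, creator_names) are all
made up for illustration; a real deployment would presumably use a
proper query engine. The join only becomes possible once both sources
sit in the same index.

```python
from collections import defaultdict

# Hypothetical triples from two independent sources (all URIs are made up).
LIBRARY_A = [
    ("ex:book1", "dc:creator", "ex:author1"),
    ("ex:book2", "dc:creator", "ex:author2"),
]
AUTHORITY_B = [
    ("ex:author1", "foaf:name", "Jane Doe"),
    ("ex:author2", "foaf:name", "John Roe"),
]


def bulk_load(*sources):
    """Load every triple into one index keyed by (subject, predicate)."""
    index = defaultdict(list)
    for source in sources:
        for subject, predicate, obj in source:
            index[(subject, predicate)].append(obj)
    return index


def creator_names(index, book):
    """Join book -> creator -> name across the merged data."""
    return [
        name
        for author in index.get((book, "dc:creator"), [])
        for name in index.get((author, "foaf:name"), [])
    ]


store = bulk_load(LIBRARY_A, AUTHORITY_B)
print(creator_names(store, "ex:book1"))
```

Loading only one of the two sources leaves the same lookup empty, which
is the practical argument for bulk access.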

- In 'front ends' where XSLT is mentioned you might want to mention
GRDDL (although I don't know whether it's used)

- I'm not sure the microdata discussion is in scope, esp. given that
microdata is nowhere close to Rec status... there's nothing here
specific to libraries

- need reference for 'resource oriented architecture'

- Is Drupal your only example of a CMS? It would be nice to at least
mention a second one.

- Re 'web services for LLD', why would you *want* to refactor API
capabilities using the LD stack?  You're assuming too much knowledge
on the part of the reader

- The whole section 'implementation challenges and barriers to
adoption' comes off as a criticism of library culture: "resists
change"... "out of step"... "understaffed"... "do not adapt".  This is
unnecessary and seems like biting the hand that feeds you. It's
possibly even alienating and counterproductive. Yes, we know that
libraries are endangered, and that this is partly because they haven't
found their place on the web, partly because they don't innovate, and
so on. But libraries have good reasons to be conservative and these
have to be respected. You're proposing an innovation that *has a cost*
that you don't acknowledge. You have to show that compared to the null
hypothesis (status quo) the benefit will outweigh the cost.

So please rephrase in positive terms. LD "can reduce the cost of
innovation"... "create new economies"... "help make better use of
scarce staff"... "help libraries take advantage of new technological
opportunities"... and so on. This section should again be about costs
and benefits: what will libraries need to do, *if* they choose to go
down this path. They would need to broaden the set of vendors they
work with, train staff, interact with other communities, etc. Costs +
benefits in each case. Overall, the goal is to help to enable informed
decisions.

If LLD is worthwhile, then ways will be found to use it. If not, then
it *shouldn't* be adopted.

How this is written depends of course on the audience. If the audience
is people who have already decided they want to do LD, and the idea is
to help them push it through their organizations, that's a different
pitch. But as I said, what I think you want is a document aimed at the
skeptical but open-minded reader.

- 'web communities' - I don't think there's any such thing; please specify

- The discussion of bottom-up standards needs some explanation.
Definition and examples. I don't really understand why you'd say HTML5
isn't bottom-up; most of what's in it has originated with some single
browser ("bottom") and then gone "up". I think you're trying to say
that bottom-up is the norm for RDF (with a few exceptions such as
RDFS), for some definition of bottom-up.

- "Library standards are limited" - examples would help people like me
who are not immersed.

- ROI - you're trying to talk about costs of the status quo separately
from its benefit and from the costs and benefits of LD. I think this
will in the end mean you are talking about the same issues in multiple
separated sections of the document. I would prefer a structure more
like the following: for each issue, explain what problem is to be
solved, how it is solved by the status quo (how successfully and at
what cost), and how it is solved using linked data (how successfully
and at what cost).

- Re data rights issues, these are important enough that I think they
should be summarized in your report, even if you have a citation to a
more comprehensive document. (Factual information is not protected by
copyright in the US, sui generis database rights apply in Europe, etc.)

- "Cultivate an ethos of innovation" - again you're somehow assuming
that an organization's scarce innovation dollars should go to LD
instead of to something else. That this argument has to be made
should be admitted.

- "Assign unique identifiers" - you're glossing over the issue of cost and
responsibility here. We tried to address this in life sciences with
the shared names project, which has yet to kick in due to lack of
attention and funding. This is really hard because maintenance of URIs
in perpetuity is both difficult to understand and a hot potato.

- The problem of domain name loss and/or loss of service for a domain
name is touched on, and that's good. Backup copies are great, but then
tools need to be able to get at them after the primary is lost. It's
probably too early to specify just how this should happen (XML
catalogs? Memento?) but the problem needs to be acknowledged as
something we'll have to face (and pay for) in the future.

Jonathan


On Tue, Jun 7, 2011 at 4:28 AM, Thomas Baker <tom@tombaker.org> wrote:
> Dear Jonathan,
>
> We are contacting you as a member of the W3C Library Linked Data Incubator
> Group (LLD XG) to request your help in reviewing the draft deliverables of the
> XG, which need to be published in final form by the end of August.
>
> Specifically, we would appreciate it if you could read a three-page section of the
> report called "Benefits of the Linked Data Approach" [1].  Getting the
> arguments in this section of the report right will be key to the success of the
> deliverable as a whole.  You can see the section in the context of the draft
> report as a whole at [2].
>
> It would be very helpful to the work of the group if you could post comments to
> the public-lld mailing list [3] -- if possible by Monday, 20 June.  If you are
> able to do this, please drop us a line so that we can plan accordingly (or find
> an alternative reviewer).  If you have additional or alternative reviewers to
> suggest, please let us know!
>
> We want to ensure that the deliverables reflect the rough consensus of the XG
> members as a whole, so any input you could offer at this stage would be most
> valuable.
>
> Many thanks and best regards,
>
> Tom Baker, Antoine Isaac, Emmanuelle Bermes
> Co-Chairs, W3C Library Linked Data Incubator Group
>
> [1] http://www.w3.org/2005/Incubator/lld/wiki/Benefits#Benefits_of_the_Linked_Data_approach
> [2] http://www.w3.org/2005/Incubator/lld/wiki/DraftReportWithTransclusion#Benefits
> [3] http://lists.w3.org/Archives/Public/public-lld/
>
> --
> Tom Baker <tom@tombaker.org>
>
Received on Saturday, 30 July 2011 08:10:40 UTC