RE: Brainstorming: Key Issues from Young,Jeff (OR) on 2011-02-28 (public-xg-lld@w3.org from February 2011)

From: Young,Jeff (OR) <jyoung@oclc.org>
Date: Sun, 27 Feb 2011 23:37:19 -0500
To: <gordon@gordondunsire.com>, "Emmanuelle Bermes" <manue@figoblog.org>
Cc: <public-xg-lld@w3.org>, "Ed Summers" <ehs@pobox.com>
Message-ID: <52E301F960B30049ADEFBCCF1CCAEF590B979E92@OAEXCH4SERVER.oa.oclc.org>
This raises a few thoughts in my mind:

 

1)      I’m a little uncomfortable with the wording in Gordon’s 1st sentence. As an API, HTTP doesn’t provide create, read, update, and delete (CRUD) operations for individual triples, but it does provide those mechanisms for individual resources. This implies that a CBD representation rather than individual triples will be the most natural unit of maintenance on the Web (at least for the foreseeable future).

2)      I think the various levels of description Gordon mentions can still be relevant in Linked Data, but a resource-oriented perspective implies they should map to separate Web document URIs. In theory, each URI could support CRUD operations relevant for its designated level of detail. This raises the question of which *one* of these Web document representations would be “best” associated with the real world object URI in a “Cool URIs for the Semantic Web” way. One solution would be for each Web document to have its own real world object URI that gets reconciled to the others using owl:sameAs. We tend to believe that alias identifiers are evil, but maybe it would be better to believe they are an opportunity for added functionality.

3)      I wouldn’t be disappointed if an HTTP representation of a Linked Data resource delivered more than a minimal CBD representation. This extended information *could* be constrained by an Application Profile (AP), but I’m not convinced it’s important. Downloading complete datasets one-HTTP-request-at-a-time isn’t scalable for large datasets. If the RDF dataset is available in a bulk form, what’s the harm in delivering an HTTP representation that extends beyond CBD? For example, I can imagine an HTTP response for a single FRBR work that includes LCSH prefLabels for associated LCSH terms, but these can and should be stripped out of the bulk distribution to avoid undesirable side-effects.
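
To make the CBD unit in points 1 and 3 concrete, here is a minimal Python sketch. It assumes triples are plain tuples and that blank nodes are strings starting with "_:"; the function names, prefixes, and sample data are illustrative assumptions rather than anything taken from the CBD submission, and reified statements are left out.

```python
def is_blank(node):
    """Treat strings starting with '_:' as blank nodes (a convention assumed here)."""
    return isinstance(node, str) and node.startswith("_:")


def concise_bounded_description(triples, start):
    """Collect every triple whose subject is `start`, then follow blank-node
    objects recursively. Reified statements are ignored in this sketch."""
    by_subject = {}
    for s, p, o in triples:
        by_subject.setdefault(s, []).append((s, p, o))

    description = []
    seen = set()
    pending = [start]
    while pending:
        subject = pending.pop()
        if subject in seen:
            continue
        seen.add(subject)
        for triple in by_subject.get(subject, []):
            description.append(triple)
            # Only blank-node objects are expanded; named resources are not.
            if is_blank(triple[2]):
                pending.append(triple[2])
    return description


# Hypothetical data: a work, a blank-node note, and an LCSH term with a label.
TRIPLES = [
    ("ex:work1", "dc:title", "Moby Dick"),
    ("ex:work1", "dc:subject", "lcsh:whaling"),
    ("ex:work1", "ex:note", "_:n1"),
    ("_:n1", "rdfs:label", "First edition"),
    ("lcsh:whaling", "skos:prefLabel", "Whaling"),
]

cbd = concise_bounded_description(TRIPLES, "ex:work1")
```

In this sketch the prefLabel triple for the LCSH term falls outside the CBD of ex:work1, so an HTTP response could append it as the kind of extension described in point 3, while a bulk distribution would leave it out.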

 

Jeff

 

From: gordon@gordondunsire.com [mailto:gordon@gordondunsire.com] 
Sent: Sunday, February 27, 2011 6:15 AM
To: Emmanuelle Bermes; Young,Jeff (OR)
Cc: public-xg-lld@w3.org; Ed Summers
Subject: Re: Brainstorming: Key Issues

 

All

 

I think the focus of library metadata maintenance will shift from record to triple (that is, from a set of statements about a bibliographic entity to a single statement/triple with that entity as subject).

 

But while a triple is great for linking linked data, an isolated triple is not that useful for consumption by human agents. Libraries and their users are somewhat familiar with the idea of "levels of description", equivalent to sets of descriptive attributes that increase in size/coverage. AACR2 notes three such levels, with some indication of which is appropriate for different kinds of libraries. For example, the third level of description includes all attributes relevant to the resource described. This level is used by national agencies, national bibliographies, etc. The first level of description only includes a basic sub-set of all possible attributes, and is suitable for brief record displays, etc.

 

Furthermore, library catalogues currently display different sets of attributes at different points of the resource discovery process: the so-called author/title list as the result of a search, then the "standard" record display for a selected resource, often with an option to display the "full" record if the user wants to see it.

 

So there never has been a fixed "record" in Libraryland.

 

What is required is guidance/profiles/etc. for assembling different sets of triples with the same subject for specific purposes. Most of the cataloguing standards in wide-spread use provide some form of guidance (and profiles in terms of attribute sets). A set of triples would, of course, be specified by the predicates. And presumably the objects would be "de-referenced" to literal values.
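
As a rough illustration of assembling different sets of triples for the same subject, the Python sketch below treats a profile as nothing more than an ordered list of predicates. The profile names, prefixes, sample data, and the `LABELS` lookup standing in for "de-referencing" are all assumptions made for this example, not part of any cataloguing standard or AP.

```python
# Hypothetical triples about one bibliographic entity.
TRIPLES = [
    ("ex:book1", "dc:title", "Moby Dick"),
    ("ex:book1", "dc:creator", "ex:melville"),
    ("ex:book1", "dc:date", "1851"),
    ("ex:book1", "dc:extent", "635 p."),
    ("ex:other", "dc:title", "Typee"),
]

# Stand-in for de-referencing an object URI to a literal value.
LABELS = {"ex:melville": "Melville, Herman"}

# Each profile is an ordered list of predicates (assumed names).
BRIEF = ["dc:title", "dc:creator"]
FULL = ["dc:title", "dc:creator", "dc:date", "dc:extent"]


def assemble(triples, subject, profile, labels):
    """Return (predicate, value) pairs for `subject`, keeping only the
    predicates named in `profile`, in the profile's order."""
    selected = []
    for predicate in profile:
        for s, p, o in triples:
            if s == subject and p == predicate:
                # Swap a URI object for its label when one is known.
                selected.append((p, labels.get(o, o)))
    return selected


brief_display = assemble(TRIPLES, "ex:book1", BRIEF, LABELS)
full_display = assemble(TRIPLES, "ex:book1", FULL, LABELS)
```

The same triples yield a brief display or a full one depending only on which predicate list is passed in, which is roughly what a level of description amounts to here.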

 

Application Profiles are probably relevant; ISBD is developing an AP which lays out the sequence of properties, aggregations of sub-sets, mandatory status, and repeatability status. Different APs can be developed for circumstances where mandatory status, etc. changes in different contexts.

 

The proposal for Concise Bounded Description (CBD) [1] is also relevant, I think, although it is intended for consumption by semantic/software agents. See the use case on Migrating library legacy data [2] for a very brief discussion.

 

I agree with Jeff, in the sense that there will be many, many more AP/CBD profiles than current record "formats", and they will be referred to by a range of terms and not just "record" (only the triple will be a fixed conceptual unit). I also think Emmanuelle's comments are very important - human agents will always conflate surrogate with real-world object.

 

Cheers

 

Gordon

 

[1] http://www.w3.org/Submission/CBD/


[2] http://www.w3.org/2005/Incubator/lld/wiki/Use_Case_Migrating_Library_Legacy_Data


 

 

 

On 25 February 2011 at 07:46 Emmanuelle Bermes <manue@figoblog.org> wrote:

> I would like to tell a story about that "surrogate" term (Antoine may
> recall too...)
>
> A few years ago, when we were first brainstorming about Europeana, we
> decidedly stated that Europeana would not be another library
> catalogue, nor another portal. We wanted to do something "more" with
> the data, we wanted to be able to align our descriptions of objects
> (which wouldn't be records) with a semantic layer describing "real
> things" : works, creators, events, etc.
> (All this may seem really familiar to you all, but that was 4 years
> ago, and quite new at the time.)
>
> So, we came up with the idea of "surrogate". The surrogate was
> something that was meant to express that Europeana was not hosting
> digital objects themselves, but a representation of them, and this
> representation had to be something more than just a record.
>
> 2 years later, the term surrogate failed and we gave it up. Why ?
> - because the surrogate was initially meant to be conceptual, but
> people kept trying to instantiate it and name it in the data, which
> led to confusion
> - because "surrogate" is a term that has no satisfying translation in
> some languages (including French) and thus corresponds to no
> ready-made reality for (at least some) non-native English speakers
> Maybe there were other reasons that I don't remember.
>
> I know that the world has changed a lot in the meantime, now we have
> Linked Data, and a great deal of thoughts on resources and their
> representations ([1] and its great summary at [2] ;-). But if we are
> to choose "the" word that will make the shift from the record to the
> graph, I would avoid surrogate.
>
> Emmanuelle
>
> [1] http://www.w3.org/TR/cooluris/#distinguishing
> [2] http://q6.oclc.org/2009/03/linked_data_a_l.html
>
>
>
> On Thu, Feb 24, 2011 at 5:39 PM, Young,Jeff (OR) <jyoung@oclc.org> wrote:
> > I think our notion of "surrogate" is destined to change from "record" to
> > "concept". I suspect it will be a quiet revolution analogous to how our
> > notion of LCCN changed over the years from "card number" to "control
> > number" and now (for all intents and purposes) to "concept number".
> >
> > Jeff
> >
> >> -----Original Message-----
> >> From: public-xg-lld-request@w3.org [mailto:public-xg-lld-
> >> request@w3.org] On Behalf Of Ed Summers
> >> Sent: Thursday, February 24, 2011 10:25 AM
> >> To: public-xg-lld@w3.org
> >> Subject: Re: Brainstorming: Key Issues
> >>
> >> There has been some really good content in this thread so far. I
> >> really liked the point that Antoine and Jeff identified regarding what
> >> pre-web libraries have traditionally called "surrogates" and the need
> >> for such a notion on the web--in particular in the Linked Data space.
> >> It is an extremely important point which will largely affect how well
> >> library data will fit in with the Linked Data community, and the Web
> >> in general.
> >>
> >> I think this very specific point ripples out quite a bit, into how
> >> vocabularies are used to describe library materials. Perhaps it is too
> >> ambitious but I would like the final report to make recommendations
> >> about what vocabularies are useful for making library linked data
> >> available, and to identify places where new vocabulary is needed.
> >>
> >> Kevin and Emmanuelle's point about needing to come up with a
> >> compelling elevator pitch is also extremely important. I would like to
> >> see some pretty clear language in the report describing a) why library
> >> system developers might want to consider using Linked Data, and b) why
> >> library professionals should make Linked Data support a requirement
> >> when purchasing or developing systems.
> >>
> >> //Ed
> >>
> >
> >
> >
> >
>
Received on Monday, 28 February 2011 04:38:19 UTC