InformationResources, FRBR and googling towards a literature review ( was Re: More on distinguishing information resources from other resources) from Dan Brickley on 2005-06-28 (www-tag@w3.org from June 2005)

From: Dan Brickley <danbri@w3.org>
Date: Wed, 29 Jun 2005 00:08:21 +0100
To: Dan Connolly <connolly@w3.org>
Cc: Jonathan Borden <jonathan@openhealth.org>, "'Henry S. Thompson'" <ht@inf.ed.ac.uk>, 'www-tag' <www-tag@w3.org>
Message-ID: <42C1D865.9010003@w3.org>
(sorry this is a bit long; I've been meaning to
dig up these references for a while...)

Dan Connolly wrote:

>On Tue, 2005-06-28 at 17:53 -0400, Jonathan Borden wrote:
>  
>
>>Dan,
>>
>>    
>>
>>>I think it's plain that Mark is not an information resource, 
>>>so there's something of a contradiction, or at least a potential
>>>contradiction, here.
>>
>>I expect that if we outfitted Mark with a heads up display and keyboard that
>>allowed him to view HTTP GET requests to http://www.markbaker.ca/ that he
>>would be perfectly capable as acting as a standards compliant, if not a tad
>>slow, HTTP 1.1 server. In that case he *would* in fact be an information
>>resource, no? ***
>>    
>>
>
>No. He's made of atoms, not (just) bits.
>
>I think the definition is pretty clear
>http://www.w3.org/TR/2004/REC-webarch-20041215/#def-information-resource
>
>Perhaps some of the points I made leading up to that definition
>are helpful...
>
>  
>
yes, thanks.

>[[
>Dan suggested that a textual work can be consumed over the web in a
>way that a table cannot; if you see a table and a movie in a product
>catalog, while you can learn about the table using HTTP, you can never
>consume it to the point where you owe the vendor the price in the
>catalog, while with information resources, you can consume them to the
>point where you owe the price just by observing representations.
>]]
> -- http://www.w3.org/2001/tag/2004/10/05-07-tag#infores2
>  
>

"Consume" is a metaphor that draws on our
knowledge of eating. I guess the core idea is
of using up something that can't be replaced
with an equivalent without expense? While looking
at pictures of a table doesn't use up the table
owner's resources, there are (controversially but
undeniably) a lot of sex industry Web services where
"real world" resources are "consumed" via HTTP
interactions (pay-for live video streams, etc.). Is a
(webcast) 1:1 strip-show an "information resource"?
Or, up a level, what do we stand to gain by coming up with a definition
that decides this one way or the other?

While I'm *delighted* that http-range-14 has been
defused, I'm really not yet sure that the class
"information resource" can be uncontroversially
defined without a fair bit of hard work. There's a big
literature around this distinction; e.g. see, in the digital
library world, the debates that spun out of the interaction
between Dublin Core (library) and INDECS (rights holder /
publisher) metadata efforts.
The D-Lib paper at
http://www.dlib.org/dlib/january99/bearman/01bearman.html
was an effort at a hybrid view, as was the Harmony/ABC
work I was involved in with Jane Hunter and Carl Lagoze.
Different metadata communities carve these distinctions
in different ways. If the Web architecture itself is to
embody just one such conceptualisation, we should tread
warily.

A lot of these discussions appealed to variations on a
4-level "Work", "Expression", "Manifestation", "Item"
distinction, based on Tom Delsey and others' work for IFLA
on Functional Requirements for Bibliographic Records (FRBR)
(IFLA = International Federation of Library Associations,
<http://www.ifla.org/>).

see http://www.ifla.org/VII/s13/frbr/frbr.htm and nearby
in google('ifla','frbr') space...
http://www.ifla.org/VII/s13/frbr/frbr.pdf

If the TAG decide to pursue this task, I do recommend
that FRBR gets some serious attention, as it has a lot
of mind-share in the library and digital library world. My
understanding is that FRBR is best thought of as an
attempt to come up with a conceptual model that allows
information systems to be clear about distinctions such as
between different versions of Hamlet, different editions,
different physical books and their location in a library or
who they've been lent to, as well as the larger challenge of
engaging with complex, composite, mixed-media works.
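
As a rough illustration only (this is not FRBR's own notation; the
class names follow the four levels, but every field name and all of
the sample data are made up for this sketch), the Work / Expression /
Manifestation / Item distinction could be modelled as nested Python
records, with a query that walks from a work down to the physical
copies currently on loan:

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Item:
    """A single copy, e.g. one physical book on one shelf."""
    location: str
    lent_to: str = ""  # empty string means the copy is not on loan


@dataclass
class Manifestation:
    """A published embodiment, e.g. one particular edition."""
    edition: str
    items: List[Item] = field(default_factory=list)


@dataclass
class Expression:
    """A specific realisation of a work, e.g. one version of the text."""
    version: str
    manifestations: List[Manifestation] = field(default_factory=list)


@dataclass
class Work:
    """The abstract creation, independent of any version or copy."""
    title: str
    expressions: List[Expression] = field(default_factory=list)


def copies_on_loan(work: Work) -> List[Item]:
    """Walk Work -> Expression -> Manifestation -> Item, keeping lent copies."""
    return [
        item
        for expression in work.expressions
        for manifestation in expression.manifestations
        for item in manifestation.items
        if item.lent_to
    ]


# Hypothetical data, invented purely for illustration.
hamlet = Work("Hamlet", [
    Expression("version A", [
        Manifestation("edition 1", [
            Item("Shelf 3"),
            Item("Shelf 3", lent_to="a borrower"),
        ]),
    ]),
    Expression("version B", [
        Manifestation("edition 2", [Item("Reading room")]),
    ]),
])

for item in copies_on_loan(hamlet):
    print(item.location, "->", item.lent_to)
```

Note that only the bottom level (Item) is something with a shelf
location; the upper levels get progressively more abstract, which is
exactly where deciding what counts as an "information resource"
gets awkward.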

http://www.ifla.org/VII/s13/wgfrbr/wgfrbr.htm suggests
that FRBR is still an active concern, with a recent
workshop, efforts towards an entity-relationship and/or
an OO model, clarification of core concepts based on
deployment and implementation experience, and relationship to
the CIDOC CRM work from the museum world. CIDOC CRM is another,
related, take on this problem space. It has been around for
some years, has an RDF/OWL expression, and deserves some
serious attention by anyone trying to model the patterns of
relationship between "information resources" and the real
world artifacts that they represent or describe.
 
See http://cidoc.ics.forth.gr/
http://cidoc.ics.forth.gr/downloads.html

http://jodi.ecs.soton.ac.uk/Articles/v04/i01/Doerr/
 compares some ideas from Harmony/ABC with CRM. IMHO
CRM was and remains vastly more mature, but the main
point I want to make is that this is an active, and subtle,
area of discussion and debate in the wider metadata
world. And one that connects to real professional practice
(re librarians, see the IFLA links; re museums, look around
near CIDOC). For publishers/rights-holders, Godfrey Rust and
co's work in the INDECS project took the FRBR model
as a starting point, but refined it from the point of view of
parties who are concerned to be paid for the modelled
content (and hence drew some careful and fine distinctions
that folk in the Dublin Core, search and resource discovery
communities were happier to gloss over). I'm not sure
where that work is now being developed, but it's probably
not far from the MPEG scene.


It's been 5 years+ since I was really involved with this
scene, but from 30 minutes googling around, it is clear
that no clear 'winner' has bubbled up from these debates.
Lots of communities have related, but different, ways of
conceptualising 'information resources' and their
relationship to various abstractions such as 'works', 'ideas',
not to mention versioning, and models of real world
artifacts (eg. in a museum collection, or a print run of books,
journals etc).

It's a very interesting space, but one that'd be easy to
disappear into, never to return. I keep coming back to
the question of "how would we know when we've got it
right?".

For further reading, Google has 1000s of hits on 'rdf'
and 'frbr', a sampling includes....

http://netapps.muohio.edu/blogs/darcusb/darcusb/archives/2004/01/24/libdb-bringing-rdf-and-the-frbr-to-the-masses
which reminds me of Morbus' work on an RDF-backed DB
inspired by FRBR,
http://www.libdb.com/
[[
LibDB allows you to /smartly and easily/ catalog your movies, books,
magazines, comics, etc. into your own computerized "personal library".
It is a *free*, *open sourced*, library and asset management system
based on and inspired by the Functional Requirements for Bibliographic
Records (pdf) <http://www.ifla.org/VII/s13/frbr/frbr.pdf>, triples from
the semantic web <http://www.disobey.com/d/2002/sw123/>, and "the
end-user doesn't, and shouldn't, need to know this stuff".
]]

http://www.libdb.com/project_goals has a nice summary
that connects FRBR to the world of RDF tools:
[[

The ideas behind FRBR are a core basis for the design of LibDB, with
specific attention to:

    * *the concepts of /work/, /expression/, /manifestation/, and
      /item/*. This involves the ability to say "I'd like to search for
      /Lord Of The Rings/, but I'm not interested in every version of
      every book published: I'm mainly interested in when it was
      written, and who it was written by. When I view this info,
      certainly give me links to its /manifestations/ (the various
      editions of the book) or related /works/ (the movies, soundtrack,
      or artwork inspired by), but don't let me see one thousand useless
      queries: I care little about the Third Edition put out by A
      Fictional Press for Devoted Members of Abbey Square Apartment 13".
      This also gives the end-user the ability to add their own comments
      about an /item/ in their possession: "This /item/ I own, which is
      a /manifestation/ of the Second Edition book, has a slightly
      coffee-stained cover and is located in Row 3, Box 17 of my Attic."
]]


http://eprints.rclis.org/archive/00000205/ mentions another system,
VisualCat, also using FRBR and RDF.

I'd hope that if 'Information Resource' is to be defined,
it will turn out to be obvious and uncontroversial which
classes in these vocabs are subclasses of InformationResource.
Unfortunately some of them are pretty abstract, so I'm not
sure how that decision would be made.

I don't really have a more concrete point, beyond "this
stuff is hard (but kinda interesting, and related to
real use cases)". And I guess also "a lot of people have
tried to make such distinctions before, and it'd be polite
to take their work into account somehow, since they represent
important constituencies of the Web community".

cheers,

Dan
Received on Tuesday, 28 June 2005 23:08:27 UTC