Re: FRBR and classes ('frbr:Works in the age of mechanical reproduction'...) from Karen Coyle on 2011-03-17 (public-lld@w3.org from March 2011)

From: Karen Coyle <kcoyle@kcoyle.net>
Date: Thu, 17 Mar 2011 10:02:49 -0700
To: Dan Brickley <danbri@danbri.org>
Cc: public-lld <public-lld@w3.org>
Message-ID: <20110317100249.72135ukiz389b3cp@kcoyle.net>
It's hard to respond to such a long post, but I will try to do so briefly.

Clearly, the issue here is not technology but *mission*. The mission  
of the library is not to gather physical things into an inventory, but  
to organize human knowledge that has been very inconveniently  
packaged. While there is a case for modeling the packages as packages  
(for example in warehouses that serve Amazon, or for library  
circulation functions), the library catalog describes the package as a  
secondary aspect (as you note below, Dan). The primary goal is to  
describe what the content of these packages MEANS, in themselves and  
in relation to each other, and over time. Obviously, MEANING in this  
context is a very big word.

That said, it may be time to unbundle the inventory function (which is  
necessary for library management: purchasing, circulation, estimating  
storage needs) from the human knowledge function, and at least allow  
the latter to evolve unfettered by the need to control the packages.  
Perhaps what we need to do with FRBR is to remove the dependency of  
the knowledge function on the physical inventory function, but link  
them for services that intertwine intellectual discovery and item  
delivery. (In fact, today's MARC-based records may do this better than  
FRBR does.)

kc

Quoting Dan Brickley <danbri@danbri.org>:

> (thinking-out-loud alert)
>
> So this is a conversation that resurfaces over the years in various
> ways. My latest prompt being a combination of (i) seeing
> http://www.productontology.org/ which declares OWL DL classes (ie.
> classes of thing, aka types...) for commonly named products, using
> Wikipedia data. The product ontology site uses OWL to describe classes
> of largely mass-produced thing:
>
> "This service provides GoodRelations-compatible OWL DL class
> definitions for ca. 300,000 types of product or services that have an
> entry in the English Wikipedia, e.g.
>
> http://www.productontology.org/doc/Apple
> http://www.productontology.org/doc/Laser_printer
> http://www.productontology.org/doc/Manure_spreader
> http://www.productontology.org/doc/Racing_bicycle
> http://www.productontology.org/doc/Soldering_iron
> http://www.productontology.org/doc/Sweet_potato
>
> Back at DC-2008 in Berlin someone (maybe Karen Coyle or Diane Hillmann)
> mentioned that a difference between libraries and museums is that the
> works collected by the former are mass produced.
>
> I think we can go some way towards webbifying FRBR by pondering that
> observation. I spent Monday and Tuesday with VU.nl colleagues visiting
> the Amsterdam Museum and then the Fab Lab at http://fablab.waag.org/
> which showed some possibilities for taking museum artifacts and
> replicating lossy copies of them (with 3d printers and other
> mechanical reproduction techniques). We could even fabricate moulds
> derived from artifacts that allow others to create new derived
> instances (or their own moulds). Each generation deriving
> characteristics from the previous, and adding in its own flaws and
> innovations.
>
> Looking at the Product Ontology examples above, they work better at
> describing mechanically reproduced, near-identical artifacts -
> Laser_printer, Soldering_iron - than with the natural kinds of thing -
> apple, potato etc. Both apple and sweet potato are halfway to being
> mass nouns --- you might often have need to describe 'some' apple or
> sweet-potato, rather than 'a' sweet potato, although of course you can
> have a specific apple or potato in-hand. Mass production brings with
> it the prospect of thousands of *near*-identical instances of some
> type, as well as associating those with codes and lately URLs that
> link us back to information about the recipe or ingredients list for
> those types of thing. For complex modern mass produced items, if you
> know what kind of item it is, you know a huge amount about that thing
> - whether it is a book or a printer or a soldering iron.
>
> If we forget the library and cultural heritage scene for now, and
> think just about these product types: I have here in my room a
> specific laser printer. It is an HP Laser Smart C4270. Let's say it
> was bought in Leiden, Netherlands and has an owner (this household).
> It has specific characteristics local to this copy, as well as
> stereotypical characteristics that it shares with all other "HP Laser
> Smart C4270s".
>
> FRBR isn't designed to describe that kind of situation (although the
> parallels should be clear). But RDF and OWL do try to address that
> general case: RDF/RDFS/OWL is very much in the business of drawing
> such class-instance distinctions. OWL also goes some basic way towards
> providing information-machinery for stating generalisations about all
> the members of some class of thing. However OWL itself avoids certain
> complex topics that are relatively hard to avoid for us: it does not
> directly give us a way of saying "typically". It does not give us a
> way of distinguishing intrinsic versus accidental properties. The
> latter saved W3C from retreading thousands of years of philosophical
> debate. The former is perhaps a medium-sized nuisance. Regardless:
>
> We can think of the class of things in the world that are *printers*.
> We can name that class with a URI and publish a description there.
> We can think of the class of things in the world that are *laser
> printers*. We can name that class with a URI and publish a description
> there.
> We can think of the class of things in the world that are *HP laser
> printers*. We can name that class with a URI and publish a description
> there.
> We can think of the class of things in the world that are *HP Laser
> Smart C4270 printers*. We can name that class with a URI and publish a
> description there.
>
> We can associate any thing in the world with one or more of these
> classes; in RDF by asserting an rdf:type relationship to the class. We
> can use properties associated with the class to describe the
> individual thing 'by hand', or we can draw factual conclusions about
> properties of some individual from general knowledge that makes claims
> about all members of a class.
>
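
The class chain and the rdf:type step described above can be sketched in a few lines of plain Python. This is only an illustration, not RDF tooling: the class names are hypothetical labels rather than published URIs, and each class is assumed to have a single superclass, which real rdfs:subClassOf hierarchies do not require.

```python
# Hypothetical class labels standing in for class URIs (not real identifiers).
# Each entry reads "key is a subclass of value", loosely like rdfs:subClassOf.
SUBCLASS_OF = {
    "HPLaserSmartC4270": "HPLaserPrinter",
    "HPLaserPrinter": "LaserPrinter",
    "LaserPrinter": "Printer",
}

# Type assertions made "by hand" for individual things, loosely like rdf:type.
TYPE_OF = {"printer_in_my_room": "HPLaserSmartC4270"}


def inferred_types(thing):
    """Return the asserted class of `thing` followed by every superclass."""
    types = []
    cls = TYPE_OF.get(thing)
    while cls is not None:
        types.append(cls)
        cls = SUBCLASS_OF.get(cls)  # walk one step up the hierarchy
    return types


print(inferred_types("printer_in_my_room"))
```

Asserting the most specific class once is enough here; membership in the broader classes follows from the subclass links.
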
> We can go deeper, towards query-like classes, and name the sub-class
> of HPLaserSmartC4270-Printer that corresponds to such printers bought
> in Leiden; or owned by me. Or that have a damaged scanner lid and
> which still serve adequately as a printer. Or which belong to the
> subclass manufactured in the UK and that shipped with a UK-compatible
> power cable.
>
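
One way to picture such query-like classes is as predicates over item descriptions, so that a class is simply whatever satisfies the test. The sketch below assumes made-up field names and sample data (the second and third printers are invented for illustration); it is not how OWL class expressions are actually written.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Printer:
    # Field names are illustrative assumptions, not terms from any vocabulary.
    model: str
    bought_in: str
    owner: str
    scanner_lid_damaged: bool = False


def is_c4270(p):
    """Membership test for the class of HP Laser Smart C4270 printers."""
    return p.model == "HP Laser Smart C4270"


def bought_in_leiden(p):
    """Sub-class: C4270 printers that were bought in Leiden."""
    return is_c4270(p) and p.bought_in == "Leiden"


printers = [
    Printer("HP Laser Smart C4270", "Leiden", "this household", True),
    Printer("HP Laser Smart C4270", "Bristol", "someone else"),
    Printer("Some other model", "Leiden", "a neighbour"),
]

all_c4270s = [p for p in printers if is_c4270(p)]
leiden_c4270s = [p for p in printers if bought_in_leiden(p)]
print(len(all_c4270s), len(leiden_c4270s))
```

Because the narrower predicate includes the broader one, every member of the Leiden class is also a member of the C4270 class, which is the subset relation the text describes.
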
> OWL doesn't impose any appropriate level of detail on us, it just
> provides descriptive primitives that let us talk in terms of [broadly]
> sets of things, the properties that characterise those sets, and the
> subset / superset relations between those sets. (We say class instead
> of set, and leave that distinction aside for now.)
>
> Computerised ontology languages like OWL are obsessed with this
> class-vs-instance distinction, and in modern mass produced life, the
> distinction is all around us, as are near-identical, mechanically
> reproduced copies of products - regardless of whether the product was
> designed to inform, educate, entertain, or remove unsightly nasal
> hair.
>
> Our FRBR-inspired conversations here are overshadowed by the need to
> make equivalent distinctions in other aspects of everyday life. From
> tracking down a replacement cable or scanner lid for my printer, to
> finding the nearest open shop that will sell me a certain kind of
> soldering iron on a Sunday, or a certain DVD of a certain film, the
> desire to organize information in a way that mirrors the patterns of
> similarity amongst mass produced items is a modern universal.
>
> From
> http://www.marxists.org/reference/subject/philosophy/works/ge/benjamin.htm
> http://en.wikipedia.org/wiki/The_Work_of_Art_in_the_Age_of_Mechanical_Reproduction
> and unfairly out of context,
>
> "In principle a work of art has always been reproducible. Man-made
> artifacts could always be imitated by men. Replicas were made by
> pupils in practice of their craft, by masters for diffusing their
> works, and, finally, by third parties in the pursuit of gain.
> Mechanical reproduction of a work of art, however, represents
> something new." [...] "With the woodcut graphic art became
> mechanically reproducible for the first time, long before script
> became reproducible by print. The enormous changes which printing, the
> mechanical reproduction of writing, has brought about in literature
> are a familiar story. However, within the phenomenon which we are here
> examining from the perspective of world history, print is merely a
> special, though particularly important, case."
>
> All I'm suggesting here is that we follow this advice from Walter
> Benjamin in 1936 and indulge ourselves in the idea that modeling
> bibliographic mass production is merely a special (and important)
> case.
>
> FRBR's "items" are the most concrete, tangible entities in the FRBR
> universe. In the physical realm they are things you might hold in your
> hand, put in a box, find at some location. The idea extended to the
> digital realm is naturally more ephemeral but we do at least have
> correspondingly objective characteristics that ground digital objects
> in clear ways: notions such as sizeInBytes, cryptographic hashes
> (sha1sum, md5) can be used to talk precisely about specific sequences
> of 'Zeros' and 'Ones'.
>
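
As a small illustration of grounding a digital item in objective characteristics, the following sketch computes a size and two digests with Python's standard hashlib module. The function name and the dictionary keys are assumptions chosen to echo the wording above, not an established schema.

```python
import hashlib


def fingerprint(data: bytes) -> dict:
    """Describe a byte sequence by its length and two common digests."""
    return {
        "sizeInBytes": len(data),
        "sha1": hashlib.sha1(data).hexdigest(),
        "md5": hashlib.md5(data).hexdigest(),
    }


copy_a = fingerprint(b"hello")
copy_b = fingerprint(b"hello")   # a second, bit-identical copy
other = fingerprint(b"hello!")   # a different byte sequence

print(copy_a == copy_b, copy_a == other)
```

Two bit-identical copies yield the same description, which is what makes such values usable for talking precisely about a specific sequence of bytes.
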
> Looking up the FRBR hierarchy at the more general notions of
> "Manifestation", "Expression" and "Work", these are FRBR's particular
> story for organizing our millions of items into sensible groups.
> FRBR's "work" notion is described textually as a “distinct
> intellectual or artistic creation.”... a kind of ghostly but specific
> entity, a kind of social fiction that acts as a descriptive (and
> sometimes legal) hub for organizing clusters of related items.
> "Expression" brings that somewhat down to earth (“the specific
> intellectual or artistic form that a work takes each time it is
> ‘realized.’”), while "Manifestation" finally articulates it in terms of
> sets/classes rather than individual abstract entities: “the physical
> embodiment of an expression of a work. As an entity, manifestation
> represents all the physical objects that bear the same
> characteristics, in respect to both intellectual content and physical
> form.”
>
> So the distinctions made in terms of these *4* notions are similar to
> those baked into the core of RDF itself.... specific fairly concrete
> things organized into groups (sets, classes). RDF only allows itself
> 'rdf:type' and 'rdfs:subClassOf' relationships as a basis to describe
> all this.
>
> So if we go with this idea that "print is merely a special, though
> particularly important, case" of mass produced work, and that it is
> worth investigating RDF descriptive habits that address
> characteristics of mass production regardless of whether we are
> talking about bicycles, books, laser printers or farmyard equipment,
> ... where does this leave us? where does it get us?
>
> 1. We bring more clearly into scope some industrialised areas of
> cultural 'content' -- music, tv, films; http://musicontology.com/
> http://www.bbc.co.uk/ontologies/programmes/2009-09-07.shtml ... areas
> where FRBR is a close but not perfect fit, and class-based models
> drift towards being 'FRBR-inspired' rather than 'FRBR-based'.
>
> 2. We find OWL lacks certain conventions for distinguishing
> stereotypical instances from flawed/accidental characteristics of
> actual instances. For example, a copy of some book I have on my desk
> might be missing a certain page, so its literal 'number of pages'
> property couldn't be inferred from a common class shared with other
> such manifestations of the same abstraction. Or the local adjustments
> made here to my printer (I swapped the power cable, or repaired the
> lid). There is a big literature in KR about defaults and overrides and
> it's tricky to get right with open-world design of RDF/OWL/RDFS.
>
> 3. Works, Manifestations and Expressions might all just be kinds of
> classes; or annotations on classes. The class of *HP Laser Smart C4270
> printers* of which I have one in this room; the class of *SQL and
> Relational Theory books* of which I have one on my desk as I type. The
> former is described at
> http://h10025.www1.hp.com/ewfrf/wc/product?cc=us&lc=en&dlc=en&product=3300222
> by its maker;  the latter at http://oreilly.com/catalog/9780596523084
> ... more general classes might be tagged 'work-class'; very precise
> classes tagged 'manifestation-class'. But fundamentally we get a huge,
> universal spectrum (from the class of 'every Thing', to the class of
> 'No-thing') rather than forcing each into one of the FRBR 4.
>
> In both these example cases, there are product codes and online
> databases, and other people who own different instances of the same
> kind of thing. In both cases there are related products (maybe an
> ebook, maybe a successor printer design, or ink cartridge) where
> information at the level of 'all products' is useful to the owners and
> custodians of specific products.
>
> 4. OWL 2's punning mechanism may be relevant. This is a trick in OWL
> 2 that lets a single URI serve both as a class identifier (the class
> of C4270printers) but also as an identifier at the instance level, eg.
> something that might have other data attached like images or links to
> product documentation.
>
> 5. We would effectively be abandoning the attempt to fit the
> bibliographic universe into 4 buckets, and allowing different parties
> to name and describe classes at any level of generality, picked out by
> the properties of the things in that class. I might care to name a
> class for all books written by all former pupils of the school
> described at http://en.wikipedia.org/wiki/RGS_High_Wycombe --- this
> class would include SQL and Relational Theory, via its author,
> http://dbpedia.org/page/Christopher_J._Date  .... or you might care to
> create a class for products whose primary inventor was an immigrant.
> By stepping back from the FRBR 4, we could get a more free-form
> environment in which properties of all kinds of thing can be used to
> define whatever classes are useful.
>
> 6. What does this mean in terms of 'who defines what when' metadata
> practice? If the abstract work "SQL and Relational Theory" by C.J.Date
> is in some sense now an RDF class, what should the URI be? Who
> publishes it and what practice should exist around the associated
> online description? I don't know. Maybe authors, publishers and
> libraries all have a role, ... maybe there are 3 or more
> semi-competing URIs for that class, one from C.J.Date, one from the
> publisher O'Reilly or one or more from a library perspective. Perhaps
> one of these descriptive agencies ends up playing a hub role and
> including links to further description of the class from the other
> parties. Maybe practices vary between fields and types of product. I
> really don't know. And the core RDF/OWL specs are not the kinds of
> thing that will tell us what's best to do, btw.
>
> 7. What kinds of thing are properly expressed at the class level? I
> also don't know. We might find value in rethinking some properties to
> more explicitly attach them to the stereotypical ideal member of some
> class, as a way of admitting that not all instances will match the
> ideal. Perhaps for eg. the idea that books have 'numPages' could be
> defined to refer to the stereotypical ideal case, even while applied
> at the instance level. So if I lose 5 pages from the copy of "SQL and
> Relational Theory" on my desk, we still say it has 410 numbered pages.
> Maybe we go through and think 'which properties does it even make
> sense to mutate at the instance level?'. For all the damage I could do
> to my copy of that book, I'm not going to change its author or
> subject, for example. So those would be readily expressed in terms of
> OWL. The numPages could be expressed as an OWL generalisation about
> all instances if we define that property to be the ideal number,
> rather than having to track damaged pages etc. And some properties
> such as geographic location or owner make sense only at the instance
> level. A few of these (such as e.g. initialOwner) might be static
> properties that never change their value; others vary from time to
> time.
>
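
The split suggested in points 2 and 7, between properties stated once for the ideal member of a class and properties that only make sense per copy, might look roughly like this. It is a sketch under assumptions: the type and field names are invented for illustration, and the owner and location values are placeholders. Only the title, the author and the 410-page figure come from the text above.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ManifestationClass:
    """Properties stated once, for the stereotypical ideal member."""
    title: str
    author: str
    num_pages: int  # the ideal page count, not a per-copy measurement


@dataclass
class Copy:
    """Properties that only make sense at the instance level."""
    kind: ManifestationClass
    owner: str
    location: str
    missing_pages: set = field(default_factory=set)

    @property
    def num_pages(self):
        # Defined as the ideal number, so damage does not change it.
        return self.kind.num_pages

    @property
    def pages_present(self):
        # Instance-level fact derived from the ideal plus local damage.
        return self.kind.num_pages - len(self.missing_pages)


book = ManifestationClass("SQL and Relational Theory", "C.J. Date", 410)
my_copy = Copy(book, owner="me", location="my desk", missing_pages={1, 2, 3, 4, 5})

print(my_copy.num_pages, my_copy.pages_present, my_copy.kind.author)
```

Losing five pages changes pages_present but leaves num_pages, author and title untouched, since those are read from the class-level description.
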
>
> Ok this post is too long already. Another way of stating all this is
> that it's an appeal to think more in terms of specific
> somehow-concrete items, things. Artifacts in your hand, or computer
> data files that might be checksummed. And that all abstractions above
> those are means to an end, rather than ends in themselves. So we can
> ask whether, instead of pondering the vague characteristics of ghostly
> entities like 'works', 'expressions' and 'manifestations',
> we're simply talking about the common characteristics of collections
> of identifiable *items*. And if that is what we're doing, whether (a)
> we can more explicitly share common descriptive practices with other
> non-textual mass produced kinds of things, and (b) RDF/OWL might
> have some built-in facilities that could be used more (ie. its notion
> of class).
>
> This all wouldn't abolish the WEMI distinctions; rather they would, as
> sketched above, show up as a kind of annotation on RDF classes. Some
> classes might be work-ish classes; the class of all Hamlets. Others
> might be manifestation-ish classes; the class of all paper-printed
> first edition SQL and Relational Theory copies. But the core
> organising idea is sets/classes rather than the ghostly upper entities
> of FRBR. Aspects of those entities would also show up as concrete
> documents; an artist's first sketches of a later painting; CJ Date's
> book contract with O'Reilly that gave us the later book. First, second
> and final drafts; HP printer schematics, blueprints; architectural
> drawings; bike designs; ingredient lists and working notes. But rather
> than merge our knowledge about all those practical things into the
> vaguer composite entities of FRBR we just itemise them and describe
> them as plain old artifacts at the instance level - giving us
> something like a catalogue of evidence left in the world that shadows
> the creative process, rather than reifying the act of creation into
> special 'things' that can be described but never touched, used, read
> or consumed.
>
> Hope this all makes some sense. Related discussion from Bradley Allen,
> Karen and others:
>
> http://bpa.tumblr.com/post/10814190/faceted-classification-and-frbr
> http://www.mail-archive.com/rda-l@listserv.lac-bac.gc.ca/msg03837.html
> http://www.mail-archive.com/rda-l@listserv.lac-bac.gc.ca/msg03848.html
> http://bibwild.wordpress.com/2007/12/07/frbr-considered-as-set-relationships/
> http://lists.w3.org/Archives/Public/public-owl-dev/2008JulSep/0110.html
> http://lists.w3.org/Archives/Public/public-lld/2010Sep/0049.html
>
> cheers,
>
> Dan
>
> ps. I tried to draw some of this out graphically:
> http://www.flickr.com/photos/danbri/2891150205/  ... story of a
> t-shirt design as frbr-inspired classes
> http://www.flickr.com/photos/danbri/2892286406/in/photostream/ ...same
> story as a timeline
>
>



-- 
Karen Coyle
kcoyle@kcoyle.net http://kcoyle.net
ph: 1-510-540-7596
m: 1-510-435-8234
skype: kcoylenet
Received on Thursday, 17 March 2011 17:03:25 UTC