- From: Martin Hepp <martin.hepp@ebusiness-unibw.org>
- Date: Thu, 17 Mar 2011 20:12:03 +0100
- To: public-lld@w3.org, Dan Brickley <danbri@danbri.org>
- Cc: goodrelations <goodrelations@ebusiness-unibw.org>
Hi Dan,

Thanks for the interesting points, which I found only by accident.

Just FYI: Both http://www.productontology.org and GoodRelations (http://purl.org/goodrelations) try to cater for the need to model classes, individuals, and prototypes in an OWL DL world. They do this by using

a) three subclasses of gr:ProductOrService:
- one for actual, identifiable individuals (gr:ActualProductOrServiceInstance)
- one for bags of anonymous individuals (gr:ProductOrServicesSomeInstancesPlaceholder)
- one for prototypes that define default attributes of individuals (gr:ProductOrServiceModel)

b) a link between the prototypes and individuals (gr:hasMakeAndModel)

So you can use common abstractions (e.g. pto:Laser_printer) for all of them while being able to keep them apart when needed by using the intersection of one of the three classes with the respective generic abstraction:

    foo:MyLaserPrinter a pto:Laser_printer, gr:ActualProductOrServiceInstance ;
        gr:serialNumber "123445X233"^^xsd:string .

    foo:ProtypeForALaserprinter a pto:Laser_printer, gr:ProductOrServiceModel ;
        gr:color "black" .

Now, if we know that foo:ProtypeForALaserprinter is the prototype for foo:MyLaserPrinter via

    foo:MyLaserPrinter gr:hasMakeAndModel foo:ProtypeForALaserprinter .

we can infer that foo:MyLaserPrinter will have the color "black", unless we have information about its specific color.

The actual reasoning for the defaults is beyond OWL DL, but can be modeled well e.g. in SPARQL CONSTRUCT rules (from http://www.ebusiness-unibw.org/wiki/GoodRelationsOptionalAxiomsAndLinks#Product_Models):

    # Products inherit all product features from their product models
    # unless they are defined for the products individually
    CONSTRUCT {?product ?property ?valueModel.}
    WHERE
    {
      {
        {?product a gr:ActualProductOrServiceInstance.}
        UNION
        {?product a gr:ProductOrServicesSomeInstancesPlaceholder.}
      }
      ?model a gr:ProductOrServiceModel.
      ?product gr:hasMakeAndModel ?model.
      ?model ?property ?valueModel.
      {
        {?property rdfs:subPropertyOf gr:qualitativeProductOrServiceProperty.}
        UNION
        {?property rdfs:subPropertyOf gr:quantitativeProductOrServiceProperty.}
        UNION
        {?property rdfs:subPropertyOf gr:datatypeProductOrServiceProperty.}
      }
      OPTIONAL {?product ?property ?valueProduct.}
      FILTER (!bound(?valueProduct))
    }

So if foo:MyLaserPrinter does not have a gr:color property, the rule will add

    foo:MyLaserPrinter gr:color "black" .

Otherwise, it will preserve the local value.

Quite clearly, an RDF dataspace can only do this once the set of relevant triples is fixed; this is not true open-world (OWA) reasoning.

While this was initially designed for the pretty narrow case of commodities and their datasheets, it can also be used for other pairs of individuals and prototypes, e.g. a composition and its performance (think of the duration of a piece of piano music). Also, the common abstraction for both may differ (e.g. the prototype could be a foo:Oil_painting and the individual a foo:Picture) or be pretty broad (e.g. foo:DVD).

Martin

(thinking-out-loud alert)

So this is a conversation that resurfaces over the years in various ways. My latest prompt is a combination of (i) seeing http://www.productontology.org/ which declares OWL DL classes (ie. classes of thing, aka types...) for commonly named products, using Wikipedia data.

The product ontology site uses OWL to describe classes of largely mass-produced thing:

"This service provides GoodRelations-compatible OWL DL class definitions for ca. 300,000 types of product or services that have an entry in the English Wikipedia, e.g.
http://www.productontology.org/doc/Apple
http://www.productontology.org/doc/Laser_printer
http://www.productontology.org/doc/Manure_spreader
http://www.productontology.org/doc/Racing_bicycle
http://www.productontology.org/doc/Soldering_iron
http://www.productontology.org/doc/Sweet_potato

Back at DC-2008 in Berlin someone (maybe Karen Coyle or Diane Hillman) mentioned that a difference between libraries and museums is that the works collected by the former are mass produced. I think we can go some way towards webbifying FRBR by pondering that observation.

I spent Monday and Tuesday with VU.nl colleagues visiting the Amsterdam Museum and then the Fab Lab at http://fablab.waag.org/ which showed some possibilities for taking museum artifacts and replicating lossy copies of them (with 3d printers and other mechanical reproduction techniques). We could even fabricate moulds derived from artifacts that allow others to create new derived instances (or their own moulds). Each generation derives characteristics from the previous one, and adds its own flaws and innovations.

Looking at the Product Ontology examples above, they work better at describing mechanically reproduced, near-identical artifacts - Laser_printer, Soldering_Iron - than with the natural kinds of thing - apple, potato etc. Both apple and sweet potato are halfway to being mass nouns --- you might often need to describe 'some' apple or sweet-potato, rather than 'a' sweet potato, although of course you can have a specific apple or potato in-hand.

Mass production brings with it the prospect of thousands of *near*-identical instances of some type, as well as associating those with codes and lately URLs that link us back to information about the recipe or ingredients list for those types of thing. For complex modern mass produced items, if you know what kind of item it is, you know a huge amount about that thing - whether it is a book or a printer or a soldering iron.
If we forget the library and cultural heritage scene for now, and think just about these product types: I have here in my room a specific laser printer. It is an HP Laser Smart C4270. Let's say it was bought in Leiden, Netherlands and has an owner (this household). It has specific characteristics local to this copy, as well as stereotypical characteristics that it shares with all other "HP Laser Smart C4270s".

FRBR isn't designed to describe that kind of situation (although the parallels should be clear). But RDF and OWL do try to address that general case: RDF/RDFS/OWL is very much in the business of drawing such class-instance distinctions. OWL also goes some basic way towards providing information-machinery for stating generalisations about all the members of some class of thing. However OWL itself avoids certain complex topics that are relatively hard for us to avoid: it does not directly give us a way of saying "typically". It does not give us a way of distinguishing intrinsic versus accidental properties. The latter saved W3C from retreading thousands of years of philosophical debate. The former is perhaps a medium-sized nuisance.

Regardless:

We can think of the class of things in the world that are *printers*. We can name that class with a URI and publish a description there.

We can think of the class of things in the world that are *laser printers*. We can name that class with a URI and publish a description there.

We can think of the class of things in the world that are *HP laser printers*. We can name that class with a URI and publish a description there.

We can think of the class of things in the world that are *HP Laser Smart C4270 printers*. We can name that class with a URI and publish a description there.

We can associate any thing in the world with one or more of these classes; in RDF by asserting an rdf:type relationship to the class.
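The nesting of those four classes, and what a single rdf:type assertion buys you, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `ex:` names are made up for the example (they are not real URIs), and the dictionary stands in for rdfs:subClassOf statements that would normally live in RDF.

```python
# Hypothetical class names; each key is a subclass of its value
# (a stand-in for rdfs:subClassOf statements).
SUBCLASS_OF = {
    "ex:HPLaserSmartC4270Printer": "ex:HPLaserPrinter",
    "ex:HPLaserPrinter": "ex:LaserPrinter",
    "ex:LaserPrinter": "ex:Printer",
}

# Asserted rdf:type statements: individual -> set of classes (hypothetical data).
ASSERTED_TYPES = {
    "ex:myPrinter": {"ex:HPLaserSmartC4270Printer"},
}


def inferred_types(individual):
    """Return the asserted classes plus every superclass reachable from them."""
    result = set()
    pending = list(ASSERTED_TYPES.get(individual, ()))
    while pending:
        cls = pending.pop()
        if cls in result:
            continue
        result.add(cls)
        parent = SUBCLASS_OF.get(cls)
        if parent is not None:
            pending.append(parent)
    return result


# One assertion at the most specific level yields membership in all four classes.
print(sorted(inferred_types("ex:myPrinter")))
```

Asserting only the most specific class is enough; membership in the broader classes follows from the subclass chain rather than from extra hand-written statements.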
We can use properties associated with the class to describe the individual thing 'by hand', or we can draw factual conclusions about properties of some individual from general knowledge that makes claims about all members of a class.

We can go deeper, towards query-like classes, and name the sub-class of HPLaserSmartC4270-Printer that corresponds to such printers bought in Leiden; or owned by me. Or that have a damaged scanner lid and which still serve adequately as a printer. Or which belong to the subclass manufactured in the UK and that shipped with a UK-compatible power cable. OWL doesn't impose any appropriate level of detail on us; it just provides descriptive primitives that let us talk in terms of [broadly] sets of things, the properties that characterise those sets, and the subset / superset relations between those sets. (We say class instead of set, and leave that distinction aside for now.)

Computerised ontology languages like OWL are obsessed with this class-vs-instance distinction, and in modern mass produced life, the distinction is all around us, as are near-identical, mechanically reproduced copies of products - regardless of whether the product was designed to inform, educate, entertain, or remove unsightly nasal hair.

Our FRBR-inspired conversations here are overshadowed by the need to make equivalent distinctions in other aspects of everyday life. From tracking down a replacement cable or scanner lid for my printer, to finding the nearest open shop that will sell me a certain kind of soldering iron on a Sunday, or a certain DVD of a certain film, the desire to organize information in a way that mirrors the patterns of similarity amongst mass produced items is a modern universal.

From
http://www.marxists.org/reference/subject/philosophy/works/ge/benjamin.htm
http://en.wikipedia.org/wiki/The_Work_of_Art_in_the_Age_of_Mechanical_Reproduction
and unfairly out of context,

"In principle a work of art has always been reproducible.
Man-made artifacts could always be imitated by men. Replicas were made by pupils in practice of their craft, by masters for diffusing their works, and, finally, by third parties in the pursuit of gain. Mechanical reproduction of a work of art, however, represents something new."

[...]

"With the woodcut graphic art became mechanically reproducible for the first time, long before script became reproducible by print. The enormous changes which printing, the mechanical reproduction of writing, has brought about in literature are a familiar story. However, within the phenomenon which we are here examining from the perspective of world history, print is merely a special, though particularly important, case."

All I'm suggesting here is that we follow this advice from Walter Benjamin in 1936 and indulge ourselves in the idea that modeling bibliographic mass production is merely a special (and important) case.

FRBR's "items" are the most concrete, tangible entities in the FRBR universe. In the physical realm they are things you might hold in your hand, put in a box, find at some location. The idea extended to the digital realm is naturally more ephemeral but we do at least have correspondingly objective characteristics that ground digital objects in clear ways: notions such as sizeInBytes, cryptographic hashes (sha1sum, md5) can be used to talk precisely about specific sequences of 'Zeros' and 'Ones'.

Looking up the FRBR hierarchy at the more general notions of "Manifestation", "Expression" and "Work", these are FRBR's particular story for organizing our millions of items into sensible groups.

FRBR's "work" notion is described textually as a “distinct intellectual or artistic creation.”... a kind of ghostly but specific entity, a kind of social fiction that acts as a descriptive (and sometimes legal) hub for organizing clusters of related items. "Expression" brings that somewhat down to earth (“the specific intellectual or artistic form that a work takes each time it is ‘realized.’”), while "Manifestation" finally articulates it in terms of sets/classes rather than individual abstract entities: “the physical embodiment of an expression of a work. As an entity, manifestation represents all the physical objects that bear the same characteristics, in respect to both intellectual content and physical form.”

So the distinctions made in terms of these *4* notions are similar to those baked into the core of RDF itself.... specific fairly concrete things organized into groups (sets, classes). RDF only allows itself 'rdf:type' and 'rdfs:subClassOf' relationships as a basis to describe all this.

So if we go with this idea that "print is merely a special, though particularly important, case" of mass produced work, and that it is worth investigating RDF descriptive habits that address characteristics of mass production regardless of whether we are talking about bicycles, books, laser printers or farmyard equipment, ... where does this leave us? where does it get us?

1. We bring more clearly into scope some industrialised areas of cultural 'content' -- music, tv, films;
http://musicontology.com/
http://www.bbc.co.uk/ontologies/programmes/2009-09-07.shtml
... areas where FRBR is a close but not perfect fit, and class-based models drift towards being 'FRBR-inspired' rather than 'FRBR-based'.

2. We find OWL lacks certain conventions for distinguishing stereotypical instances from flawed/accidental characteristics of actual instances. For example, a copy of some book I have on my desk might be missing a certain page, so its literal 'number of pages' property couldn't be inferred from a common class shared with other such manifestations of the same abstraction. Or the local adjustments made here to my printer (I swapped the power cable, or repaired the lid).
There is a big literature in KR about defaults and overrides and it's tricky to get right with the open-world design of RDF/OWL/RDFS.

3. Works, Manifestations and Expressions might all just be kinds of classes; or annotations on classes. The class of *HP Laser Smart C4270 printers* of which I have one in this room; the class of *SQL and Relational Theory books* of which I have one on my desk as I type. The former is described at http://h10025.www1.hp.com/ewfrf/wc/product?cc=us&lc=en&dlc=en&product=3300222 by its maker; the latter at http://oreilly.com/catalog/9780596523084 ... more general classes might be tagged 'work-class'; very precise classes tagged 'manifestation-class'. But fundamentally we get a huge, universal spectrum (from the class of 'every Thing', to the class of 'No-thing') rather than forcing each into one of the FRBR 4. In both these example cases, there are product codes and online databases, and other people who own different instances of the same kind of thing. In both cases there are related products (maybe an ebook, maybe a successor printer design, or ink cartridge) where information at the level of 'all products' is useful to the owners and custodians of specific products.

4. OWL 2's punning mechanism may be relevant. This is a trick in OWL 2 that lets a single URI serve both as a class identifier (the class of C4270 printers) and as an identifier at the instance level, eg. something that might have other data attached like images or links to product documentation.

5. We would effectively be abandoning the attempt to fit the bibliographic universe into 4 buckets, and allowing different parties to name and describe classes at any level of generality, picked out by the properties of the things in that class.
I might care to name a class for all books written by all former pupils of the school described at http://en.wikipedia.org/wiki/RGS_High_Wycombe --- this class would include SQL and Relational Theory, via its author, http://dbpedia.org/page/Christopher_J._Date .... or you might care to create a class for products whose primary inventor was an immigrant. By stepping back from the FRBR 4, we could get a more free-form environment in which properties of all kinds of thing can be used to define whatever classes are useful.

6. What does this mean in terms of 'who defines what when' metadata practice? If the abstract work "SQL and Relational Theory" by C.J. Date is in some sense now an RDF class, what should the URI be? Who publishes it and what practice should exist around the associated online description? I don't know. Maybe authors, publishers and libraries all have a role, ... maybe there are 3 or more semi-competing URIs for that class: one from C.J. Date, one from the publisher O'Reilly, and one or more from a library perspective. Perhaps one of these descriptive agencies ends up playing a hub role and including links to further description of the class from the other parties. Maybe practices vary between fields and types of product. I really don't know. And the core RDF/OWL specs are not the kinds of thing that will tell us what's best to do, btw.

7. What kinds of thing are properly expressed at the class level? I also don't know. We might find value in rethinking some properties to more explicitly attach them to the stereotypical ideal member of some class, as a way of admitting that not all instances will match the ideal. Perhaps for eg. the idea that books have 'numPages' could be defined to refer to the stereotypical ideal case, even while applied at the instance level. So if I lose 5 pages from the copy of "SQL and Relational Theory" on my desk, we still say it has 410 numbered pages.
Maybe we go through and think 'which properties does it even make sense to mutate at the instance level?'. For all the damage I could do to my copy of that book, I'm not going to change its author or subject, for example. So those would be readily expressed in terms of OWL. The numPages could be expressed as an OWL generalisation about all instances if we define that property to be the ideal number, rather than having to track damaged pages etc. And some properties such as geographic location or owner make sense only at the instance level. A few of these (such as e.g. initialOwner) might be static properties that never change their value; others vary from time to time.

Ok this post is too long already. Another way of stating all this is that it's an appeal to think more in terms of specific somehow-concrete items, things. Artifacts in your hand, or computer data files that might be checksummed. And that all abstractions above those are means to an end, rather than ends in themselves. So we can ask whether, instead of pondering the vague characteristics of ghostly entities like 'works', 'expressions' and 'manifestations', we're simply talking about the common characteristics of collections of identifiable *items*. And if that is what we're doing, whether (a) we can more explicitly share common descriptive practices with other non-textual mass produced kinds of things, and (b) RDF/OWL might have some built-in facilities that could be used more (ie. its notion of class).

This all wouldn't abolish the WEMI distinctions; rather they would, as sketched above, show up as a kind of annotation on RDF classes. Some classes might be work-ish classes; the class of all Hamlets. Others might be manifestation-ish classes; the class of all paper-printed first edition SQL and Relational Theory copies. But the core organising idea is sets/classes rather than the ghostly upper entities of FRBR.
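A minimal Python sketch of that split, with class-level "ideal" values on one side and instance-level facts on the other. Everything here is an assumption made for illustration: the `ex:` identifiers, the property names and the `describe` helper are hypothetical; only the 410 page count comes from the text above.

```python
# Class-level description: what holds for the stereotypical member of the class.
# (Hypothetical identifiers; 410 is the ideal page count mentioned above.)
CLASS_DEFAULTS = {
    "ex:SQLAndRelationalTheoryFirstEdition": {
        "ex:author": "C.J. Date",
        "ex:numPages": 410,
    },
}

# Instance-level description: only what is particular to one copy.
ITEMS = {
    "ex:myCopy": {
        "class": "ex:SQLAndRelationalTheoryFirstEdition",
        "local": {"ex:owner": "ex:me", "ex:location": "my desk"},
    },
    "ex:damagedCopy": {
        "class": "ex:SQLAndRelationalTheoryFirstEdition",
        # A locally recorded value overrides the class-level default.
        "local": {"ex:numPages": 405},
    },
}


def describe(item_id):
    """Merge class-level defaults with instance-level values; local values win."""
    item = ITEMS[item_id]
    merged = dict(CLASS_DEFAULTS[item["class"]])
    merged.update(item["local"])
    return merged


print(describe("ex:myCopy"))
print(describe("ex:damagedCopy"))
```

Whether a property such as numPages should ever be overridden locally, or should always denote the ideal, is exactly the modelling choice discussed above; the sketch only shows the two layers, not which choice is right.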
Aspects of those entities would also show up as concrete documents; an artist's first sketches of a later painting; CJ Date's book contract with O'Reilly that gave us the later book. First, second and final drafts; HP printer schematics, blueprints; architectural drawings; bike designs; ingredient lists and working notes. But rather than merge our knowledge about all those practical things into the vaguer composite entities of FRBR we just itemise them and describe them as plain old artifacts at the instance level - giving us something like a catalogue of evidence left in the world that shadows the creative process, rather than reifying the act of creation into special 'things' that can be described but never touched, used, read or consumed.

Hope this all makes some sense. Related discussion from Bradley Allen, Karen and others:

http://bpa.tumblr.com/post/10814190/faceted-classification-and-frbr
http://www.mail-archive.com/rda-l@listserv.lac-bac.gc.ca/msg03837.html
http://www.mail-archive.com/rda-l@listserv.lac-bac.gc.ca/msg03848.html
http://bibwild.wordpress.com/2007/12/07/frbr-considered-as-set-relationships/
http://lists.w3.org/Archives/Public/public-owl-dev/2008JulSep/0110.html
http://lists.w3.org/Archives/Public/public-lld/2010Sep/0049.html

cheers,

Dan

ps. I tried to draw some of this out graphically:
http://www.flickr.com/photos/danbri/2891150205/ ... story of a t-shirt design as frbr-inspired classes
http://www.flickr.com/photos/danbri/2892286406/in/photostream/ ...same story as a timeline

Received on Thursday, 17 March 2011 11:45:15 GMT
Received on Thursday, 17 March 2011 19:12:37 UTC