Re: Kill the Record! (Was: BIBFRAME and schema.org) from Philip Schreur on 2013-07-05 (public-schemabibex@w3.org from July 2013)

From: Philip Schreur <pschreur@stanford.edu>
Date: Fri, 5 Jul 2013 11:04:10 -0700
To: "corey.harper@nyu.edu" <corey.harper@nyu.edu>
Cc: Karen Coyle <kcoyle@kcoyle.net>, "public-schemabibex@w3.org" <public-schemabibex@w3.org>
Message-Id: <C0CE563F-6D6A-402A-A226-18ADD5D8195F@stanford.edu>
+1

Sent from my iPhone

On Jul 5, 2013, at 10:57 AM, Corey A Harper <corey.harper@nyu.edu> wrote:

> Hi Karen,
> 
> Can you say a bit more about "I'm not convinced, having looked at some of the pages, that WP shares the conceptual model that we'll find in our data."? I'm not sure I understand what problems you foresee, nor what you believe the ramifications of those problems to be. 
> 
> I struggle with the idea that "...we then need to develop some best practices for library data, knowing that non-library data will take its own direction." I'm rather averse to maintaining our own little, non-conforming corner of the Web without a really clear understanding of the impact--on users--of this perceived conceptual incompatibility.
> 
> Thanks,
> -Corey
> 
> 
> 
> On Fri, Jul 5, 2013 at 1:47 PM, Karen Coyle <kcoyle@kcoyle.net> wrote:
>> Yes, Jeff, I realize that. I had rather hoped for a link that you had found useful for books, like:
>> 
>> http://en.wikipedia.org/wiki/Category:Books_by_type
>> 
>> Naturally, this is a mish-mosh of physical types (paperback), product types (mass-market paperback), genres (airport novel) and topics (book size). I don't know if there is a better approach within WP.
>> 
>> While it is great that these Wikipedia pages exist, I think before using them we should look beyond their titles to the content of the pages to make sure that WP and our metadata are talking about the same thing. I'm not convinced, having looked at some of the pages, that WP shares the conceptual model that we'll find in our data. With that as a starting point, we then need to develop some best practices for library data, knowing that non-library data will take its own direction.
>> 
>> I would like to hear from anyone in the publishing community about their needs for specification of product types. I assume that the preferred list would originate in ONIX.
>> 
>> kc
>> 
>> 
>> On 7/5/13 8:50 AM, Young,Jeff (OR) wrote:
>>> You can think of the option like this: Anything in Wikipedia can be
>>> treated as an owl:Class by changing the URI prefix. For example, this
>>> Wikipedia page describes murals:
>>> 
>>> http://en.wikipedia.org/wiki/Mural
>>> 
>>> In contrast, you can say something *is* a mural by using this hacked URI
>>> in an rdf:type:
>>> 
>>> http://www.productontology.org/id/Mural
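
A minimal sketch of the prefix swap described above, assuming the mapping is a plain string substitution on English Wikipedia article URLs. The function name `wikipedia_to_pto` is hypothetical and not part of any Product Ontology tooling; only the two URI patterns come from the message.

```python
from urllib.parse import urlparse

# The two prefixes quoted in the message above.
WIKI_HOST = "en.wikipedia.org"
WIKI_PATH_PREFIX = "/wiki/"
PTO_PREFIX = "http://www.productontology.org/id/"


def wikipedia_to_pto(url: str) -> str:
    """Swap the Wikipedia prefix for the Product Ontology class prefix.

    Hypothetical helper: assumes the article title carries over unchanged.
    """
    parsed = urlparse(url)
    if parsed.netloc != WIKI_HOST or not parsed.path.startswith(WIKI_PATH_PREFIX):
        raise ValueError(f"not an English Wikipedia article URL: {url}")
    title = parsed.path[len(WIKI_PATH_PREFIX):]
    if not title:
        raise ValueError(f"no article title in URL: {url}")
    return PTO_PREFIX + title


print(wikipedia_to_pto("http://en.wikipedia.org/wiki/Mural"))
```

The result is the URI that would go in the rdf:type statement, while the original Wikipedia URL keeps identifying the page that describes the concept.
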
>>> 
>>> Jeff
>>> 
>>> Sent from my iPad
>>> 
>>> On Jul 5, 2013, at 11:42 AM, "Karen Coyle" <kcoyle@kcoyle.net> wrote:
>>> 
>>>> What are the options provided by productontology?
>>>> 
>>>> kc
>>>> 
>>>> On 7/5/13 8:26 AM, Young,Jeff (OR) wrote:
>>>>> True. This list has always seemed simplistic to me, though. As you've
>>>>> suggested, EBook in particular deserves to be treated as a class so
>>>>> more detailed properties can be included. The other two are just the
>>>>> tip of the iceberg.
>>>>> 
>>>>> Sent from my iPad
>>>>> 
>>>>> On Jul 5, 2013, at 11:20 AM, "Karen Coyle" <kcoyle@kcoyle.net> wrote:
>>>>> 
>>>>>> Note that schema.org has
>>>>>> 
>>>>>> http://schema.org/BookFormatType, which has
>>>>>> 
>>>>>> Ebook
>>>>>> Hardback
>>>>>> Paperback
>>>>>> 
>>>>>> kc
>>>>>> 
>>>>>> On 7/5/13 7:43 AM, Young,Jeff (OR) wrote:
>>>>>>> For paperbacks and similar things, I've started using Product Ontology
>>>>>>> to tag the item/manifestation descriptions, for example:
>>>>>>> 
>>>>>>> @prefix schema: <http://schema.org/> .
>>>>>>> @prefix pto: <http://www.productontology.org/id/> .
>>>>>>> 
>>>>>>> :book1
>>>>>>>     a schema:Book, schema:ProductModel, pto:Paperback ;
>>>>>>>     etc.
>>>>>>> 
>>>>>>> The coverage isn't perfect, but it has the advantage of being backed up
>>>>>>> by Wikipedia.
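
To make the multiple typing above concrete, here is a rough sketch that expands the same three rdf:type statements into N-Triples lines using only string formatting. The subject URI `http://example.org/book1` is a placeholder, since the message does not say what `:book1` resolves to, and `type_triples` is a hypothetical helper rather than any library's API.

```python
from typing import Iterable, List

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
SCHEMA = "http://schema.org/"
PTO = "http://www.productontology.org/id/"


def type_triples(subject: str, classes: Iterable[str]) -> List[str]:
    """Return one N-Triples rdf:type statement per class URI."""
    return [f"<{subject}> <{RDF_TYPE}> <{cls}> ." for cls in classes]


# Placeholder subject: the base URI behind ":book1" is not given in the thread.
book1 = "http://example.org/book1"

for line in type_triples(
    book1,
    [SCHEMA + "Book", SCHEMA + "ProductModel", PTO + "Paperback"],
):
    print(line)
```

Each class contributes an independent statement, so a consumer that only understands schema.org can ignore the Product Ontology type without losing the rest of the description.
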
>>>>>>> 
>>>>>>> Jeff
>>>>>>> 
>>>>>>> Sent from my iPad
>>>>>>> 
>>>>>>> On Jul 5, 2013, at 10:35 AM, "Ross Singer" <rxs@talis.com> wrote:
>>>>>>> 
>>>>>>>> On Jul 5, 2013, at 10:25 AM, "Young,Jeff (OR)" <jyoung@oclc.org> wrote:
>>>>>>>>> 
>>>>>>>>> Aside, I would argue that the defining characteristic of Item is that
>>>>>>>>> it has "location". For physical items that location can be determined
>>>>>>>>> by geolocation (for example). For Web items (aka Web documents), the
>>>>>>>>> location can be determined by its URL.
>>>>>>>> 
>>>>>>>> +1
>>>>>>>> 
>>>>>>>> I would say there are arguably more defining characteristics than that
>>>>>>>> (I'm still going to argue that "paperback" isn't actually a part of
>>>>>>>> the manifestation, simply an inference of the sum of the format of the
>>>>>>>> items), but this, I would argue, is definitely the least common
>>>>>>>> denominator and applies well for our entity model in schema.org.
>>>>>>>> 
>>>>>>>> -Ross.
>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> Jeff
>>>>>>>>> 
>>>>>>>>> Sent from my iPad
>>>>>>>>> 
>>>>>>>>> On Jul 5, 2013, at 9:55 AM, "Ross Singer" <rxs@talis.com> wrote:
>>>>>>>>> 
>>>>>>>>>> But this is all really a question of how many angels can fit on the
>>>>>>>>>> head of a pin, isn't it?
>>>>>>>>>> 
>>>>>>>>>> We've already established that we're not interested in defining any
>>>>>>>>>> strict interpretation of FRBR in schema.org: we're just trying to
>>>>>>>>>> define a way to describe things in HTML that computers can parse.
>>>>>>>>>> 
>>>>>>>>>> Yes, I think we need to establish what an item is, no I don't think
>>>>>>>>>> we have to use FRBR as a strict guide.
>>>>>>>>>> 
>>>>>>>>>> -Ross.
>>>>>>>>>> 
>>>>>>>>>> On Jul 5, 2013, at 8:51 AM, James Weinheimer
>>>>>>>>>> <weinheimer.jim.l@gmail.com> wrote:
>>>>>>>>>> 
>>>>>>>>>>> On 05/07/2013 13:30, Ross Singer wrote:
>>>>>>>>>>> <snip>
>>>>>>>>>>>> 
>>>>>>>>>>>> I guess I don't understand why offering epub, pdf, and html
>>>>>>>>>>>> versions of the same resource doesn't constitute "items".
>>>>>>>>>>>> 
>>>>>>>>>>>> If you look at an article in arxiv.org, for example, where else
>>>>>>>>>>>> in WEMI would you put the available file formats?
>>>>>>>>>>>> 
>>>>>>>>>>>> Basically, format should be tied to the item, although for
>>>>>>>>>>>> physical items, any manifestation's item will generally be the
>>>>>>>>>>>> same format (although I don't see why a scan of a paperback would
>>>>>>>>>>>> become a new endeavor, honestly).
>>>>>>>>>>>> 
>>>>>>>>>>>> In the end, I don't see how digital is any different than print in
>>>>>>>>>>>> this regard.
>>>>>>>>>>> </snip>
>>>>>>>>>>> 
>>>>>>>>>>> Because manifestations are defined by their format (among other
>>>>>>>>>>> things). A movie of, e.g., Moby Dick on videocassette is therefore
>>>>>>>>>>> considered a different manifestation from the same movie on DVD,
>>>>>>>>>>> and each one is described separately. So, if you have multiple
>>>>>>>>>>> copies of the same format for the same content, those are called
>>>>>>>>>>> copies. But if you have different formats for the same content,
>>>>>>>>>>> those are different manifestations.
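
The grouping rule stated above can be sketched as follows. This is only an illustration of the argument, not a FRBR implementation: the `Item` fields, the `group_by_manifestation` helper, and the library names are all hypothetical, and a manifestation is reduced here to the pair (content, format).

```python
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple, Tuple


class Item(NamedTuple):
    content: str   # the expression being carried, e.g. one film of Moby Dick
    fmt: str       # carrier format, e.g. "videocassette" or "DVD"
    location: str  # where this particular copy is held


def group_by_manifestation(items: Iterable[Item]) -> Dict[Tuple[str, str], List[str]]:
    """Group copies under (content, format): same format means same manifestation."""
    groups: Dict[Tuple[str, str], List[str]] = defaultdict(list)
    for item in items:
        groups[(item.content, item.fmt)].append(item.location)
    return dict(groups)


# Made-up holdings, for illustration only.
holdings = [
    Item("Moby Dick (film)", "videocassette", "Library A"),
    Item("Moby Dick (film)", "videocassette", "Library B"),
    Item("Moby Dick (film)", "DVD", "Library A"),
]

manifestations = group_by_manifestation(holdings)
for (content, fmt), locations in manifestations.items():
    print(f"{content} [{fmt}]: {len(locations)} copies at {', '.join(locations)}")
```

Under this rule the two videocassettes collapse into one manifestation with two copies, while the DVD becomes a separate manifestation even though the content is the same.
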
>>>>>>>>>>> 
>>>>>>>>>>> The examples in arxiv.org are just like the ones I mentioned in
>>>>>>>>>>> archive.org, and they follow a different sort of structure. You do
>>>>>>>>>>> not see this in a library catalog, where each format will get a
>>>>>>>>>>> different manifestation, so that each format can be described.
>>>>>>>>>>> 
>>>>>>>>>>> As a result, things work quite differently. Look for e.g. Moby Dick
>>>>>>>>>>> in Worldcat, and you will see all kinds of formats available in the
>>>>>>>>>>> left-hand column.
>>>>>>>>>>> https://www.worldcat.org/search?qt=worldcat_org_all&q=moby+dick
>>>>>>>>>>> 
>>>>>>>>>>> When you click on an individual record,
>>>>>>>>>>> http://www.worldcat.org/oclc/62208367 you will see where all of the
>>>>>>>>>>> copies of this particular format of this particular expression are
>>>>>>>>>>> located. This is the manifestation. And its purpose is to organize
>>>>>>>>>>> all of the *copies*, as is done here.
>>>>>>>>>>> 
>>>>>>>>>>> In the IA, we see something different:
>>>>>>>>>>> http://archive.org/details/mobydickorwhale02melvuoft, where this
>>>>>>>>>>> display brings together the different manifestations: pdf, text,
>>>>>>>>>>> etc. There is no corresponding concept in FRBR for what we see in
>>>>>>>>>>> the Internet Archive, or in arxiv.org.
>>>>>>>>>>> 
>>>>>>>>>>> I am not complaining or finding fault, but what I am saying is that
>>>>>>>>>>> the primary reason this sort of thing works for digital materials
>>>>>>>>>>> is because there are no real "duplicates". (There are other serious
>>>>>>>>>>> problems that I won't mention here.) In my opinion, introducing the
>>>>>>>>>>> Internet Archive-type structure into a library-type catalog based
>>>>>>>>>>> on physical materials with multitudes of copies would result in a
>>>>>>>>>>> completely incoherent hash.
>>>>>>>>>>> 
>>>>>>>>>>> This is why I am saying that FRBR does not translate well to
>>>>>>>>>>> digital materials on the internet.
>>>>>>>>>>> 
>>>>>>>>>>> Getting rid of the concept of the "record" has been the supposed
>>>>>>>>>>> remedy, but it seems to me that the final result (i.e. what the
>>>>>>>>>>> user will experience) will still be the incoherent mash I mentioned
>>>>>>>>>>> above: where innumerable items and multiple manifestations will be
>>>>>>>>>>> mashed together. Perhaps somebody could come up with a way to make
>>>>>>>>>>> this coherent and useful, but I have never seen anything like it
>>>>>>>>>>> and cannot imagine how it could work.
>>>>>>>>>>> --
>>>>>>>>>>> *James Weinheimer* weinheimer.jim.l@gmail.com
>>>>>>>>>>> 
>>>>>>>>>>> *First Thus* http://catalogingmatters.blogspot.com/
>>>>>>>>>>> *First Thus Facebook Page* https://www.facebook.com/FirstThus
>>>>>>>>>>> *Cooperative Cataloging Rules*
>>>>>>>>>>> http://sites.google.com/site/opencatalogingrules/
>>>>>>>>>>> *Cataloging Matters Podcasts*
>>>>>>>>>>> http://blog.jweinheimer.net/p/cataloging-matters-podcasts.html
>>>>>> 
>>>>>> --
>>>>>> Karen Coyle
>>>>>> kcoyle@kcoyle.net http://kcoyle.net
>>>>>> 
>>>>>> ph: 1-510-540-7596
>>>>>> m: 1-510-435-8234
>>>>>> skype: kcoylenet
> 
> 
> 
> -- 
> Corey A Harper
> Metadata Services Librarian
> New York University Libraries
> 20 Cooper Square, 3rd Floor
> New York, NY 10003-7112
> 212.998.2479
> corey.harper@nyu.edu
Received on Friday, 5 July 2013 18:04:42 UTC