- From: Tom Adamich <vls@tusco.net>
- Date: Sat, 6 Jul 2013 07:11:09 -0400
- To: <kcoyle@kcoyle.net>, <public-schemabibex@w3.org>
- Cc: "'Young,Jeff (OR)'" <jyoung@oclc.org>, "'Wallis,Richard'" <Richard.Wallis@oclc.org>, <David.Newman@wellsfargo.com>, <godby@oclc.org>, <em@zepheira.com>
Thanks, Karen, for leading this discussion back to the "library-centric" mission of both SchemaBibEx and BIBFRAME. Yes, the metadata has the potential to be leveraged in other environments (including commercial enterprise); however, I agree with your request to remain on task, and with your reminder of the timeframe associated with this group's work.

...Lead on :)

Tom

Tom Adamich, MLS
President
Visiting Librarian Service
P.O. Box 932
New Philadelphia, OH 44663
330-364-4410
vls@tusco.net

-----Original Message-----
From: Karen Coyle [mailto:kcoyle@kcoyle.net]
Sent: Friday, July 05, 2013 5:35 PM
To: public-schemabibex@w3.org
Subject: Re: Kill the Record! (Was: BIBFRAME and schema.org)

Corey,

I share your fear about over-engineering. I tend to put use of productOntology in that category, though, because the examples I've seen make use of greater detail than I think we currently represent in library data online -- and I'm not convinced that more detail is needed. Users seem to care about whether something is print, online, or on disk (DVD, CD). We've started mixing books and articles (print and online) in our discovery systems, and users seem comfortable with that. I suspect that they favor "can I get it now?" as a primary selection criterion. Hardback and paperback? Not so much.

This is why I'd like to understand better what publishers need, since they have a different use case: different versions and formats have different prices, and they need to show that. For a library, I doubt that "paperback" and "hardback" are deciding selection factors for users. When I see examples that have these in them it is a bit jarring, especially since that data isn't reliably coded in our records.

I would prefer to initially base schema.org thinking on library *displays* rather than library *records*. It's rather astonishing how little of what is coded in MARC ends up on the screen in the basic user displays, as well as how little of it feeds indexing.
I second an earlier comment by Ed Summers that we should concentrate on what we can do today with schema.org, and add to it as library data online undergoes changes that require new capabilities. Current displays are a place to start, and once we have conquered those we can move on. Remember, this group is supposed to disband in Fall of 2013.

Thus, once again, can we look at holdings displays and come up with a reasonable solution? I think that schema.org has a good 90% or more of what we need for basic bibliographic description. But getting users to library holdings isn't yet covered.

kc

On 7/5/13 1:16 PM, Corey A Harper wrote:
> Hi Karen,

> I take your point, and agree that it's really a question of what we intend to convey. I just worry very much that this group has been inclined to over-engineer much of this, and as a result will render it not very useful to anyone outside of a very small group -- ostensibly the same very small group that are perfectly comfortable with MARC now. If that's what we're trying to do, then honestly, my vote becomes to just stick with MARC -- we don't gain much if we decide to build something new from whole cloth instead of looking seriously at the patterns that others--those we want to work with--are already using. That said, I checked some schema.org deployments of books (kmart & B&N) and found no product typing at all, so it could be that common usage hasn't been established yet.

> I agree re: availability of statistics. I suspect we may have to rely on ourselves for that.
> I often mention commoncrawl here, but will again, as they make 40 TB worth of data from over 5 billion web pages available, have it hosted on AWS, and even provide tutorials for running EC2 Map Reduce jobs against it:
> http://aws.amazon.com/datasets/41740
> http://commoncrawl.org/mapreduce-for-the-masses/

> I suspect searching for the productontology.org prefix somewhere in microdata or rdfa across the full set would probably cost a couple hundred bucks on EC2, though. If someone had 40TB of space kicking around in a hadoop cluster of their own, though....

> My gut feeling, regardless, is that YES, we should use that "Monographic Series" article, as well as others. If we make this a prominent usage pattern, I believe the library community will spend the time cleaning these articles up, and adding new ones where there are gaps. Perhaps in the process we make both Wikipedia AND the Product Ontology AND schema.org better than they are now.

> -Corey

> On Fri, Jul 5, 2013 at 3:01 PM, Karen Coyle <kcoyle@kcoyle.net> wrote:
> > Corey, I don't think that what I propose is "non-conforming." I think we need to make choices amongst the conforming ones. I assume that we will be making some kind of cross-walk from library data to schema.org, and that best practice will be that coded format x (e.g. from the LDR or 007 in MARC) will have a defined value in schema.org that means approximately the same thing. Do we choose "paperback", "mass paperback" or just "book"? It really is a question of what we intend to convey with the schema.org data, what we see it linking to most usefully, what is most accurate, and what is going to be easiest to produce.
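The cross-walk described in the paragraph above can be illustrated with a minimal sketch. It assumes the inputs are MARC Leader/06 (type of record), Leader/07 (bibliographic level) and 007/00 (category of material); the target schema.org types and the `crosswalk` function itself are hypothetical choices made for illustration, not an agreed best practice. It deliberately stops at "Book" plus an optional EBook format, leaving the paperback/hardback question open.

```python
from typing import Dict, Optional

SCHEMA = "http://schema.org/"

# Leader/06 codes are standard MARC 21; the schema.org targets on the
# right are assumptions made for this sketch only.
LEADER06_TO_TYPE = {
    "a": "Book",            # language material
    "e": "Map",             # cartographic material
    "g": "Movie",           # projected medium (approximate match)
    "j": "MusicRecording",  # musical sound recording
}


def crosswalk(leader: str, field_007: Optional[str] = None) -> Dict[str, str]:
    """Map a MARC leader (and optional 007 field) to coarse schema.org values."""
    if len(leader) < 8:
        raise ValueError("MARC leader too short to read positions 06 and 07")
    record_type, bib_level = leader[6], leader[7]
    schema_type = LEADER06_TO_TYPE.get(record_type, "CreativeWork")
    # Language material that is not a monograph (e.g. a serial) is not a Book.
    if schema_type == "Book" and bib_level != "m":
        schema_type = "CreativeWork"
    result = {"@type": SCHEMA + schema_type}
    # 007/00 == "c" marks an electronic resource.
    if schema_type == "Book" and field_007 and field_007[0] == "c":
        result["bookFormat"] = SCHEMA + "EBook"
    return result


if __name__ == "__main__":
    print(crosswalk("00000nam a2200000 a 4500"))
    print(crosswalk("00000nam a2200000 a 4500", "cr"))
```

Anything the table does not recognize falls back to the generic CreativeWork type, which keeps the output conforming while the group decides how much detail is worth conveying.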
> > As an example, if you look at that list on WP you see that it has "book series", which is primarily what libraries would call "readers' series" - Harry Potter, "A is for Alibi...," "Narnia", etc. So although it says "series" it isn't the same as what is in an 8XX field. There IS an article for "monographic series". The monographic series article is pretty piss-poor, however, and needs a serious amount of work. Should we use it as is? Does it represent the same concept as the 8XX fields?

> > I love WP, I do, but there's a great variation in the quality of the pages. Nothing on WP can be taken at face value - we need to be smart about it, and even pro-active, if we are to take WP links to be *definitional* of our data elements. I'm not comfortable with assuming that any page on WP is by definition authoritative. (I'm in the midst of a huge revision of the DDC pages which were TOTALLY inaccurate, so this is something I'm painfully aware of at the moment.) In addition, we will have to make choices when WP divides the world differently from us.

> > Finally, although productontology is available for use, it isn't the only possibility. I know that Jeff favors it, but we need to keep an eye on practice to see if it becomes standard practice, and if it is used by search engines. I hope that some statistics will be available that provide guidance.

> > kc

> > On 7/5/13 10:57 AM, Corey A Harper wrote:
> > > Hi Karen,

> > > Can you say a bit more about "I'm not convinced, having looked at some of the pages, that WP shares the conceptual model that we'll find in our data."? I'm not sure I understand what problems you foresee, nor what you believe the ramifications of those problems to be.

> > > I struggle with the idea that "..we then need to develop some best practices for library data, knowing that non-library data will take its own direction."
> > > I'm rather averse to maintaining our own little, non-conforming corner of the Web without a really clear understanding of the impact--on users--of this perceived conceptual incompatibility.

> > > Thanks,
> > > -Corey

> > > On Fri, Jul 5, 2013 at 1:47 PM, Karen Coyle <kcoyle@kcoyle.net> wrote:
> > > > Yes, Jeff, I realize that. I had rather hoped for a link that you had found useful for books, like:
> > > > http://en.wikipedia.org/wiki/Category:Books_by_type

> > > > Naturally, this is a mish-mosh of physical types (paperback), product types (mass-market paperback), genres (airport novel) and topics (book size). I don't know if there is a better approach within WP.

> > > > While it is great that these Wikipedia pages exist, I think before using them we should look beyond their titles to the content of the pages to make sure that WP and our metadata are talking about the same thing. I'm not convinced, having looked at some of the pages, that WP shares the conceptual model that we'll find in our data. With that as a starting point, we then need to develop some best practices for library data, knowing that non-library data will take its own direction.

> > > > I would like to hear from anyone in the publishing community about their needs for specification of product types. I assume that the preferred list would originate in ONIX.

> > > > kc

> > > > On 7/5/13 8:50 AM, Young,Jeff (OR) wrote:
> > > > > You can think of the option like this: Anything in Wikipedia can be treated as an owl:Class by changing the URI prefix.
> > > > > For example, this Wikipedia page describes murals:
> > > > > http://en.wikipedia.org/wiki/Mural

> > > > > In contrast, you can say something *is* a mural by using this hacked URI in an rdf:type:
> > > > > http://www.productontology.org/id/Mural

> > > > > Jeff

> > > > > Sent from my iPad

> > > > > On Jul 5, 2013, at 11:42 AM, "Karen Coyle" <kcoyle@kcoyle.net> wrote:
> > > > > > What are the options provided by productontology?

> > > > > > kc

> > > > > > On 7/5/13 8:26 AM, Young,Jeff (OR) wrote:
> > > > > > > True. This list has always seemed simplistic to me, though. As you've suggested, EBook in particular deserves to be treated as a class so more detailed properties can be included. The other two are just the tip of the iceberg.

> > > > > > > Sent from my iPad

> > > > > > > On Jul 5, 2013, at 11:20 AM, "Karen Coyle" <kcoyle@kcoyle.net> wrote:
> > > > > > > > Note that schema.org has http://schema.org/BookFormatType, which has
> > > > > > > > Ebook
> > > > > > > > Hardback
> > > > > > > > Paperback

> > > > > > > > kc

> > > > > > > > On 7/5/13 7:43 AM, Young,Jeff (OR) wrote:
> > > > > > > > > For paperbacks and similar things, I've started using Product Ontology to tag the item/manifestation descriptions, for example:

> > > > > > > > > @prefix schema: <http://schema.org/> .
> > > > > > > > > @prefix pto: <http://www.productontology.org/id/> .

> > > > > > > > > :book1
> > > > > > > > >     a schema:Book, schema:ProductModel, pto:Paperback ;
> > > > > > > > >     etc.

> > > > > > > > > The coverage isn't perfect, but it has the advantage of being backed up by Wikipedia.

> > > > > > > > > Jeff

> > > > > > > > > Sent from my iPad

> > > > > > > > > On Jul 5, 2013, at 10:35 AM, "Ross Singer" <rxs@talis.com> wrote:
> > > > > > > > > > On Jul 5, 2013, at 10:25 AM, "Young,Jeff (OR)" <jyoung@oclc.org> wrote:
> > > > > > > > > > > Aside, I would argue that the defining characteristic of Item is that it has "location". For physical items that location can be determined by geolocation (for example). For Web items (aka Web documents), the location can be determined by its URL.

> > > > > > > > > > +1

> > > > > > > > > > I would say there are arguably more defining characteristics than that (I'm still going to argue that "paperback" isn't actually a part of the manifestation, simply an inference of the sum of the format of the items), but this, I would argue, is definitely the least common denominator and applies well for our entity model in schema.org.

> > > > > > > > > > -Ross.
> > > > > > > > > > > Jeff

> > > > > > > > > > > Sent from my iPad

> > > > > > > > > > > On Jul 5, 2013, at 9:55 AM, "Ross Singer" <rxs@talis.com> wrote:
> > > > > > > > > > > > But this is all really "how many angels can fit on the head of a pin", isn't it?

> > > > > > > > > > > > We've already established that we're not interested in defining any strict interpretation of FRBR in schema.org: we're just trying to define a way to describe things in HTML that computers can parse.

> > > > > > > > > > > > Yes, I think we need to establish what an item is; no, I don't think we have to use FRBR as a strict guide.

> > > > > > > > > > > > -Ross.

> > > > > > > > > > > > On Jul 5, 2013, at 8:51 AM, James Weinheimer <weinheimer.jim.l@gmail.com> wrote:
> > > > > > > > > > > > > On 05/07/2013 13:30, Ross Singer wrote:
> > > > > > > > > > > > > <snip>
> > > > > > > > > > > > > > I guess I don't understand why offering epub, pdf, and html versions of the same resource doesn't constitute "items".

> > > > > > > > > > > > > > If you look at an article in arxiv.org, for example, where else in WEMI would you put the available file formats?
> > > > > > > > > > > > > > Basically, format should be tied to the item, although for physical items, any manifestation's item will generally be the same format (although I don't see why a scan of a paperback would become a new endeavor, honestly).

> > > > > > > > > > > > > > In the end, I don't see how digital is any different than print in this regard.
> > > > > > > > > > > > > </snip>

> > > > > > > > > > > > > Because manifestations are defined by their format (among other things). Therefore, a movie of, e.g. Moby Dick that is a videocassette is considered to be a different manifestation from that of a DVD. Each one is described separately. So, if you have multiple copies of the same format for the same content those are called copies. But if you have different formats for the same content, those are different manifestations.

> > > > > > > > > > > > > The examples in arxiv.org are just like I mentioned in archive.org and they follow a different sort of structure. You do not see this in a library catalog, where each format will get a different manifestation, so that each format can be described.

> > > > > > > > > > > > > As a result, things work quite differently. Look for e.g. Moby Dick in Worldcat, and you will see all kinds of formats available in the left-hand column.
> > > > > > > > > > > > > https://www.worldcat.org/search?qt=worldcat_org_all&q=moby+dick

> > > > > > > > > > > > > When you click on an individual record, http://www.worldcat.org/oclc/62208367 you will see where all of the copies of this particular format of this particular expression are located. This is the manifestation. And its purpose is to organize all of the *copies*, as is done here.

> > > > > > > > > > > > > In the IA, we see something different: http://archive.org/details/mobydickorwhale02melvuoft, where this display brings together the different manifestations: pdf, text, etc. There is no corresponding concept in FRBR for what we see in the Internet Archive, or in arxiv.org.

> > > > > > > > > > > > > I am not complaining or finding fault, but what I am saying is that the primary reason this sort of thing works for digital materials is because there are no real "duplicates". (There are other serious problems that I won't mention here.) In my opinion, introducing the Internet Archive-type structure into a library-type catalog based on physical materials with multitudes of copies would result in a completely incoherent hash.

> > > > > > > > > > > > > This is why I am saying that FRBR does not translate well to digital materials on the internet.
> > > > > > > > > > > > > Getting rid of the concept of the "record" has been the supposed remedy, but it seems to me that the final result (i.e. what the user will experience) will still be the incoherent mash I mentioned above: where innumerable items and multiple manifestations will be mashed together. Perhaps somebody could come up with a way to make this coherent and useful, but I have never seen anything like it and cannot imagine how it could work.

> > > > > > > > > > > > > --
> > > > > > > > > > > > > *James Weinheimer* weinheimer.jim.l@gmail.com
> > > > > > > > > > > > > *First Thus* http://catalogingmatters.blogspot.com/
> > > > > > > > > > > > > *First Thus Facebook Page* https://www.facebook.com/FirstThus
> > > > > > > > > > > > > *Cooperative Cataloging Rules* http://sites.google.com/site/opencatalogingrules/
> > > > > > > > > > > > > *Cataloging Matters Podcasts* http://blog.jweinheimer.net/p/cataloging-matters-podcasts.html

> > > > > > > > --
> > > > > > > > Karen Coyle
> > > > > > > > kcoyle@kcoyle.net
> > > > > > > > http://kcoyle.net
> > > > > > > > ph: 1-510-540-7596
> > > > > > > > m: 1-510-435-8234
> > > > > > > > skype: kcoylenet

> > > > > > --
> > > > > > Karen Coyle
> > > > > > kcoyle@kcoyle.net http://kcoyle.net
> > > > > > ph: 1-510-540-7596
> > > > > > m: 1-510-435-8234
> > > > > > skype: kcoylenet

> > > > --
> > > > Karen Coyle
> > > > kcoyle@kcoyle.net http://kcoyle.net
> > > > ph: 1-510-540-7596
> > > > m: 1-510-435-8234
> > > > skype: kcoylenet

> > > --
> > > Corey A Harper
> > > Metadata Services Librarian
> > > New York University Libraries
> > > 20 Cooper Square, 3rd Floor
> > > New York, NY 10003-7112
> > > 212.998.2479
> > > corey.harper@nyu.edu

> > --
> > Karen Coyle
> > kcoyle@kcoyle.net http://kcoyle.net
> > ph: 1-510-540-7596
> > m: 1-510-435-8234
> > skype: kcoylenet

> --
> Corey A Harper
> Metadata Services Librarian
> New York University Libraries
> 20 Cooper Square, 3rd Floor
> New York, NY 10003-7112
> 212.998.2479
> corey.harper@nyu.edu

--
Karen Coyle
kcoyle@kcoyle.net http://kcoyle.net
ph: 1-510-540-7596
m: 1-510-435-8234
skype: kcoylenet
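
The URI-prefix option that Jeff Young describes in the quoted thread (a Wikipedia page URL becomes a class URI usable in an rdf:type) can be sketched as a small helper. The two URI prefixes come from his Mural example; the function names `wikipedia_to_pto` and `type_statement` are hypothetical, and the sketch assumes English Wikipedia article URLs only. It does not check that the article exists or that its content matches the intended library concept, which is the concern raised earlier in the thread.

```python
from urllib.parse import urlsplit

WIKIPEDIA_HOST = "en.wikipedia.org"
WIKIPEDIA_PATH_PREFIX = "/wiki/"
PTO_PREFIX = "http://www.productontology.org/id/"


def wikipedia_to_pto(url: str) -> str:
    """Swap the Wikipedia URL prefix for the Product Ontology class prefix."""
    parts = urlsplit(url)
    if parts.netloc != WIKIPEDIA_HOST or not parts.path.startswith(WIKIPEDIA_PATH_PREFIX):
        raise ValueError("not an English Wikipedia article URL: " + url)
    title = parts.path[len(WIKIPEDIA_PATH_PREFIX):]
    if not title:
        raise ValueError("URL has no article title: " + url)
    return PTO_PREFIX + title


def type_statement(subject_uri: str, wikipedia_url: str) -> str:
    """Build one Turtle statement typing the subject with the derived class."""
    return "<{0}> a <{1}> .".format(subject_uri, wikipedia_to_pto(wikipedia_url))


if __name__ == "__main__":
    # The subject URI below is a made-up placeholder.
    print(type_statement("http://example.org/book1",
                         "http://en.wikipedia.org/wiki/Paperback"))
```

Because the mapping is purely syntactic, any quality problem in the underlying article carries straight through to the class it is used to define.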
Received on Saturday, 6 July 2013 11:12:02 UTC