- From: Laura Dawson <Laura.Dawson@bowker.com>
- Date: Thu, 25 Sep 2014 15:06:25 +0000
- To: Bill Kasdorf <bkasdorf@apexcovantage.com>, "Todd Carpenter (Gmail)" <tcarpenter@niso.org>, Koji Ishii <kojiishi@gluesoft.co.jp>
- CC: Ivan Herman <ivan@w3.org>, "David (Standards) Singer" <singer@apple.com>, Laura Dawson <ljndawson@gmail.com>, Graham Bell <graham@editeur.org>, "Phil Madans" <Phil.Madans@hbgusa.com>, "W3C Public Digital Publishing IG Mailing List" <public-digipub-ig-comment@w3.org>
Amen, Bill. I could not agree more.

On 9/25/14, 11:04 AM, "Bill Kasdorf" <bkasdorf@apexcovantage.com> wrote:

>I also want to point out that what we really need is not just about books.
>
>Even though there has been frequent discussion on the IG about whether we can _focus_ on books (and the consensus, which I reluctantly went along with, is yes), for something this fundamental we really need to think in terms of a _publication_ or even a _resource_.
>
>Even in traditionally book-dominated sectors like educational publishing, there is a rapid movement away from the concept of a "book" at all. Professors increasingly are willing to let students use any of a range of "textbooks" as a resource for, say, calculus or microbiology, as long as they are useful and have information that is relevant to the course. Increasingly those "books" themselves are being deconstructed, and more importantly most big educational publishers are moving toward a vision in which they develop resources first and books (or parts of books) are just one of many ways of associating, combining, and distributing those resources. And that is done in the context of _all the other stuff out there_ (mostly but not exclusively on the Web).
>
>All that stuff has to be able to be identified, cited, annotated, etc. etc.
>
>I could have written that description just as well in the context of magazines, for which _exactly the same dynamic_ is happening. Right now.
>
>Same for scholarly/STM publishing (where publishing _data_--and citing datasets--is a very live issue). And even in the humanities, where "Digital Humanities" is becoming mainstream (and which is about "works" in the FRBR sense).
>
>And think of all the resources needed in corporate publishing, training, etc.
>
>All of that is "publishing." No publication exists in a closed system. It may think it is in a walled garden but there is a giant jungle outside its walls.
>
>I really think in the pursuit of this identifier issue we MUST take the broadest possible vision or we will come up with something that is useful in one sector (perhaps) but not truly interoperable in the publishing ecosystem and the web in general (the context in which the publishing ecosystem increasingly lives and works) and will thus ultimately prove inadequate.
>
>This is not to replace domain-specific or purpose-built identifiers like the DOI, the ISBN, etc.--those that, as Todd and others pointed out, have metadata and systems associated with them to DO THINGS. Any identifier we come up with should not make those obsolete and ideally should not conflict with them at all. It should make them more interoperable and more useful. This is not a Battle of Identifiers, and those who think One and Only One Identifier is the goal are mistaken. Many identifiers are needed because we need to do many different things with them.
>
>But the identifier we are looking for here--enabling annotation and a myriad other related things on the Web (citation, previews, chunking, etc.)--needs to be radically widely applicable, completely agnostic as to the type of publication or resource it identifies, the format in which that publication or resource is disseminated, and yet durable, persistent, and reliable across formats and across time.
>
>--Bill Kasdorf
>
>-----Original Message-----
>From: Laura Dawson [mailto:Laura.Dawson@bowker.com]
>Sent: Thursday, September 25, 2014 9:01 AM
>To: Todd Carpenter (Gmail); Koji Ishii
>Cc: Ivan Herman; David (Standards) Singer; Laura Dawson; Bill Kasdorf; Graham Bell; Phil Madans; W3C Public Digital Publishing IG Mailing List
>Subject: Re: As an aside, a possibly interesting read....
>
>Todd, I think you're absolutely right about the difference between librarianship and the trade. It has been the function of libraries to archive, curate, and canonize information since their inception.
>Trade is about one thing and one thing only - sales. In building infrastructure, we need to support both. What both have in common is a need for effective discovery - directing a reader to the book they want. So much of the metadata will be shared in common - that which describes the book; the metadata describing the terms by which a reader may have it will differ depending on... well, the terms - the environment in which the reader is discovering the book.
>
>That all said, I can envision a world where - for the purposes of curation and archiving - there exists a "canonical" version of a book at a URI that could well consist of the ISBN for that book (as Koji described), but if you want to own the book, you are directed to whichever platforms support it, and you choose which one you want to read on. But that presupposes an authority to govern that system. I would say the ISBN-International Agency could be that authority, but there is one important issue that prevents that - no publisher is required to report back to ISBN-IA which ISBNs get assigned to which books. ISBNs are issued in blocks - and in the case of larger publishers, many never see the light of day. ISBN-IA does not maintain a database of the ISBNs that get assigned - that is down to the registration agencies (such as Bowker, Nielsen, national libraries). And the publishers don't always report back to the RAs which numbers they are assigning to which things.
>
>Also to be considered - in a world of self-publishing, ISBNs frequently are not assigned at all. Books are available in proprietary systems only (Kindle), and not easily discoverable. Amazon is said to be publishing about 2000 of these per week. We have no idea what they are, if they are books or "shorts", fiction, memoir, cookbooks - only Amazon has that data, and the data is provided by author/publishers who are not necessarily familiar with metadata conventions and effective description.
>
>So, to be succinct, whether distributed or centralized, we need to break down the specific problems based on audience and the pain we're trying to solve. Probably won't be a single solution.
>
>On 9/25/14, 2:58 AM, "Todd Carpenter (Gmail)" <tcarpenter@niso.org> wrote:
>
>>There is a tremendous problem with distributed systems when it comes to canonical information and standard identifiers. That being the metadata that is associated with that identifier. An identifier is (or better put should be) just a dumb (i.e., without embedded meaning), unique string of characters. The structure of that string, while systematically important, is beside the point. Whether an identifier is expressed as a 16-digit string, or as a URI or anything else is not finally the point.
>>
>>The real power is in the associated metadata related to that identifier. While there is tremendous overhead in centralized systems, they are critically important in a well-functioning ID system. Without a controlling system, there will be no standard set of associated metadata. Now, how well that metadata is created, managed, curated and controlled are open questions (as Laura certainly knows), but without some authority driving compliance then inevitably there will be an increasing divergence of metadata quality, practice and interoperability.
>>
>>Also to Ivan's question about work-level IDs, there is work being done by OCLC to develop a true FRBR Work-level identifier based on their data store of libraries' bibliographic data. This ID is derived by analysis of the collection once the items are released then catalogued. I am not certain that a similar level work ID would be possible in trade, outside of being done by the author, agent or rights manager to truly combine all of the works (in a FRBR sense) under a single ID.
>>Identifying, say, the hardcover book of a story, the comic book version of that same story, the Blu-ray disc of that story, the Broadway play of that story, and the Swedish translation of the book into a single Work-level ID is only something that can be done after the fact, because their expressions are very, very different. The closest that we might come to identifying that pre-production is to ID the rights associated with a particular intellectual property. And while it may be useful in practice, I don't know it would be useful in application. Which, I expect, in the end would only serve the purpose of making lots of IP lawyers very wealthy.
>>
>>Todd
>>
>>On Sep 25, 2014, at 5:07 AM, Koji Ishii <kojiishi@gluesoft.co.jp> wrote:
>>
>>> Maybe this was already discussed, but I'm in favor of a distributed ID system rather than a single, central system.
>>>
>>> Take DNS, or Java namespaces. Their prefixes come from domain names that authors own, which are unique; authors can then define the rest however they like. If a publisher wants to use ISBN, they could use, for instance, <epub://isbn-international.org/123456789>.
>>>
>>> Since what we want is to identify publications, as long as authors or publications agree to use consistent domains/postfixes, I guess we can guarantee the uniqueness.
>>>
>>> Maybe there are more use cases for the ID than identifying publications? The use cases I have in mind are for links between publications and OA, which I think a distributed system can handle.
>>>
>>> /koji
>>>
>>> On Sep 25, 2014, at 12:51 PM, Ivan Herman <ivan@w3.org> wrote:
>>>
>>>> On 24 Sep 2014, at 23:14 , Laura Dawson <Laura.Dawson@bowker.com> wrote:
>>>>
>>>>> True. It's a cluttered road.
>>>>
>>>> We are in a really dangerous business!
>>>>
>>>> Ivan
>>>>
>>>>> On 9/24/14, 5:12 PM, "David (Standards) Singer" <singer@apple.com> wrote:
>>>>>
>>>>>> On Sep 24, 2014, at 12:16 , LAURA DAWSON <ljndawson@gmail.com> wrote:
>>>>>>
>>>>>>> Yes, Bowker were a DOI registration agency and I can tell you that the associated systems and metadata were the primary reason DOIs for trade books (as opposed to STEM/scholarly) never took off.
>>>>>>>
>>>>>>> So you see, Ivan, the road to book URIs is littered with a couple of corpses.
>>>>>>
>>>>>> It's not just books. I was on a project that needed something for recordings many years ago, and that road was also strewn with corpses.
>>>>>>
>>>>>>> On 9/24/14, 3:13 PM, "Bill Kasdorf" <bkasdorf@apexcovantage.com> wrote:
>>>>>>>
>>>>>>>> Actually, the DOI _is_ used for this, mainly by scholarly/STM publishers, as well as for chapters of books--typically one DOI for the book and a DOI for each chapter (and sometimes DOIs at even lower component levels, most often for figures and tables). And these are _agnostic_ as to format, they typically mean "the book" and "the chapter" in the abstract sense. When you click on one of these DOIs you are usually then given your choice of what format, whether you have access, how to obtain access, etc.
>>>>>>>>
>>>>>>>> But it requires the associated systems, metadata, registration agency, etc. to make it work. To belabor a point, though, in that context it does work. There are a gazillion of them. The whole scholarly/STM ecosystem is now dependent on DOIs.
>>>>>>>>
>>>>>>>> Those that use the DOI for this use CrossRef DOIs, which _should_ be expressed as URIs (and increasingly are).
>>>>>>>>
>>>>>>>> But all that is purely under the control of the publisher (including what the DOI links to and what that destination provides--not necessarily the content itself); it doesn't address "work" in the way librarians mean "work," and it requires the systems I mentioned (including the Handle system on which DOI is based). It would not work for our need to point to the "work itself" or some component of the work. So the answer in a purely standard web-world sense is still no.
>>>>>>>>
>>>>>>>> --Bill K
>>>>>>>>
>>>>>>>> -----Original Message-----
>>>>>>>> From: Laura Dawson [mailto:Laura.Dawson@bowker.com]
>>>>>>>> Sent: Wednesday, September 24, 2014 2:55 PM
>>>>>>>> To: Ivan Herman; Graham Bell
>>>>>>>> Cc: Laura Dawson; Phil Madans; Bill Kasdorf; W3C Public Digital Publishing IG Mailing List
>>>>>>>> Subject: Re: As an aside, a possibly interesting read....
>>>>>>>>
>>>>>>>> As it stands now, no. So a book's "home" on the web (regardless of edition) is not standardizable at this point unless you want to go down the DOI road (please let's not go down the DOI road).
>>>>>>>>
>>>>>>>> On 9/24/14, 4:13 AM, "Ivan Herman" <ivan@w3.org> wrote:
>>>>>>>>
>>>>>>>>> Thanks for all the interesting discussion...
>>>>>>>>>
>>>>>>>>> However: all this is to say that there does not seem to be any existing (and viable) option to uniquely identify (preferably through a URI) a 'work' (whether in the ISTC or the FRBR sense). Which is a problem for metadata as well as for archiving. :-( Tell me I am wrong, please...
>>>>>>>>>
>>>>>>>>> Ivan
>>>>>>>>>
>>>>>>>>> On 24 Sep 2014, at 24:19 , Graham Bell <graham@editeur.org> wrote:
>>>>>>>>>
>>>>>>>>>> And they can be treated this way in ONIX too.
>>>>>>>>>> As I said,
>>>>>>>>>>
>>>>>>>>>>> they are not (strictly) an attribute of the ISBN, though they may be presented as such in various systems
>>>>>>>>>>
>>>>>>>>>> G
>>>>>>>>>>
>>>>>>>>>> NB repeatable because the ISBN is associated directly with only one work, but can be indirectly associated (through that work) with several other works.
>>>>>>>>>>
>>>>>>>>>> On 23 Sep 2014, at 21:12, LAURA DAWSON wrote:
>>>>>>>>>>
>>>>>>>>>>> Yes, even at Bowker we made them a repeatable attribute on the ISBN record.
>>>>>>>>>>>
>>>>>>>>>>> From: "Madans, Phil" <Phil.Madans@hbgusa.com>
>>>>>>>>>>> Date: Tuesday, September 23, 2014 at 3:13 PM
>>>>>>>>>>> To: Laura Dawson <ljndawson@gmail.com>, Graham Bell <graham@editeur.org>, Bill Kasdorf <bkasdorf@apexcovantage.com>, Ivan Herman <ivan@w3.org>, W3C Public Digital Publishing IG Mailing List <public-digipub-ig-comment@w3.org>
>>>>>>>>>>> Subject: Re: As an aside, a possibly interesting read....
>>>>>>>>>>>
>>>>>>>>>>> I stand corrected on the assignment of the ISTC. Bad choice of words. I was speaking more on how I would have to manage them internally on the systems level -- that's how I think about these things -- and that would be as an attribute. That all depends on how titles systems are structured, and I'm not saying ours is the best way to do things, but I think the way we do it is how most do it these days. From a practical standpoint, I'm not sure how else I could handle them. If I publish an English and Spanish edition of a work, and the ISTCs are different, then they would be attributes of the ISBNs so that I could keep them linked internally. We are already doing this, as is most everyone else, and I think that is why the ISTC was such a hard sell.
>>>>>>>>>>>
>>>>>>>>>>> ------------------------------------------------------------
>>>>>>>>>>> Phil Madans | Executive Director of Digital Publishing Technology | Hachette Book Group | 237 Park Avenue NY 10017 | 212-364-1415 | phil.madans@hbgusa.com
>>>>>>>>>>>
>>>>>>>>>>> From: LAURA DAWSON <ljndawson@gmail.com>
>>>>>>>>>>> Date: Tuesday, September 23, 2014 at 2:22 PM
>>>>>>>>>>> To: Graham Bell <graham@editeur.org>, Phil Madans <phil.madans@hbgusa.com>, Bill Kasdorf <bkasdorf@apexcovantage.com>, Ivan Herman <ivan@w3.org>, W3C Public Digital Publishing IG Mailing List <public-digipub-ig-comment@w3.org>
>>>>>>>>>>> Subject: Re: As an aside, a possibly interesting read....
>>>>>>>>>>>
>>>>>>>>>>> Bowker was an ISTC registration agency until recently. We pulled out because of the lack of support in the US, and refer the few curious to Nielsen.
>>>>>>>>>>>
>>>>>>>>>>> From: Graham Bell <graham@editeur.org>
>>>>>>>>>>> Date: Tuesday, September 23, 2014 at 2:09 PM
>>>>>>>>>>> To: Phil Madans <Phil.Madans@hbgusa.com>, Laura Dawson <ljndawson@gmail.com>, Bill Kasdorf <bkasdorf@apexcovantage.com>, Ivan Herman <ivan@w3.org>, W3C Public Digital Publishing IG Mailing List <public-digipub-ig-comment@w3.org>
>>>>>>>>>>> Subject: Re: As an aside, a possibly interesting read....
>>>>>>>>>>>
>>>>>>>>>>> What Phil and Laura have written certainly summarises -- and illustrates -- the debate over identifiers.
>>>>>>>>>>>
>>>>>>>>>>> But the text below (from Phil) is a little misleading.
>>>>>>>>>>>
>>>>>>>>>>>> Whether an ISTC is a real work Identifier or not is a matter of debate. I disagree that it is. It is actually an attribute of the ISBN -- that is how they are assigned.
>>>>>>>>>>>> Different ISBNs of the same master content might have different ISTCs.
>>>>>>>>>>>> Translations for instance.
>>>>>>>>>>>
>>>>>>>>>>> The 'rules' of the ISTC say that translations are by definition different works, and MUST have different ISTCs (though those ISTCs will be related to each other -- one is a 'derived work', and this close relationship is recorded in the registration metadata for the ISTCs themselves). This contrasts with library practice, where 'work' is something at a higher level and two translations are actually termed two 'expressions' of the same 'work'. In library terms, the ISTC is an expression identifier. See the attached PDF (a slide from a training session that I deliver fairly regularly) for a summary of how the <indecs> model on which ISTC and ONIX are based compares with the FRBR library model. There is -- as far as I know -- no public identifier that works at the FRBR:work level, though libraries may have internal IDs.
>>>>>>>>>>>
>>>>>>>>>>> And I'm pretty sure ISTCs can be assigned without an ISBN (and without any product ID at all, in fact) -- they are not (strictly) an attribute of the ISBN, though they may be presented as such in various systems. They can be registered based on a manuscript, prior to there being a product.
>>>>>>>>>>>
>>>>>>>>>>> On the other hand, there's no doubt that ISTC has so far proved unpopular among publishers, for some of the reasons Laura and Phil list, and its actual usage is minimal.
>>>>>>>>>>>
>>>>>>>>>>> Graham
>>>>>>>>>>>
>>>>>>>>>>> Graham Bell
>>>>>>>>>>> EDItEUR
>>>>>>>>>>>
>>>>>>>>>>> Tel: +44 20 7503 6418
>>>>>>>>>>> Mob: +44 7887 754958
>>>>>>>>>>>
>>>>>>>>>>> EDItEUR Limited is a company limited by guarantee, registered in England no 2994705. Registered Office: United House, North Road, London N7 9DP, UK. Website: http://www.editeur.org
>>>>>>>>>>>
>>>>>>>>>>> This may contain confidential material. If you are not an intended recipient, please notify the sender, delete immediately, and understand that no disclosure or reliance on the information herein is permitted.
>>>>>>>>>>> Hachette Book Group may monitor email to and from our network.
>>>>>>>>>
>>>>>>>>> ----
>>>>>>>>> Ivan Herman, W3C
>>>>>>>>> Digital Publishing Activity Lead
>>>>>>>>> Home: http://www.w3.org/People/Ivan/
>>>>>>>>> mobile: +31-641044153
>>>>>>>>> GPG: 0x343F1A3D
>>>>>>>>> WebID: http://www.ivan-herman.net/foaf#me
>>>>>>
>>>>>> David Singer
>>>>>> Manager, Software Standards, Apple Inc.
>>>>
>>>> ----
>>>> Ivan Herman, W3C
>>>> Digital Publishing Activity Lead
>>>> Home: http://www.w3.org/People/Ivan/
>>>> mobile: +31-641044153
>>>> GPG: 0x343F1A3D
>>>> WebID: http://www.ivan-herman.net/foaf#me
Received on Thursday, 25 September 2014 15:07:00 UTC