Re: As an aside, a possibly interesting read.... from Todd Carpenter (Gmail) on 2014-09-25 (public-digipub@w3.org from September 2014)

From: Todd Carpenter (Gmail) <tcarpenter@niso.org>
Date: Thu, 25 Sep 2014 07:58:38 +0100
To: Koji Ishii <kojiishi@gluesoft.co.jp>
Cc: Ivan Herman <ivan@w3.org>, Laura Dawson <Laura.Dawson@bowker.com>, "David (Standards) Singer" <singer@apple.com>, Laura Dawson <ljndawson@gmail.com>, Bill Kasdorf <bkasdorf@apexcovantage.com>, Graham Bell <graham@editeur.org>, Phil Madans <Phil.Madans@hbgusa.com>, W3C Public Digital Publishing IG Mailing List <public-digipub-ig-comment@w3.org>
Message-Id: <D0B57788-2BAD-482F-8CDA-B290857EBA62@niso.org>
There is a tremendous problem with distributed systems when it comes to canonical information and standard identifiers: the metadata that is associated with the identifier.  An identifier is (or, better put, should be) just a dumb (i.e., without embedded meaning), unique string of characters. The structure of that string, while systematically important, is beside the point. Whether an identifier is expressed as a 16-digit string, as a URI, or as anything else is ultimately not the point.

The real power is in the metadata associated with that identifier. While a centralized system carries tremendous overhead, such systems are critically important to a well-functioning ID system. Without a controlling system, there will be no standard set of associated metadata.  Now, how well that metadata is created, managed, curated and controlled are open questions (as Laura certainly knows), but without some authority driving compliance there will inevitably be an increasing divergence in metadata quality, practice and interoperability.
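
To make the point concrete, here is a minimal sketch in Python of what I mean, not a description of any real registry. The names Registry, Record and REQUIRED_FIELDS, and the three required fields, are made up for illustration. The identifier carries no meaning; the value lies in the central check that every record has the same metadata fields.

```python
from dataclasses import dataclass

# Hypothetical schema: a real registration agency would define its own fields.
REQUIRED_FIELDS = ("title", "contributor", "format")


@dataclass(frozen=True)
class Record:
    identifier: str  # opaque string; nothing is encoded in it
    metadata: dict


class Registry:
    """Toy central registry: mints dumb IDs and enforces one metadata schema."""

    def __init__(self):
        self._records = {}
        self._counter = 0

    def register(self, metadata):
        # The "authority driving compliance": reject incomplete metadata.
        missing = [f for f in REQUIRED_FIELDS if f not in metadata]
        if missing:
            raise ValueError("missing required metadata: " + ", ".join(missing))
        self._counter += 1
        identifier = f"{self._counter:016d}"  # 16-digit string, no embedded meaning
        self._records[identifier] = Record(identifier, dict(metadata))
        return identifier

    def resolve(self, identifier):
        # Everything useful about the item comes from the metadata lookup.
        return self._records[identifier].metadata
```

In a distributed scheme nobody plays the role of register(), so nothing guarantees that two parties record the same fields for their identifiers.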

Also, to Ivan’s question about work-level IDs: there is work being done by OCLC to develop a true FRBR Work-level identifier based on their data store of libraries’ bibliographic data. This ID is derived by analysis of the collection once the items are released and then catalogued. I am not certain that a similar work-level ID would be possible in trade, unless it were assigned by the author, agent or rights manager so as to truly combine all of the works (in a FRBR sense) under a single ID.  Identifying, say, the hardcover book of a story, the comic book version of that same story, the Blu-ray disc of that story, the Broadway play of that story, and the Swedish translation of the book under a single Work-level ID is something that can only be done after the fact, because their expressions are very, very different. The closest that we might come to identifying that pre-production is to ID the rights associated with a particular intellectual property. And while it may be useful in practice, I don’t know that it would be useful in application. I expect that, in the end, it would only serve the purpose of making lots of IP lawyers very wealthy.

Todd




On Sep 25, 2014, at 5:07 AM, Koji Ishii <kojiishi@gluesoft.co.jp> wrote:

> Maybe this was already discussed, but I’m in favor of a distributed ID system rather than a single, central system.
> 
> Take DNS, or Java namespaces. Their prefix comes from a domain name the author owns, which is unique; the author can then define the rest however they like. If a publisher wants to use ISBN, they could use, for instance, <epub://isbn-international.org/123456789>.
> 
> Since what we want is to identify publications, as long as authors or publications agree to use consistent domains/postfixes, I guess we can guarantee the uniqueness.
> 
> Maybe there are more use cases for the ID than identifying publications? The use cases I have in mind are links between publications and OA, which I think a distributed system can handle.
> 
> /koji
> 
> On Sep 25, 2014, at 12:51 PM, Ivan Herman <ivan@w3.org> wrote:
> 
>> 
>> On 24 Sep 2014, at 23:14 , Laura Dawson <Laura.Dawson@bowker.com> wrote:
>> 
>>> True. It’s a cluttered road.
>> 
>> We are in a really dangerous business!
>> 
>> Ivan
>> 
>>> 
>>> On 9/24/14, 5:12 PM, "David (Standards) Singer" <singer@apple.com> wrote:
>>> 
>>>> 
>>>> On Sep 24, 2014, at 12:16 , LAURA DAWSON <ljndawson@gmail.com> wrote:
>>>> 
>>>>> Yes, Bowker were a DOI registration agency and I can tell you that the
>>>>> associated systems and metadata were the primary reason DOIs for trade
>>>>> books (as opposed to STEM/scholarly) never took off.
>>>>> 
>>>>> So you see, Ivan, the road to book URIs is littered with a couple of
>>>>> corpses.
>>>> 
>>>> It’s not just books.  I was on a project that needed something for
>>>> recordings many years ago, and that road was also strewn with corpses.
>>>> 
>>>>> 
>>>>> On 9/24/14, 3:13 PM, "Bill Kasdorf" <bkasdorf@apexcovantage.com> wrote:
>>>>> 
>>>>>> Actually, the DOI _is_ used for this, mainly by scholarly/STM
>>>>>> publishers,
>>>>>> as well as for chapters of books--typically one DOI for the book and a
>>>>>> DOI for each chapter (and sometimes DOIs at even lower component
>>>>>> levels,
>>>>>> most often for figures and tables). And these are _agnostic_ as to
>>>>>> format, they typically mean "the book" and "the chapter" in the
>>>>>> abstract
>>>>>> sense. When you click on one of these DOIs you are usually then given
>>>>>> your choice of what format, whether you have access, how to obtain
>>>>>> access, etc.
>>>>>> 
>>>>>> But it requires the associated systems, metadata, registration agency,
>>>>>> etc. to make it work. To belabor a point, though, in that context it
>>>>>> does
>>>>>> work. There are a gazillion of them. The whole scholarly/STM ecosystem
>>>>>> is
>>>>>> now dependent on DOIs.
>>>>>> 
>>>>>> Those that use the DOI for this use CrossRef DOIs, which _should_ be
>>>>>> expressed as URIs (and increasingly are).
>>>>>> 
>>>>>> But all that is purely under the control of the publisher (including
>>>>>> what
>>>>>> the DOI links to and what that destination provides--not necessarily
>>>>>> the
>>>>>> content itself); it doesn't address "work" in the way librarians mean
>>>>>> "work," and it requires the systems I mentioned (including the Handle
>>>>>> system on which DOI is based). It would not work for our need to point
>>>>>> to
>>>>>> the "work itself" or some component of the work. So the answer in a
>>>>>> purely standard web-world sense is still no.
>>>>>> 
>>>>>> --Bill K
>>>>>> 
>>>>>> -----Original Message-----
>>>>>> From: Laura Dawson [mailto:Laura.Dawson@bowker.com]
>>>>>> Sent: Wednesday, September 24, 2014 2:55 PM
>>>>>> To: Ivan Herman; Graham Bell
>>>>>> Cc: Laura Dawson; Phil Madans; Bill Kasdorf; W3C Public Digital
>>>>>> Publishing IG Mailing List
>>>>>> Subject: Re: As an aside, a possibly interesting read....
>>>>>> 
>>>>>> As it stands now, no. So a book's "home" on the web (regardless of
>>>>>> edition) is not standardizable at this point unless you want to go down
>>>>>> the DOI road (please let's not go down the DOI road).
>>>>>> 
>>>>>> On 9/24/14, 4:13 AM, "Ivan Herman" <ivan@w3.org> wrote:
>>>>>> 
>>>>>>> Thanks for all the interesting discussion...
>>>>>>> 
>>>>>>> However: all this is to say that there does not seem to be any
>>>>>>> existing
>>>>>>> (and viable) option to uniquely identify (preferably through a URI) a
>>>>>>> 'work' (whether in the ISTC or the FRBR sense). Which is a problem for
>>>>>>> metadata as well as for archiving. :-( Tell me I am wrong, please...
>>>>>>> 
>>>>>>> Ivan
>>>>>>> 
>>>>>>> 
>>>>>>> On 24 Sep 2014, at 24:19 , Graham Bell <graham@editeur.org> wrote:
>>>>>>> 
>>>>>>>> And they can be treated this way in ONIX too. As I said,
>>>>>>>> 
>>>>>>>>> they are not (strictly) an attribute of the ISBN, though they may be
>>>>>>>>> presented as such in various systems
>>>>>>>> 
>>>>>>>> G
>>>>>>>> 
>>>>>>>> NB repeatable because the ISBN is associated directly with only one
>>>>>>>> work, but can be indirectly associated (through that work) with
>>>>>>>> several other works.
>>>>>>>> 
>>>>>>>> 
>>>>>>>> On 23 Sep 2014, at 21:12, LAURA DAWSON wrote:
>>>>>>>> 
>>>>>>>>> Yes, even at Bowker we made them a repeatable attribute on the ISBN
>>>>>>>>> record.
>>>>>>>>> 
>>>>>>>>> From: "Madans, Phil" <Phil.Madans@hbgusa.com>
>>>>>>>>> Date: Tuesday, September 23, 2014 at 3:13 PM
>>>>>>>>> To: Laura Dawson <ljndawson@gmail.com>, Graham Bell
>>>>>>>>> <graham@editeur.org>, Bill Kasdorf <bkasdorf@apexcovantage.com>,
>>>>>>>>> Ivan
>>>>>>>>> Herman <ivan@w3.org>, W3C Public Digital Publishing IG Mailing List
>>>>>>>>> <public-digipub-ig-comment@w3.org>
>>>>>>>>> Subject: Re: As an aside, a possibly interesting read....
>>>>>>>>> 
>>>>>>>>> I stand corrected on the assignment of the ISTC. Bad choice of
>>>>>>>>> words.
>>>>>>>>> I was speaking more on how I would have to manage them internally on
>>>>>>>>> the systems level―that's how I think about these things―and that
>>>>>>>>> would be as an attribute.  That  all depends on how titles systems
>>>>>>>>> are structured, and I'm not saying ours is the best way to do
>>>>>>>>> things,
>>>>>>>>> but I think the way we do it is how most do it these days. From a
>>>>>>>>> practical standpoint, I'm not sure how else I could handle them. IF
>>>>>>>>> I
>>>>>>>>> publish an English and Spanish edition of a work, and the ISTC's are
>>>>>>>>> different, then they would be attributes of the ISBNs so that I
>>>>>>>>> could
>>>>>>>>> keep them linked internally.  We are already doing this, as is most
>>>>>>>>> everyone else, and I think that is why the ISTC was such a hard
>>>>>>>>> sell.
>>>>>>>>> 
>>>>>>>>> ------------------------------------------------------------
>>>>>>>>> Phil Madans | Executive Director of Digital Publishing Technology |
>>>>>>>>> Hachette Book Group | 237 Park Avenue NY 10017 |212-364-1415 |
>>>>>>>>> phil.madans@hbgusa.com
>>>>>>>>> 
>>>>>>>>> From: LAURA DAWSON <ljndawson@gmail.com>
>>>>>>>>> Date: Tuesday, September 23, 2014 at 2:22 PM
>>>>>>>>> To: Graham Bell <graham@editeur.org>, Phil Madans
>>>>>>>>> <phil.madans@hbgusa.com>, Bill Kasdorf <bkasdorf@apexcovantage.com>,
>>>>>>>>> Ivan Herman <ivan@w3.org>, W3C Public Digital Publishing IG Mailing
>>>>>>>>> List <public-digipub-ig-comment@w3.org>
>>>>>>>>> Subject: Re: As an aside, a possibly interesting read....
>>>>>>>>> 
>>>>>>>>> Bowker was an ISTC registration agency until recently. We pulled out
>>>>>>>>> because of the lack of support in the US, and refer the few curious
>>>>>>>>> to Nielsen.
>>>>>>>>> 
>>>>>>>>> From: Graham Bell <graham@editeur.org>
>>>>>>>>> Date: Tuesday, September 23, 2014 at 2:09 PM
>>>>>>>>> To: Phil Madans <Phil.Madans@hbgusa.com>, Laura Dawson
>>>>>>>>> <ljndawson@gmail.com>, Bill Kasdorf <bkasdorf@apexcovantage.com>,
>>>>>>>>> Ivan Herman <ivan@w3.org>, W3C Public Digital Publishing IG Mailing
>>>>>>>>> List <public-digipub-ig-comment@w3.org>
>>>>>>>>> Subject: Re: As an aside, a possibly interesting read....
>>>>>>>>> 
>>>>>>>>> What Phil and Laura have written certainly summarises -- and
>>>>>>>>> illustrates -- the debate over identifiers.
>>>>>>>>> 
>>>>>>>>> But the text below (from Phil) is a little misleading.
>>>>>>>>> 
>>>>>>>>>> Whether an ISTC
>>>>>>>>>> is a real work Identifier or not is a matter of debate. I disagree
>>>>>>>>>> that it is. It is actually an attribute of the ISBN -- that is how
>>>>>>>>>> they are assigned.
>>>>>>>>>> Different ISBNs of the same master content might have different
>>>>>>>>>> ISTC's.
>>>>>>>>>> Translations for instance.
>>>>>>>>> 
>>>>>>>>> The 'rules' of the ISTC say that translations are by definition
>>>>>>>>> different works, and MUST have different ISTCs (though those ISTCs
>>>>>>>>> will be related to each other -- one is a 'derived work', and this
>>>>>>>>> close relationship is recorded in the registration metadata for the
>>>>>>>>> ISTCs themselves). This contrasts with library practice, where
>>>>>>>>> 'work'
>>>>>>>>> is something at a higher level and two translations are actually
>>>>>>>>> termed two 'expressions' of the same 'work'. In library terms, the
>>>>>>>>> ISTC is an expression identifier. See the attached PDF (a slide from
>>>>>>>>> a training session that I deliver fairly regularly) for a summary of
>>>>>>>>> how the <indecs> model on which ISTC and ONIX are based compares
>>>>>>>>> with
>>>>>>>>> the FRBR library model. There is -- as far as I know -- no public
>>>>>>>>> identifier that works at the FRBR:work level, though libraries may
>>>>>>>>> have internal IDs.
>>>>>>>>> 
>>>>>>>>> And I'm pretty sure ISTCs can be assigned without an ISBN (and
>>>>>>>>> without any product ID at all, in fact) -- they are not (strictly)
>>>>>>>>> an
>>>>>>>>> attribute of the ISBN, though they may be presented as such in
>>>>>>>>> various
>>>>>>>>> systems.
>>>>>>>>> They can be registered based on a manuscript, prior to there being a
>>>>>>>>> product.
>>>>>>>>> 
>>>>>>>>> On the other hand, there's no doubt that ISTC has so far proved
>>>>>>>>> unpopular among publishers, for some of the reasons Laura and Phil
>>>>>>>>> list, and its actual usage is minimal.
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> Graham
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> Graham Bell
>>>>>>>>> EDItEUR
>>>>>>>>> 
>>>>>>>>> Tel: +44 20 7503 6418
>>>>>>>>> Mob: +44 7887 754958
>>>>>>>>> 
>>>>>>>>> EDItEUR Limited is a company limited by guarantee, registered in
>>>>>>>>> England no 2994705. Registered Office: United House, North Road,
>>>>>>>>> London
>>>>>>>>> N7 9DP, UK. Website: http://www.editeur.org
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> This may contain confidential material. If you are not an intended
>>>>>>>>> recipient, please notify the sender, delete immediately, and
>>>>>>>>> understand that no disclosure or reliance on the information herein
>>>>>>>>> is
>>>>>>>>> permitted.
>>>>>>>>> Hachette Book Group may monitor email to and from our network.
>>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> ----
>>>>>>> Ivan Herman, W3C
>>>>>>> Digital Publishing Activity Lead
>>>>>>> Home: http://www.w3.org/People/Ivan/
>>>>>>> mobile: +31-641044153
>>>>>>> GPG: 0x343F1A3D
>>>>>>> WebID: http://www.ivan-herman.net/foaf#me
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>> 
>>>> David Singer
>>>> Manager, Software Standards, Apple Inc.
>>>> 
>>> 
>>> 
>> 
>> 
>> ----
>> Ivan Herman, W3C 
>> Digital Publishing Activity Lead
>> Home: http://www.w3.org/People/Ivan/
>> mobile: +31-641044153
>> GPG: 0x343F1A3D
>> WebID: http://www.ivan-herman.net/foaf#me
>> 
>> 
>> 
>> 
>> 
> 
>
Received on Thursday, 25 September 2014 07:23:36 UTC