
RE: [metadata] FYI: BIBFRAME Update at the LoC

From: Bill Kasdorf <bkasdorf@apexcovantage.com>
Date: Sat, 7 Dec 2013 09:53:55 +0000
To: Tim Clark <tim_clark@harvard.edu>, Ivan Herman <ivan@w3.org>
CC: W3C Digital Publishing IG <public-digipub-ig@w3.org>
Message-ID: <5d91d9ded37b470fb0c6346c2533178e@CO2PR06MB572.namprd06.prod.outlook.com>

The key issue re bibliographic metadata for scientific journal publishing is CrossRef metadata and the DOI, which provide cross-publisher linking and other services (identification of the most recent version via CrossMark, plagiarism detection, etc.). The DOI is essential and ubiquitous in the scholarly journal space, and is now increasingly used for scholarly books (CrossRef already has millions of book DOIs, at both the title and chapter level . . . and a gazillion, maybe a gazillion and a half, journal DOIs).

These CrossRef DOIs appear in most citations of journal articles, and some publishers refresh their citations frequently to pick up DOIs for cited articles that had no DOI when the citing articles were originally published. This is important for both the publishing and library worlds. CrossRef has a basic set of required metadata that enables DOI registration and link resolution, and it accommodates much more metadata than the required minimum.

Bill Kasdorf

From: Tim Clark [mailto:tim_clark@harvard.edu]
Sent: Thursday, December 05, 2013 8:09 AM
To: Ivan Herman
Cc: W3C Digital Publishing IG
Subject: Re: [metadata] FYI: BIBFRAME Update at the LoC

Agree this effort is entirely and importantly relevant, and there are others as well, such as CiTO, the citation ontology. I actually don't see any particular separation - there is at minimum an intersection.

If you look at scientific journal publishing, what is the difference between the bibliographic info at a publisher's website and at, for example, NLM (National Library of Medicine)?

NLM has, in addition to the "pure" bibliographic metadata, a lot of search-oriented stuff like MeSH terms and the abstracts, plus an interesting sort of "hidden" metadata like "most similar to what other publications".

No doubt publishers have a lot of process-oriented metadata, and there is likely other stuff I know nothing about. But at least there is an important intersection set between libraries and publishers. The front matter of a book always has an ISBN, an LOC or Brit Lib catalog number, etc., and you can expand out on common stuff from there.

Tim Clark

Director, Biomedical Informatics Core, Massachusetts General Hospital
Instructor in Neurology, Harvard Medical School

On Dec 5, 2013, at 7:44 AM, Ivan Herman <ivan@w3.org> wrote:

I am not sure this is directly relevant to the Metadata Task Force discussion, but it may be of interest nevertheless:


contains a fairly long video on LoC's BIBFRAME initiative. Yes, it is library metadata, not publishers' metadata, but I guess one of the challenges in general is how to bring those together.

Eric Miller, who is one of the developers (and, actually, who led the Semantic Web Activity at W3C until 2007), makes a very high-level case for the usage of a BIBFRAME-like structure (starting around 49:00 in the video). His talk lacks technical details for my taste, but I guess that was the nature of the audience...


Ivan Herman, W3C
Digital Publishing Activity Lead
Home: http://www.w3.org/People/Ivan/
mobile: +31-641044153
GPG: 0x343F1A3D
FOAF: http://www.ivan-herman.net/foaf

Received on Saturday, 7 December 2013 09:54:26 UTC
