- From: Robert Sanderson <azaroth42@gmail.com>
- Date: Mon, 15 Sep 2014 13:29:29 -0700
- To: Tom De Nies <tom.denies@ugent.be>
- Cc: W3C Digital Publishing IG <public-digipub-ig@w3.org>
- Message-ID: <CABevsUEn_pK3OSSxMmVEZoeftNhKN8xi3LW=u243yA+h1BMOBw@mail.gmail.com>
Like with CFI? And Open Annotation? :)
http://www.idpf.org/epub/oa/

EPUBs are relatively straightforward in comparison to other content types, in terms of referencing arbitrary components.

On Mon, Sep 15, 2014 at 1:25 PM, Tom De Nies <tom.denies@ugent.be> wrote:

> You make a valid point, Phil, but the alternative (not embedding the metadata) is not ideal either.
> Without being able to directly refer to certain parts of the content of an epub, the possibilities to add descriptive metadata decrease significantly.
>
> Ideally, you would be able to identify each part/fragment of an epub individually (e.g., with a fragment URI), so you can describe it with its metadata somewhere else.
>
> Tom
>
> 2014-09-15 20:03 GMT+02:00 Madans, Phil <Phil.Madans@hbgusa.com>:
>
>> I get very nervous when I hear talk about including metadata in the epub file, like embedding ONIX or some other standard. The issue is that metadata changes. If you are embedding metadata in the epubs then you get into the position of having to generate and distribute new epub files every time that metadata changes. I don’t know how many publishers would be eager to do that. We wouldn’t. And I don’t think our vendors would be too keen on that either.
>>
>> Once an ebook publishes, a lot of the metadata probably isn’t going to change: Title, Author, Imprint, etc. But the descriptive metadata that we are looking for to aid in discovery is far less static: Keywords, subject categories, descriptions, awards, quotes, author bios and, of course, price. These elements can change often.
>>
>> Embedding metadata in the epub file, to me, is trying to do for the epub what the book jacket does for the physical product. The book jacket is about marketing, discoverability. It has all of those elements, like author bio and quotes and subject categories, etc. And it is wrapped right around the content, and is also embedded in the content in the form of ad pages. The problem is the only time we can update with new metadata is when we reprint the book and/or the jacket, unless we want to sticker existing stock. In the same way, I don’t think embedding metadata in the epub is going to be a dynamic or flexible enough solution for getting the most bang out of the metadata. Unless there is a constant regeneration of the epub, which, again, I think will turn into a supply chain issue.
>>
>> That’s my opinion.
>>
>> Phil
>>
>> ------------------------------------------------------------
>>
>> Phil Madans | Executive Director of Digital Publishing Technology | Hachette Book Group | 237 Park Avenue NY 10017 | 212-364-1415 | phil.madans@hbgusa.com <david.young@hbgusa.com>
>>
>> *From:* Bill Kasdorf [mailto:bkasdorf@apexcovantage.com]
>> *Sent:* Monday, September 15, 2014 10:01 AM
>> *To:* Graham Bell; Ivan Herman; Tzviya Siegman
>> *Cc:* W3C Digital Publishing IG; Madi Solomon
>> *Subject:* RE: [METADATA] Webbiness of publishing metadata (ISSUE-1)
>>
>> +1
>>
>> This is exactly what I was going to say but Graham beat me to the punch. ;-)
>>
>> Especially his comment that "it is not an issue that most publishers are even aware of."
>>
>> I want to especially emphasize the point that I think the Web should _*enable*_ the expression and conveyance of metadata, not specify what that metadata _*is*_.
>>
>> Both schema.org and URIs are useful cases in point.
>>
>> Schema.org provides a useful way to embed metadata in content, but I would say it is somewhat halfway on the "enable don't specify" path. It does specify properties (which is actually very helpful) but in many or most cases the actual vocabularies used to characterize those properties are not specified. While specifying down to that level of detail is of course very useful for interoperability, it tends to be too limiting, too restrictive, not expressive enough (have I been redundant enough?) for most specific communities of users. Thus the educational folks got a few of the things they need, the accessibility folks got a few of the things they need, etc.; both got _*subsets*_ of the vocabularies they really consider important within their domains. So I think on balance it is very useful to let those properties be described by whatever vocabularies are useful to a certain community of users.
>>
>> My example for URI is the DOI. ;-) It is not a choice _*between*_ using DOI or URI: the recommended practice is to _*express*_ a DOI in the _*form*_ of a URI. While that was not the common practice at first, it has been recommended for the past year or two and is increasingly being done. Many identifiers can be expressed in the form of a URI, which I think is a Very Good Thing. URI doesn't attempt to _*replace*_ those identifiers, it makes them work better.
>>
>> --Bill K
>>
>> *From:* Graham Bell [mailto:graham@editeur.org]
>> *Sent:* Monday, September 15, 2014 5:18 AM
>> *To:* Ivan Herman; Tzviya Siegman
>> *Cc:* W3C Digital Publishing IG; Bill Kasdorf; Madi Solomon
>> *Subject:* Re: [METADATA] Webbiness of publishing metadata (ISSUE-1)
>>
>> I think it would be fair to say that the use of linked data and URIs as identifiers is "*definitely not a 'solved issue' among publishers*" -- and to a large extent is not an issue that most publishers are even aware of. While the book industry provides a fair amount of useful metadata, this metadata is not aimed at making the web more useful, but at making the supply chain for commercial books and e-books more useful.
>>
>> I go back to the three cases I listed in a comment on the DPIG wiki (see the Phase 1 Strategy section).
>>
>> i. metadata delivered in bulk, separate from the content or resource itself (*eg* as part of the commercial supply chain)
>>
>> ii. metadata delivered embedded within the content or resource it describes (*eg* within an EPUB, within a web page)
>>
>> iii. metadata delivered embedded within web pages *describing* the content or resource (*eg* in an online store, repository or catalog), possibly separate from the metadata *displayed* (for humans) on those pages
>>
>> (actually there is a fourth case, which is metadata delivered on demand, separate from the content or resource (*eg* as part of a web service)).
>>
>> Publishers have tackled case i. via ONIX, but not case ii. or iii. Case ii. is properly the domain of the content standards groups such as W3C DPIG and IDPF. Case iii. may also be something where W3C DPIG and schema.org have roles. But...
>>
>> Given the reluctance of book publishers and retailers to invest more in metadata (*viz* lack of uptake of a work identifier like ISTC, lack of interest in a release identifier analogous to GRID, slow migration to ONIX 3.0 in countries where 2.1 was most firmly embedded…), it seems to me to be critical that we don't further burden the industry with 'yet another data format to ignore'. As Phil implies in his point 5, the important thing is to have *good metadata*, and it doesn't much matter how it is expressed – so long as it can be transformed from one expression to another easily and without loss of meaning. I suspect the best way around this is to retain as much of the semantics of ONIX as possible, while thinking about a syntax that would allow that metadata to be embedded in e-publications and online content. This would avoid publishers having to manage two or three parallel and distinct sets of metadata. Separating ONIX semantics ('what do we mean by pub date, by imprint, by title?') from the XML message (which is 'merely' a convenient syntax used for transmitting the data along a data supply chain in bulk), and allowing ONIX-style data to be expressed in other syntaxes or data formats seems (to me) to be the way to go.
>>
>> I think there is something significant to do, but let's not be reinventing the wheel.
>>
>> Graham Bell
>> EDItEUR
>>
>> Tel: +44 20 7503 6418
>> Mob: +44 7887 754958
>>
>> EDItEUR Limited is a company limited by guarantee, registered in England no 2994705. Registered Office: United House, North Road, London N7 9DP, UK. Website: http://www.editeur.org
>>
>> On 15 Sep 2014, at 10:59, Ivan Herman wrote:
>>
>> Hi Tzviya
>>
>> I'll try to clarify the issues you raised...
>>
>> The description of ISSUE-1[1] is currently empty. (It only has a title, in the subject of this mail).
>>
>> My interpretation of your question: is the published metadata web-friendly? For me, with my W3C/OWP goggles on, this means whether it is easy to use and combine metadata around a (or a family of) publication. With my former Semantic Web hat's goggles on this time, this is very much related to the essence of RDF: forgetting about the arcane syntax of RDF/XML, the various choices that have been made in its design, the real advantage of RDF is the ability to combine (meta)data coming from different sources. And the core of this is: use URI-s as unique identifiers wherever it makes sense and is useful.
>>
>> So... is the usage of URI-s around publishing metadata a solved issue? I have the *impression* the answer is no (but Laura D. may shoot me.) If not, is there anything W3C can do around this? Honestly, I do not think so, it may be just as complex a task as defining a unified vocabulary to rule them all... Is there a way to at least help? Years ago a document was produced in the semantic web domain called 'Cool URI-s for the Semantic Web'[2]; would it be of any help if we tried to do something similar?
>>
>> But I may completely misunderstand the issue.
>>
>> Ivan
>>
>> P.S. That being said, I would think that this whole issue SHOULD be listed in the metadata document we produce, spelling it out clearly.
>>
>> [1] https://www.w3.org/dpub/IG/track/issues/1
>> [2] http://www.w3.org/TR/cooluris/
>>
>> ----
>> Ivan Herman, W3C
>> Digital Publishing Activity Lead
>> Home: http://www.w3.org/People/Ivan/
>> mobile: +31-641044153
>> GPG: 0x343F1A3D
>> WebID: http://www.ivan-herman.net/foaf#me
>>
>> This may contain confidential material. If you are not an intended recipient, please notify the sender, delete immediately, and understand that no disclosure or reliance on the information herein is permitted. Hachette Book Group may monitor email to and from our network.

--
Rob Sanderson
Technology Collaboration Facilitator
Digital Library Systems and Services
Stanford, CA 94305
Received on Monday, 15 September 2014 20:29:59 UTC