
Re: [METADATA] Governance/authority (ISSUE-2)

From: Liam R E Quin <liam@w3.org>
Date: Mon, 15 Sep 2014 19:15:40 -0400
To: LAURA DAWSON <ljndawson@gmail.com>
Cc: "Siegman, Tzviya - Hoboken" <tsiegman@wiley.com>, Ivan Herman <ivan@w3.org>, W3C Digital Publishing IG <public-digipub-ig@w3.org>, Bill Kasdorf <bkasdorf@apexcovantage.com>, Madi Weland Solomon <madi.solomon@pearson.com>
Message-ID: <20140915191540.3f4e9b18.liam@w3.org>
On Mon, 15 Sep 2014 18:11:39 -0400
LAURA DAWSON <ljndawson@gmail.com> wrote:

> Yes, and I wouldn't expect the retail sites to change that. How they
> ingest data and express it on their sites is at the core of the value each
> retailer brings to the table 


> Page count is another one of those troublesome fields. :)

I have my trusty copy of McKerrow on hand for bibliography and citing collations :-)

But you are right.

Clearly metadata for ebooks (and for Web sites that may be used offline) has several properties that are needed... some likely examples:

* mixture of embeddable metadata (title, author, category/facets),
  updatable metadata (price, latest edition...) and
  pointers to remotely updatable metadata (BCIP, OCLC, Amazon category names).

* Retraction (e.g. ability for publisher to correct errors, or when a category
  is split so that Shelf Zero now has occult and computing separately instead of
  intermixed (real example!))

* Marking of every item with source and date - e.g., according to Allen & Unwin
  volume 3 of Lord of the Rings goes in Adult Fiction, as of such-and-such a date;
  if you extract single "triples" and use them out of context you're asking for
  a mess.

* Handling complex "fields" - e.g. book or journal titles in mathematics
  often contain formulæ; in the humanities you'll get titles with fragments
  from multiple languages (Nielsen and other metadata organizations
  notwithstanding). The "CDATA escaping" mechanism for this is lunacy.

* User-supplied metadata (e.g. "this book is really about computers, not the occult,
  and that's where I want it on my virtual shelf" or "sort this book under "R" for
  "Really Hard", not under "M" for "McKerrow"... although the ebook systems I've
  seen have enough trouble ignoring "The" when sorting titles...)
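
A minimal sketch of how a few of the points above might fit together, in Python: every statement carries its source and date, a retraction is just another dated statement, and a reader's own statement can take precedence. All of the names here (Statement, resolve, sort_key) and the precedence policy are assumptions made up for illustration; none of them come from ONIX, MARC or any other real schema, and the dates and values are invented.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Statement:
    """One metadata assertion, kept together with who made it and when."""
    field: str
    value: str
    source: str              # hypothetical labels: a publisher, a retailer, "user"
    as_of: date
    retracted: bool = False  # True: the source withdraws its earlier claim


def resolve(statements, field, prefer=("user",)):
    """Return the Statement currently in force for a field, or None.

    Assumed policy (not from any standard): a retraction cancels only the
    same source's earlier claim; a preferred source beats all others;
    otherwise the most recently dated live statement wins.
    """
    live = {}
    # sorted() is stable, so same-day statements keep their input order.
    for s in sorted(statements, key=lambda s: s.as_of):
        if s.field != field:
            continue
        if s.retracted:
            live.pop(s.source, None)
        else:
            live[s.source] = s
    if not live:
        return None
    for source in prefer:
        if source in live:
            return live[source]
    return max(live.values(), key=lambda s: s.as_of)


ARTICLES = ("the ", "a ", "an ")  # English only; an assumption


def sort_key(title, override=None):
    """Sort key that skips a leading article, unless the reader overrides it."""
    if override is not None:
        return override.casefold()
    lowered = title.casefold()
    for article in ARTICLES:
        if lowered.startswith(article):
            return lowered[len(article):]
    return lowered
```

For example, if a retailer first files a book under "Occult", later retracts that and files it under "Computing", resolve() returns the "Computing" statement along with its source and date, so the value is never used out of context. A statement whose source is "user" would win over both.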

ONIX seems to be (1) complex enough to handle these cases, with work in some areas perhaps, and (2) complex enough to make a lot of people run screaming. But then, try going to the middle of your friendly reference library and saying, "can anyone help me, I want to have a friendly chat about MARC records" :-) One has to remember that, just as XML tends to be used at the boundary between the "world" and the computer, metadata is used by people who are metadata experts but not necessarily computer experts.

Liam Quin - XML Activity Lead, W3C, http://www.w3.org/People/Quin/
Pictures from old books: http://fromoldbooks.org/
Received on Monday, 15 September 2014 23:15:44 UTC
