- From: Deborah Kaplan <dkaplan@safaribooksonline.com>
- Date: Tue, 8 Sep 2015 18:12:35 -0400
- To: Bill McCoy <bmccoy@idpf.org>
- Cc: Liam Quin <liam@w3.org>, W3C Digital Publishing IG <public-digipub-ig@w3.org>
- Message-ID: <CANSiVPaZ0z9LtkhiLdnvooqDNa5+1kDLm9Fg+hQ41tB6dQFb9w@mail.gmail.com>
Olaf Drümmer wrote:
> Nonetheless I would keep curation out of the text for the definitions, and condense it into 'intended'. Joseph Beuys (German artist) once put a pile of grease somewhere and intended it to be a work of art (not sure how much curation went on while he was doing it, at least it didn't turn into cheese). Some cleaning person did not get the message and… Anyway: that pile of grease would have to be considered a document, its portability only limited by climate/temperature ;-). If Beuys had incidentally dropped a same shaped and same sized pile of grease, it would not have been a document.

I am comfortable changing the term; "curated" has a jargon meaning in museums, libraries, and archives, and outside of that environment may have different connotations.

Bill McCoy <bmccoy@idpf.org> said:
> A computer program to me can validly produce anything we consider a "Portable Web Document". For example a realization of my monthly bank statement will be a document, but it is not curated by a human.

Far up this now lengthy thread (mea culpa!) I discussed how curation by computer is very much a form of curation. Humans with intent created the tool which generated the monthly bank statement. The bank statement itself is simply a serialized view of some cells in your bank's data tables, but the choice to create that *specific* view of those cells -- and your choice to have your bank generate the PDF or paper, instead of quietly trusting Quicken to make some background transactions while it updates its own local database -- is what creates a document. (As Olaf has also said, much more succinctly.)

Bill McCoy <bmccoy@idpf.org> said:
> If an online calendar is simply a UX over a database then I don't consider it a "document" (whether or not the calendar entries have been curated). But if the calendar system can produce a PDF representation of the calendar, that would be a portable document (but not a "portable web document").
>
> Similarly if you search on Google for "influenza" the results on the left (the search results) are in no way a "web document" (IMO), the sidebar on the right (with navigation via tabs) could be considered a "web document" but is not a "portable web document" - and whether it's truly a web document could be debated. The PDF that is generated is certainly a portable document (but not a portable "web" document, as I understand that term). But whether the content of the sidebar was in the first place human-curated or machine generated via semantic processing to me is not decisive as to whether it should be considered a "web document", and certainly not as to whether the PDF should be considered a "portable document". In fact I don't know the answer. So thus "document-ness", at least to me, has nothing directly to do with human curation.

[and then in a second email]
> Could an entire git repository a document (in the sense we mean for this activity)? I don't think so. Could a particular snapshot (e.g. current mainline or a named release) of a git repository

From an information science POV, an entire git repository -- or a calendar, or a collection of search results, or a search algorithm -- can absolutely be documents. The dependency is not whether they can be turned into a PDF or an HTML representation: digital paper, as it were -- just as a text with embedded video can be a document, or tablet-based interactive picturebooks. The dependency is whether the object as it stands is being treated as a document.

Places where this has real digital publication ramifications in the academy include:

- In digital theses and dissertations, when a student is required to deposit the documents of his doctoral work in an electronic thesis and dissertation database as a graduation requirement -- and the documents are composed of software products, chemical formulae, or datasets.
- In an archives, when a scholar deposits her life's research, including her academic papers, her patented algorithm, several boxes of papers and ephemera, petabytes of data, the export of her Microsoft Outlook mailbox, and her award-winning website with interactive visualizations of her findings. The author writing about that scholar's life work interacts with each of these items in the archives, described and catalogued as a document, and analyzes each one critically as a complete document.
- In a records management department, in an era where paper or even PDF rules and regulations have given way to micro-updates of websites, so the recordkeepers must record snapshots of entire web hierarchies as the documents recording the institution's history, later to be published in an online index for the board of directors.

What makes each of these a "document" is that humans need to understand each as a concrete whole. It's not the technology of curation that matters -- indeed, in the third example, an automated spider run by the Internet Archive does the trick. It's the choice to view the parts as a "document" -- to view a dynamic website as a procedures manual, to view a running computer program as a dissertation.

Suzanne Briet was the French scholar who came up with the lovely, evocative antelope example: "An antelope running wild on the plains of Africa should not be considered a document... But if it were to be captured, taken to a zoo and made an object of study, it has been made into a document. It has become physical evidence being used by those who study it. Indeed, scholarly articles written about the antelope are secondary documents, since the antelope itself is the primary document."

Deborah
Received on Tuesday, 8 September 2015 22:13:04 UTC