- From: <Manuel.CARRASCO-BENITEZ@ec.europa.eu>
- Date: Thu, 13 Aug 2015 08:52:29 +0000
- To: <amgreiner@lbl.gov>
- CC: <phila@w3.org>, <mark.harrison@cantab.net>, <public-dwbp-wg@w3.org>
Yes, I am: the original data should always be made available, in addition to more appropriate formats for further processing such as XML or plain text. These are just variants of one resource.

Tomas

________________________________________
From: Annette Greiner [amgreiner@lbl.gov]
Sent: 12 August 2015 19:31
To: CARRASCO BENITEZ Manuel (DGT)
Cc: phila@w3.org; mark.harrison@cantab.net; public-dwbp-wg@w3.org
Subject: Re: Use machine-readable standardized data formats / Use non-proprietary data formats

You’re not seriously suggesting people should make data available in WordPerfect format, are you? This discussion seems to be wandering into the realm of publishing documents.

--
Annette Greiner
NERSC Data and Analytics Services
Lawrence Berkeley National Laboratory
510-495-2935

On Aug 12, 2015, at 7:28 AM, Manuel.CARRASCO-BENITEZ@ec.europa.eu wrote:

> One should have at least the following variants of the resource:
>
> - Original     : foo.wp  - WordPerfect 3.0 ~1982, perhaps still processable
> - Content      : foo.txt - textual, hopefully processable in 100 years
> - Presentation : foo.tif - TIFF ~1986, perhaps still viewable, might be foo.ps
>
> So:
> - http://example.com/foo     - negotiate and give me the best
> - http://example.com/foo.wp  - I can still process WP
> - http://example.com/foo.txt - I want to process the text, no presentation
> - http://example.com/foo.tif - I really want to see how the doc looks
>
> Regards
> Tomas
>
>> Perhaps the way we can formulate this is to say that some document
>> formats (such as PDF, .doc / .docx and even .xls / .xlsx ) are
>> concerned with presentation of information in a particular format or
>> layout and therefore carry a significant amount of typesetting /
>> formatting information overhead in addition to the underlying data.
>> Furthermore, at the time those document-centric formats were
>> developed, ease of access to the underlying data and the unambiguous
>> meaning of specific data fields might not have been the main priority
>> in their design.
>>
>> When the main priority is to ensure that the underlying data is
>> available on the web so that others can re-use it, we recommend using
>> simpler data formats such as CSV, TSV, JSON (or better still JSON-LD),
>> RDF or XML.
>
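The URL scheme quoted above (one negotiated URL, http://example.com/foo, plus one fixed URL per variant) could be sketched roughly as follows. This is only an illustration and not something proposed in the thread: the media types, the server preference order, the fallback to plain text, and the helper names `parse_accept`, `matches` and `negotiate` are all assumptions, and the matching is much simpler than full HTTP content negotiation (it ignores the specificity of media ranges, for instance).

```python
# Hypothetical variant table for http://example.com/foo.
# Assumption: dict order is the server's own preference order,
# which is used when the client accepts several variants equally.
VARIANTS = {
    "text/plain": "http://example.com/foo.txt",
    "application/vnd.wordperfect": "http://example.com/foo.wp",
    "image/tiff": "http://example.com/foo.tif",
}


def parse_accept(header):
    """Return (media_range, q) pairs from an Accept header, highest q first."""
    ranges = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_range = fields[0].lower()
        if not media_range:
            continue
        q = 1.0  # q defaults to 1 when the client does not give one
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed q as "not acceptable"
        ranges.append((media_range, q))
    # sorted() is stable, so equal q values keep the client's order.
    return sorted(ranges, key=lambda r: r[1], reverse=True)


def matches(media_range, media_type):
    """True if a media range such as image/* or */* covers media_type."""
    if media_range == "*/*":
        return True
    if media_range.endswith("/*"):
        return media_type.startswith(media_range[:-1])
    return media_range == media_type


def negotiate(accept_header, variants=VARIANTS, default="text/plain"):
    """Pick the URL of the variant to serve for the negotiated resource."""
    for media_range, q in parse_accept(accept_header):
        if q <= 0:
            continue  # q=0 means the client refuses this range
        for media_type, url in variants.items():
            if matches(media_range, media_type):
                return url
    # Assumption: fall back to the textual variant instead of failing.
    return variants[default]
```

A client that only wants to see how the document looks would send `Accept: image/tiff` and be given foo.tif, while a client with no stated preference would get the plain text variant under the assumptions above.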
Received on Thursday, 13 August 2015 08:53:01 UTC