Re: Use machine-readable standardized data formats / Use non-proprietary data formats

You’re not seriously suggesting people should make data available in WordPerfect format, are you?
This discussion seems to be wandering into the realm of publishing documents.

--
Annette Greiner
NERSC Data and Analytics Services
Lawrence Berkeley National Laboratory
510-495-2935

On Aug 12, 2015, at 7:28 AM, Manuel.CARRASCO-BENITEZ@ec.europa.eu wrote:

> One should have at least the following variants of the resource:
> 
> - Original     : foo.wp  - WordPerfect 3.0 ~1982, perhaps still processable
> - Content      : foo.txt - textual, hopefully processable in 100 years
> - Presentation : foo.tif - TIFF ~1986, perhaps still viewable, might be foo.ps
> 
> So:
>  - http://example.com/foo     - negotiate and give me the best
>  - http://example.com/foo.wp  - I can still process WP
>  - http://example.com/foo.txt - I want to process the text, no presentation
>  - http://example.com/foo.tif - I really want to see how the doc looks
> 
> Regards
> Tomas
> 
>> Perhaps the way we can formulate this is to say that some document
>> formats (such as PDF, .doc / .docx and even .xls / .xlsx ) are
>> concerned with presentation of information in a particular format or
>> layout and therefore carry a significant amount of typesetting /
>> formatting information overhead in addition to the underlying data.
>> Furthermore, at the time those document-centric formats were
>> developed, ease of access to the underlying data and the unambiguous
>> meaning of specific data fields might not have been the main priority
>> in their design.
>> 
>> When the main priority is to ensure that the underlying data is
>> available on the web so that others can re-use it, we recommend using
>> simpler data formats such as CSV, TSV, JSON (or better still JSON-LD),
>> RDF or XML.
> 

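For reference, the scheme quoted above (one negotiated URL, http://example.com/foo, plus one URL per variant) can be sketched as server-side selection on the HTTP Accept header. This is only a minimal illustration, not anything proposed in the thread: the mapping of media types to the three files is an assumption, and the names VARIANTS, parse_accept, specificity and negotiate are hypothetical.

```python
# Hypothetical variant table for http://example.com/foo. The media type
# assigned to each file is an assumption, not taken from the thread.
VARIANTS = {
    "application/vnd.wordperfect": "foo.wp",
    "text/plain": "foo.txt",
    "image/tiff": "foo.tif",
}


def parse_accept(header):
    """Return (media_range, q) pairs parsed from an Accept header value."""
    ranges = []
    for item in header.split(","):
        parts = [p.strip() for p in item.split(";")]
        if not parts[0]:
            continue
        q = 1.0  # q defaults to 1 when the client gives no weight
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat an unparsable weight as "not acceptable"
        ranges.append((parts[0].lower(), q))
    return ranges


def specificity(media_range, media_type):
    """Return 3 for an exact match, 2 for type/*, 1 for */*, 0 for no match."""
    if media_range == media_type:
        return 3
    if media_range == media_type.split("/")[0] + "/*":
        return 2
    if media_range == "*/*":
        return 1
    return 0


def negotiate(accept_header, variants=VARIANTS):
    """Pick the variant with the highest q, or None if nothing is acceptable."""
    ranges = parse_accept(accept_header)
    best_file, best_q = None, 0.0
    for media_type, filename in variants.items():
        # The most specific matching range decides this variant's q value.
        matches = [(specificity(r, media_type), q) for r, q in ranges]
        matches = [m for m in matches if m[0] > 0]
        if not matches:
            continue
        _, q = max(matches)
        if q > best_q:  # ties keep the variant listed first in the table
            best_file, best_q = filename, q
    return best_file
```

A client that only wants the text would send "Accept: text/plain" and be served foo.txt. One sending "image/tiff;q=0.9, text/plain;q=0.5" would be served foo.tif. The suffixed URLs (foo.wp, foo.txt, foo.tif) bypass this selection entirely.
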
Received on Wednesday, 12 August 2015 17:33:43 UTC