RE: Use machine-readable standardized data formats / Use non-proprietary data formats from Manuel.CARRASCO-BENITEZ@ec.europa.eu on 2015-08-12 (public-dwbp-wg@w3.org from August 2015)

From: <Manuel.CARRASCO-BENITEZ@ec.europa.eu>
Date: Wed, 12 Aug 2015 14:28:38 +0000
To: <phila@w3.org>, <mark.harrison@cantab.net>
CC: <public-dwbp-wg@w3.org>
Message-ID: <39DB516E46C0E842A2CFFF1BBB7412F16F30EB76@S-DC-ESTF03-B.net1.cec.eu.int>

One should have at least the following variants of the resource:

- Original     : foo.wp  - WordPerfect 3.0 ~1982, perhaps still processable
- Content      : foo.txt - textual, hopefully processable in 100 years
- Presentation : foo.tif - TIFF ~1986, perhaps still viewable, might be foo.ps

So each variant gets its own URL, plus a generic one for negotiation:
  - http://example.com/foo     - negotiate and give me the best
  - http://example.com/foo.wp  - I can still process WP
  - http://example.com/foo.txt - I want to process the text, no presentation
  - http://example.com/foo.tif - I really want to see how the doc looks
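
A rough sketch of how the generic URL could pick a variant from the
Accept header. This is only an illustration: the media types assigned
to each file, the function names and the fallback to foo.txt are my
assumptions, and the matching is simplified (it orders by q value only
and ignores the specificity rules of full HTTP content negotiation).

```python
# Hypothetical variant table for http://example.com/foo.
# The media type chosen for each file is an assumption for illustration.
VARIANTS = {
    "application/vnd.wordperfect": "foo.wp",
    "text/plain": "foo.txt",
    "image/tiff": "foo.tif",
}


def parse_accept(header):
    """Return (media_range, q) pairs from an Accept header, highest q first."""
    ranges = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        if not fields[0]:
            continue
        q = 1.0  # default quality when no q parameter is given
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed q as "not acceptable"
        ranges.append((fields[0].lower(), q))
    # sort() is stable, so ranges with equal q keep their header order
    ranges.sort(key=lambda r: r[1], reverse=True)
    return ranges


def negotiate(accept, default="foo.txt"):
    """Pick the file to serve for the generic URL (simplified matching)."""
    for media_range, q in parse_accept(accept):
        if q <= 0:
            continue
        if media_range == "*/*":
            return default
        for media_type, filename in VARIANTS.items():
            if media_range == media_type:
                return filename
            # "image/*" style ranges match on the type prefix "image/"
            if media_range.endswith("/*") and media_type.startswith(media_range[:-1]):
                return filename
    # Nothing matched: fall back to the textual variant. A stricter
    # server might answer 406 Not Acceptable instead.
    return default


print(negotiate("text/plain;q=0.5, image/tiff"))
```

The explicit URLs (foo.wp, foo.txt, foo.tif) bypass this step entirely,
which is the point: a client that knows what it wants never has to
negotiate.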

Regards
Tomas

> Perhaps the way we can formulate this is to say that some document
> formats (such as PDF, .doc / .docx and even .xls / .xlsx) are
> concerned with presentation of information in a particular format or
> layout and therefore carry a significant amount of typesetting /
> formatting information overhead in addition to the underlying data.
> Furthermore, at the time those document-centric formats were
> developed, ease of access to the underlying data and the unambiguous
> meaning of specific data fields might not have been the main priority
> in their design.
>
> When the main priority is to ensure that the underlying data is
> available on the web so that others can re-use it, we recommend using
> simpler data formats such as CSV, TSV, JSON (or better still JSON-LD),
> RDF or XML.
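
To make the quoted recommendation concrete, here is a small sketch
that writes the same records as CSV and as JSON using only the Python
standard library. The sample records and function names are invented
for illustration; the point is just that the data survives a round
trip with no layout information attached.

```python
import csv
import io
import json

# Invented sample records, for illustration only.
rows = [
    {"id": "1", "title": "foo", "format": "txt"},
    {"id": "2", "title": "bar", "format": "tif"},
]


def to_csv(records):
    """Serialize a non-empty list of dicts as CSV text with a header row."""
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf, fieldnames=list(records[0]), lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(records)
    return buf.getvalue()


def to_json(records):
    """Serialize the same records as indented JSON text."""
    return json.dumps(records, indent=2, sort_keys=True)


csv_text = to_csv(rows)
json_text = to_json(rows)

# Both serializations can be read back into the original records.
assert list(csv.DictReader(io.StringIO(csv_text))) == rows
assert json.loads(json_text) == rows

print(csv_text)
print(json_text)
```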

Received on Wednesday, 12 August 2015 14:29:11 UTC