Re: Use machine-readable standardized data formats / Use non-proprietary data formats

hello annette.

On 2015-08-12 11:01, Annette Greiner wrote:
> When it comes to specifying which formats to use, I do think that the best practice is to consider the probable context of use. That is fundamental to any intelligent management of data. Of course we should suggest open formats in particular, but the idea of considering how something will/may be used in the future is important, too, and it helps one decide among the possible open formats. Perhaps just rewording would make that clearer. For the test, it seems to me that conforming to a format in use by anticipated users of the data is a minimum (something that already doesn’t work for your users certainly isn’t going to be future-proof), but it should also say something about checking that the format conforms to an open machine-readable standard. That BP has a sentence fragment right now as well, and I don’t think it should mention vocabularies.

i would be in favor of not recommending any specific models or 
metamodels. that's up for the domain specialists to decide, and depends 
on what their goals and constraints are.

what matters is:

- the format should be easily parseable, and thus it is better to reuse 
some metamodel than to invent your own or simply invent a proprietary 
format that needs proprietary parsing.

- after parsing, the model should be documented so that it is 
well-defined what the parsed data model means and most importantly, how 
it has to be processed (what to do with unknown parts, for example: 
safely ignore or stop processing?). therefore a clearly defined 
processing model is essential at this level.

- the model should be hypermedia, so that clients can follow meaningful 
links for accomplishing application goals. those links may simply link 
to related data, or they may be RESTful and link to web services and 
not just web data. again, it's for the domain to decide what kind of 
hypermedia semantics it needs.
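
to make the three points a bit more concrete, here is a minimal sketch in 
python. everything specific in it is an assumption made up for 
illustration and not taken from any standard or from the webdata 
repository: the member names "title", "value", "links" and 
"mustUnderstand", the rule that unknown members are ignored unless they 
are listed in "mustUnderstand", and the example.org URI. it reuses JSON 
as the metamodel, applies one explicit processing rule for unknown parts, 
and exposes links by relation type so that a client can follow them.

```python
import json

# Members this hypothetical client understands (assumed names).
KNOWN_FIELDS = {"title", "value", "links", "mustUnderstand"}


class ProcessingError(Exception):
    """Raised when the processing model says the client must stop."""


def process(document_text):
    # Point 1: reuse an existing metamodel (JSON) and its standard parser.
    record = json.loads(document_text)
    if not isinstance(record, dict):
        raise ProcessingError("expected a JSON object at the top level")

    # Point 2: an explicit processing model for unknown parts.
    # Assumed rule: ignore unknown members, unless the document lists
    # them in "mustUnderstand", in which case processing stops.
    unknown = sorted(set(record) - KNOWN_FIELDS)
    required = set(record.get("mustUnderstand", []))
    blocking = sorted(required - KNOWN_FIELDS)
    if blocking:
        raise ProcessingError(
            "cannot process required extensions: " + ", ".join(blocking)
        )

    data = {
        name: value
        for name, value in record.items()
        if name in KNOWN_FIELDS - {"links", "mustUnderstand"}
    }

    # Point 3: hypermedia. Collect links by relation type so that the
    # client can pick the link that matches its application goal.
    links = {
        link["rel"]: link["href"]
        for link in record.get("links", [])
        if "rel" in link and "href" in link
    }
    return data, links, unknown


sample = json.dumps({
    "title": "air quality",
    "value": 42,
    "x-note": "an extension this client does not know",
    "links": [{"rel": "related", "href": "http://example.org/stations"}],
})
data, links, ignored = process(sample)
```

with this rule, the unknown "x-note" member is safely ignored, while the 
same document with "mustUnderstand": ["x-note"] would make the client 
stop. which of the two behaviors is the default is exactly the kind of 
decision the format's documentation has to spell out.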

because these issues are essential, https://github.com/dret/webdata has 
these three points as its very core, as 2, 3, and 4 stars. these issues 
are at the core of being webby data, and not just some data that happens 
to be accessible via HTTP (i.e., web data has to be "of the web" and not 
just "on the web").

cheers,

dret.

-- 
erik wilde | mailto:dret@berkeley.edu  -  tel:+1-510-2061079 |
            | UC Berkeley  -  School of Information (ISchool) |
            | http://dret.net/netdret http://twitter.com/dret |

Received on Wednesday, 12 August 2015 18:12:18 UTC