Glossary defn of dataset, metadata

Eating some of my own dogfood...

Yaso asked me for comment on her work on the mental models in the 
glossary [1].

I sent this suggested text:

<h2>Data, Datasets, Metadata, Publishers and Re-Users</h2>

<p>When discussing the publication and use of data on the Web, terms 
like data, dataset and metadata are commonplace. In a <em>specific 
context</em>, the differences between the terms can be clear. For 
example, if a CSV file contains a series of numerical values, those 
values are the data, the totality of the data is the dataset and the 
column and row headings are the metadata. Again emphasizing the context, 
the simple 'metadata is data about data' definition works. But, to 
recycle a sentence from 1997:</p>
<blockquote>The distinction between "data" and "metadata" is not an 
absolute one; it is a distinction created primarily by a particular 
application, and many times the same resource will be interpreted in 
both ways simultaneously. [RDF-INTRO]</blockquote>
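
To make the CSV example above concrete, here is a minimal Python sketch. The CSV content and the variable names are invented for illustration, and the split shown is just one reading of "data", "dataset" and "metadata" in that specific context, not a definition.

```python
import csv
import io

# Hypothetical CSV content, invented for this illustration.
CSV_TEXT = "station,2013,2014\nA,1.5,2.0\nB,3.0,4.5\n"

rows = list(csv.reader(io.StringIO(CSV_TEXT)))

# In this context the headings are the metadata...
column_headings = rows[0][1:]
row_headings = [row[0] for row in rows[1:]]

# ...and the numerical values are the data. Taken together,
# the values form the dataset.
dataset = [[float(value) for value in row[1:]] for row in rows[1:]]
```

A different application could just as reasonably treat the headings as data, which is the point of the 1997 quotation.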
<p>Imagine a system that scrapes the Web site of an online shop, adds 
extra pictures and details and then publishes the resulting information 
through an API. As far as the online shop is concerned, the original 
data is metadata about the products on sale, but to the person scraping 
the site, the metadata is now the data and the enriched data must now be 
described with new metadata as part of the API documentation. In this 
sequence, the data consumer also becomes a data publisher.</p>
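
The shop scenario can be sketched in the same way. This is only an illustration under stated assumptions: the record, the `enrich` helper and the field names are all hypothetical, and no real shop or API is implied.

```python
# Hypothetical record. To the shop, this is metadata about a product on sale.
shop_metadata = {"sku": "X1", "name": "Kettle", "price": 25.0}


def enrich(record, extra):
    """Re-user's view: the shop's metadata is now the data being enriched."""
    merged = dict(record)  # copy, so the shop's original record is untouched
    merged.update(extra)
    return merged


# The re-user adds an extra picture and publishes the result as data.
api_data = enrich(shop_metadata, {"image": "kettle.jpg"})

# The enriched data needs new metadata of its own, for example
# a field list that would go into the API documentation.
api_metadata = {"fields": sorted(api_data), "derived_from": "shop listing"}
```

The same bytes in `shop_metadata` are metadata to the shop and data to the re-user, who in turn becomes a publisher by describing `api_data` with `api_metadata`.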
<p><strong>Therefore</strong>, in order to present a coherent set of 
best practices, the working group takes the view that the same artifacts 
(the same bytes) may be thought of as data in one context, metadata in 
another, or indeed both simultaneously. Any re-user may be a publisher, 
again, perhaps simultaneously. However, in context:</p>

Data...

Metadata...


"RDF-INTRO":{
         "authors":["Ora Lassila"],
         "href":"http://www.w3.org/TR/NOTE-rdf-simple-intro",
         "title":"Introduction to RDF Metadata",
         "status":"Note",
         "publisher":"W3C",
         "date":"13 November 1997"
        }



[1] http://yaso.is/dwbp/glossary.html

-- 


Phil Archer
W3C Data Activity Lead
http://www.w3.org/2013/data/

http://philarcher.org
+44 (0)7887 767755
@philarcher1

Received on Friday, 24 April 2015 13:34:06 UTC