Re: Glossary defn of dataset, metadata

Yes, I liked that too!

Phil, can I give you access to my fork of dwbp? I still have questions 
about the right place to put your text: in the BP doc or in the 
Glossary.

Also, I still miss a stronger connection between our lifecycle and these 
mental models. For me, the scope delineated by Deirdre needs to be 
explicitly connected with the lifecycle that is cited in the BP doc at:

"This section contains the best practices to be used by data publishers 
in order to help them and data consumers to overcome the different 
challenges faced during the data on the Web lifecycle."

BTW, shouldn't the word lifecycle contain a link to the image proposed 
by Bernadette [1]? It is difficult to identify which lifecycle we are 
referring to...



Yaso

[1] https://github.com/w3c/dwbp/blob/gh-pages/images/lifecyclesvg.svg



On 04/24/2015 11:05 AM, Annette Greiner wrote:
> I think this is great. I really like the way you describe the example. However, the bit about the overlap between data and metadata is a large amount of text for a very fine point. Could we keep that bit to one or two sentences at most? Right now I feel like the single biggest barrier to use of our document is its length.
> -Annette
> --
> Annette Greiner
> NERSC Data and Analytics Services
> Lawrence Berkeley National Laboratory
> 510-495-2935
>
> On Apr 24, 2015, at 6:33 AM, Phil Archer <phila@w3.org> wrote:
>
>> Eating some of my own dogfood...
>>
>> Yaso asked me for comment on her work on the mental models in the glossary [1].
>>
>> I sent this suggested text:
>>
>> <h2>Data, Datasets, Metadata, Publishers and Re-Users</h2>
>>
>> <p>When discussing the publication and use of data on the Web, terms like data, dataset and metadata are commonplace. In a <em>specific context</em>, the differences between the terms can be clear. For example, if a CSV file contains a series of numerical values, those values are the data, the totality of the data is the dataset and the column and row headings are the metadata. Again emphasizing the context, the simple 'metadata is data about data' definition works. But, to recycle a sentence from 1997:</p>
>> <blockquote>The distinction between "data" and "metadata" is not an absolute one; it is a distinction created primarily by a particular application, and many times the same resource will be interpreted in both ways simultaneously. [RDF-INTRO]</blockquote>
>> <p>Imagine a system that scrapes the Web site of an online shop, adds extra pictures and details and then publishes the resulting information through an API. As far as the online shop is concerned, the original data is metadata about the products on sale, but to the person scraping the site, the metadata is now the data and the enriched data must now be described with new metadata as part of the API documentation. In this sequence, the data consumer becomes a data publisher too of course.</p>
>> <p><strong>Therefore</strong>, in order to present a coherent set of best practices, the working group takes the view that the same artifacts (the same bytes) may be thought of as data in one context, metadata in another, or indeed both simultaneously. Any re-user may be a publisher, again, perhaps simultaneously. However, in context:</p>
>>
>> Data...
>>
>> Metadata...
>>
>>
>> "RDF-INTRO":{
>>         "authors":["Ora Lassila"],
>>         "href":"http://www.w3.org/TR/NOTE-rdf-simple-intro",
>>         "title":"Introduction to RDF Metadata",
>>         "status":"Note",
>>         "publisher":"W3C",
>>         "date":"13 November 1997"
>>        }
>>
>>
>>
>> [1] http://yaso.is/dwbp/glossary.html
>>
>> -- 
>>
>>
>> Phil Archer
>> W3C Data Activity Lead
>> http://www.w3.org/2013/data/
>>
>> http://philarcher.org
>> +44 (0)7887 767755
>> @philarcher1
>>
>

Received on Friday, 24 April 2015 19:26:55 UTC