Re: Glossary defn of dataset, metadata

On 04/24/2015 04:46 PM, Phil Archer wrote:
> Hi,
>
> On 24/04/2015 20:26, yaso@nic.br wrote:
>> Yes, I liked that too!
>>
>> Phil, can I give you access to my fork of dwbp?
>
> I'd rather not. If you add the glossary doc to the WG's own repo I can 
> get it and edit it there. Don't be shy! If you want me to I can add 
> the ReSpec stuff and the IDs for the <dt> elements - which are 
> essential so that when terms are used in the other docs we can link to 
> the term. But, see below...

Okok, I'll do that!

>
>> I still have questions about the right place to put your text: in the BP
>> doc or in the Glossary.
>
> I wrote that text with the glossary in mind, not the BP doc.
>
>>
>> And I still miss more connection between our lifecycle and these mental
>> models.
>
> In my mind - and it is only my mind - the examples you wrote, one of 
> which I used, are the mental models. I hope the text written today 
> actually shows all we need to show to prove that one person's metadata 
> is another person's data and that consumers become publishers. For me 
> that's enough - that and the basic CSV example *are* the mental models 
> and that's all we need for the glossary which is meant to just be a 
> set of terms.
>
> I'd leave discussion of the lifecycle to the BP doc
>
>> For me, the scope delineated by Deirdre needs to be explicitly
>> connected with the lifecycle that is cited in the BP doc at:
>>
>> "This section contains the best practices to be used by data publishers
>> in order to help them and data consumers to overcome the different
>> challenges faced during the data on the Web lifecycle."
>>
>> BTW, shouldn't the word lifecycle contain a link to the image proposed
>> by Bernadette[1]? It is difficult to identify which lifecycle we are
>> referring to...
>
> I thought we took the lifecycle out of the BP doc, no? If Berna finds 
> it useful she can put it back but, again, I don't think any of that 
> belongs in the glossary which is just a list of terms and definitions 
> - no?
>
Yes, I thought that too, but when reading the doc I found the expression 
"data on the Web lifecycle" and searched for its definition. As the 
figure is still in our repository on GitHub, I assumed that we 
were still working with this concept. Maybe we should put all the 
old stuff in one directory and write a disclaimer...

> But... we could up the geek stuff here. How about creating a JSON 
> object with the definitions and put that on the web separately. Then 
> we can easily use it to auto-generate the glossary and use it to 
> create mouseovers for the terms when they're used in the other docs.

That's a fun idea, I can do that, but it will be finished on Thursday :-)
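
To make the idea concrete, here is a minimal sketch of how a JSON object of definitions could drive both the glossary and the mouseovers. Everything in it is an assumption for illustration only: the JSON layout, the `dfn-` id prefix, the `glossary.html` target and the function names are hypothetical, and the three definitions just paraphrase the CSV example from the suggested text rather than any agreed WG wording.

```python
import html
import json

# Hypothetical glossary source: term -> definition.
# The wording paraphrases the CSV example; it is not agreed WG text.
GLOSSARY_JSON = """
{
  "data": "In the CSV example, the numerical values in the file.",
  "dataset": "In the CSV example, the totality of the data.",
  "metadata": "In the CSV example, the column and row headings."
}
"""


def term_id(term):
    """Build an id for a <dt>, e.g. 'Data on the Web' -> 'dfn-data-on-the-web'.

    The 'dfn-' prefix is an assumed convention, not one taken from ReSpec.
    """
    return "dfn-" + "-".join(term.lower().split())


def render_glossary(definitions):
    """Render the definitions as a <dl>, one <dt id=...>/<dd> pair per term."""
    lines = ["<dl>"]
    for term in sorted(definitions):
        lines.append('  <dt id="{}">{}</dt>'.format(
            html.escape(term_id(term)), html.escape(term)))
        lines.append("  <dd>{}</dd>".format(html.escape(definitions[term])))
    lines.append("</dl>")
    return "\n".join(lines)


def link_term(term, definitions, glossary_url="glossary.html"):
    """Return a link to the term; the title attribute acts as the mouseover."""
    return '<a href="{}#{}" title="{}">{}</a>'.format(
        glossary_url,
        html.escape(term_id(term)),
        html.escape(definitions[term]),
        html.escape(term))


definitions = json.loads(GLOSSARY_JSON)
glossary_html = render_glossary(definitions)
print(glossary_html)
print(link_term("metadata", definitions))
```

The same ids serve both purposes: the generated `<dt>` elements carry them, and the links in the other docs point at them, so the JSON stays the single source for terms and definitions.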
>
> WDYT?
>
> Phil
> (Signing off - late here)
>
>>
>>
>>
>> Yaso
>>
>> [1] https://github.com/w3c/dwbp/blob/gh-pages/images/lifecyclesvg.svg
>>
>>
>>
>> On 04/24/2015 11:05 AM, Annette Greiner wrote:
>>> I think this is great. I really like the way you describe the example.
>>> However, the bit about the overlap between data and metadata is a
>>> large amount of text for a very fine point. Could we keep that bit to
>>> one or two sentences at most? Right now I feel like the single biggest
>>> barrier to use of our document is its length.
>>> -Annette
>>> -- 
>>> Annette Greiner
>>> NERSC Data and Analytics Services
>>> Lawrence Berkeley National Laboratory
>>> 510-495-2935
>>>
>>> On Apr 24, 2015, at 6:33 AM, Phil Archer <phila@w3.org> wrote:
>>>
>>>> Eating some of my own dogfood...
>>>>
>>>> Yaso asked me for comment on her work on the mental models in the
>>>> glossary [1].
>>>>
>>>> I sent this suggested text:
>>>>
>>>> <h2>Data, Datasets, Metadata, Publishers and Re-Users</h2>
>>>>
>>>> <p>When discussing the publication and use of data on the Web, terms
>>>> like data, dataset and metadata are commonplace. In a <em>specific
>>>> context</em>, the differences between the terms can be clear. For
>>>> example, if a CSV file contains a series of numerical values those
>>>> values are the data, the totality of the data is the dataset and the
>>>> column and row headings are the metadata. Again emphasizing the
>>>> context, the simple 'metadata is data about data' definition works.
>>>> But, to recycle a sentence from 1997:</p>
>>>> <blockquote>The distinction between "data" and "metadata" is not an
>>>> absolute one; it is a distinction created primarily by a particular
>>>> application, and many times the same resource will be interpreted in
>>>> both ways simultaneously. [RDF-INTRO]</blockquote>
>>>> <p>Imagine a system that scrapes the Web site of an online shop, adds
>>>> extra pictures and details and then publishes the resulting
>>>> information through an API. As far as the online shop is concerned,
>>>> the original data is metadata about the products on sale, but to the
>>>> person scraping the site, the metadata is now the data and the
>>>> enriched data must now be described with new metadata as part of the
>>>> API documentation. In this sequence, the data consumer becomes a data
>>>> publisher too of course.</p>
>>>> <p><strong>Therefore</strong>, in order to present a coherent set of
>>>> best practices, the working group takes the view that the same
>>>> artifacts (the same bytes), may be thought of as data in one context,
>>>> metadata in another, or indeed both simultaneously. Any re-user may
>>>> be a publisher, again, perhaps simultaneously. However, in 
>>>> context:</p>
>>>>
>>>> Data...
>>>>
>>>> Metadata...
>>>>
>>>>
>>>> "RDF-INTRO":{
>>>>         "authors":["Ora Lassila"],
>>>>         "href":"http://www.w3.org/TR/NOTE-rdf-simple-intro",
>>>>         "title":"Introduction to RDF Metadata",
>>>>         "status":"Note",
>>>>         "publisher":"W3C",
>>>>         "date":"13 November 1997"
>>>>        }
>>>>
>>>>
>>>>
>>>> [1] http://yaso.is/dwbp/glossary.html
>>>>
>>>> -- 
>>>>
>>>>
>>>> Phil Archer
>>>> W3C Data Activity Lead
>>>> http://www.w3.org/2013/data/
>>>>
>>>> http://philarcher.org
>>>> +44 (0)7887 767755
>>>> @philarcher1
>>>>
>>>
>>
>>
>

Received on Monday, 27 April 2015 19:14:13 UTC