Re: Review of BP on data re-use

From: Deirdre Lee <deirdre@derilinx.com>
Date: Wed, 23 Mar 2016 11:15:28 +0000
To: public-dwbp-wg@w3.org
Message-ID: <56F27AD0.4040809@derilinx.com>

I'm afraid I won't be on the call today, but in terms of BP on re-use:

Overall, I don't think there should be a new section for reuse.

The only way this BP makes sense to me is if it is presented in the 
sense of published data being part of a wider data lifecycle: 
publish-use-republish-reuse, etc.
This may fit into the data enrichment section:
'Data enrichment refers to a set of processes that can be used to 
enhance, refine or otherwise improve raw or previously processed data.'

My suggestion: If we keep the BP, it should sit in the data enrichment 
section, and its content should be updated to emphasise the data lifecycle.

I'm sure it'll be an interesting call today!

On 22/03/2016 22:06, Annette Greiner wrote:
> Responding to some of the comments on the data reuse BP.
> Re the ideas from Europeana, the appropriateness of making sure that 
> the license travels with the data depends on the license itself. Some 
> specify that derivative works follow the same license or a compatible 
> license. Getting that right is part of following the license 
> requirements. I think Antoine's thought about keeping the data up to 
> date is a good one, though that's covered in the original BP about 
> data-up-to-date, since we say to update data when the source is 
> updated. A small reminder in the reuse BP would seem fitting. The 
> Europeana page also mentions the case where the reuser changes 
> something about the data, saying that one should mention what was 
> changed. I think that would be another idea worth mentioning.
> Re whether it's in scope, this goes back to the original discussion 
> about who our audience is. I would never have argued that our audience 
> was specifically publishers if I didn't also believe that re-users, or 
> re-publishers, are part of that group. Our charter charges us with 
> "facilitating better communication between developers and publishers." 
> We've recognized that developers are publishers, too, but we haven't 
> addressed the original issue, which is really poor communication 
> between original publishers and re-publishers. We haven't addressed 
> anything that applies particularly to the challenge of re-publishing. 
> In this BP, we finally do that. I feel that, if we were to leave it 
> out, the list of BPs could leave publishers who are not also 
> re-publishers feeling that they are the only ones tasked with 
> improving their behavior. Communication is a two-way street, and I 
> think addressing re-publishers is something we need to do to maintain 
> balance. Having thought about this, I would not be comfortable 
> publishing a BP list without these ideas in it. I am far, far less 
> concerned about issues of scope than issues of balance and fairness, 
> but I think this BP is firmly in scope and necessary.
> Re the idea that we should split it into 2 BPs. The two-way split 
> doesn't strike me as logical, because citing and providing feedback 
> are two completely different tasks. I could possibly imagine splitting 
> it into three BPs, as there are three components to reusing 
> respectfully (at least currently). However, I'm not sure what we gain 
> by splitting them up and putting them into other sections. That makes 
> it difficult for users to find advice on what to do when they are 
> reusing someone else's data. We could possibly split them into three 
> and have all three in a new section, but I'm not sure there are really 
> three unique BPs-worth of things to say about them. Either way, this 
> is really a new challenge: how to reuse with consideration for the 
> original publisher, and worthy of a new section.
> It's an interesting idea to put the ideas about reuse into existing 
> BPs, but I think that would force us to try and stuff two different 
> ideas into each BP. We would have to find a partner BP for each one 
> and rewrite. Supposing we felt it was worth that effort, we would end 
> up with BPs that are trying awkwardly to encompass two different 
> ideas. Keeping them separate helps understanding and keeps the BPs 
> from becoming overloaded. It is one task to provide a channel for 
> communication; it is quite another to use it, and it's still another 
> to cite a source. Similarly, it is one task to provide a license; it 
> is quite another to follow it.
> -Annette
> On 3/22/16 9:25 AM, Laufer wrote:
>> Hi All,
>> I do not agree with a new section and a new BP about data reuse.
>> I think that the aspects of reuse that are mentioned in the new BP 
>> are covered by the BPs in our list: license, provenance and feedback.
>> If someone wants to use, or reuse, data, she has to think about these 
>> aspects and has to do what our BPs recommend.
>> If the group thinks that these aspects should be highlighted, I think 
>> that we can include this information in the original BPs.
>> If we talk about BPs for reuse, we will need to look at all the other 
>> aspects of publication, for example, how versioning will be 
>> treated, how sensitive data will be treated, how the use of new 
>> vocabularies will be compatible with the vocabularies used in the 
>> reused data, and so on.
>> I do not like the idea that reuse is not use. I think that in some 
>> sense we are thinking that the only one that uses data is the final 
>> user. But I think that the final user does not use data. She asks a 
>> question that someone that uses data will try to answer.
>> All of our BPs include the benefit of Reuse. We do not even talk 
>> about the benefit of Use.
>> For me, our BPs cover the publishing of data that will be used. Or 
>> reused, as you wish. I do not think we have to split in different BPs.
>> Cheers, Laufer
>> Bes
>> ---
>> .  .  .  .. .  .
>> .        .   . ..
>> .     ..       .
>> Em 22/03/2016 12:50, Bernadette Farias Lóscio escreveu:
>>> Hi all,
>>> Considering that tomorrow we need to vote on whether to include the BP 
>>> about Data Re-use [1] in the BP document, I'd like to make some 
>>> observations.
>>> I agree with Antoine that "a lot of the aspects of this BP are 
>>> non-technical, so I'm not 100% sure it's in scope." However, I also 
>>> like the idea of the BP and I'd like to make a proposal.
>>> In my opinion, the Data Reuse BP should be split into two different 
>>> BPs: one for data licenses and another one for Citation and Feedback. 
>>> We already have a section about data licenses, so I think it would 
>>> be better to create a new BP considering the aspects mentioned by 
>>> Antoine and Annette. If reusing is also a way of publishing data, 
>>> then I think it won't be a problem.
>>> The second BP will focus on providing citation and feedback. I also 
>>> believe that there are other aspects that should be considered. Annette's 
>>> proposal mentions that publishers "should be made aware of any known 
>>> problems with the data". However, feedback can be used to provide 
>>> other information about the dataset and not just to provide 
>>> feedback about the problems. It is also really important to mention 
>>> the Dataset Usage Vocabulary and to provide examples based on our 
>>> own vocabulary.
>>> In this case, we can also change the title of the section Feedback 
>>> to be something like Feedback and Citation.
>>> In summary, my proposal is:
>>> - Split the Data Reuse BP into two BPs:
>>> BP: Follow licensing constraints, to be included in the Data Licenses 
>>> Section
>>> BP: Cite the original dataset and give feedback (this could also be 
>>> split into two other BPs: i) BP Cite the original dataset and ii) 
>>> BP Give feedback)
>>> - Rename the Feedback Section to Feedback and Citation.
>>> Doing this, we also avoid the creation of a new section. Again, if 
>>> reusing is a way of publishing, then I don't think that we should have a 
>>> new section for this subject.
>>> kind regards,
>>> Berna
>>> [1] http://agreiner.github.io/dwbp/bp.html#Re-use
>>> 2016-03-16 9:19 GMT-03:00 Antoine Isaac <aisaac@few.vu.nl>:
>>>     Hi everyone,
>>>     I've just received the email with the editors asking for this:
>>>         2. To review the Best Practice: Reuse vocabularies [3] ,
>>>         which will be voted next Wednesday.
>>>     This is excellent timing, I've just read it while catching up
>>>     with the minutes of yesterday's session ;-)
>>>     My feedback will be quick though (not much time to write a clean
>>>     text!):
>>>     1. a lot of the aspects of this BP are non-technical, so I'm not
>>>     100% sure it's in scope. But there are some technical aspects
>>>     involved, and see point #2.
>>>     2. I do like the BP a lot. This makes a lot of sense
>>>     3. my strong recommendation about licensing would be that
>>>     re-users should make sure that any license or terms of use
>>>     'travels' with the data. If reusers do something with the data,
>>>     they make sure it's compatible with the license and terms of
>>>     use. This includes (re-)publishing of data, or of derived data
>>>     when applicable. Especially re-users of derived or re-published
>>>     data must be aware of the original license and terms of use
>>>     4. my organization (Europeana) has made terms of use that could
>>>     be used as example. Our data is CC0, so there's no license
>>>     whatsoever. But because attribution and provenance matter in our
>>>     sector (culture) we wanted to encourage people to be 'respectful'.
>>>     It's at http://www.europeana.eu/portal/rights/metadata.html
>>>     I think it exemplifies quite a lot the aspects of Annette's BP
>>>     proposal.
>>>     5. the Europeana TOU include one technical aspect that could be
>>>     strengthened in the BP, imho. Re-users should make sure they
>>>     keep their data (or application) synchronized with the most
>>>     up-to-date status of the original source. If someone builds and
>>>     keeps something on the basis of old data, and lets their own
>>>     re-users think the original data source is responsible for
>>>     problems of outdated data, this is not fair to the original
>>>     data publisher.
>>>     Cheers,
>>>     Antoine
>>>     [3] http://agreiner.github.io/dwbp/bp.html#Re-use
>>> -- 
>>> Bernadette Farias Lóscio
>>> Centro de Informática
>>> Universidade Federal de Pernambuco - UFPE, Brazil
>>> ----------------------------------------------------------------------------
> -- 
> Annette Greiner
> NERSC Data and Analytics Services
> Lawrence Berkeley National Laboratory

Deirdre Lee, CEO & Founder
Derilinx - Linked & Open Data Solutions
Web:      www.derilinx.com
Email:    deirdre@derilinx.com
Address:  11/12 Baggot Court, Dublin 2, D02 F891
Tel:      +353 (0)1 254 4316
Mob:      +353 (0)87 417 2318
Linkedin: ie.linkedin.com/in/leedeirdre/
Twitter:  @deirdrelee
Received on Wednesday, 23 March 2016 11:16:34 UTC
