Re: audience for the BP doc

Hi Bernadette, thanks for your response.
See my comments inline.
-Annette

On Dec 16, 2014, at 12:01 PM, Bernadette Farias Lóscio <bfl@cin.ufpe.br> wrote:

> yes... I agree with you that until now we don't have BP for use by consumers of data. Since there are some requirements concerning data usage [1], we were planning to have some BP related to feedback and data usage. However, if we decide to focus just on data publishing, then they won't be developed. 
Maybe the data usage description vocabulary document can have the usage info for consumers.
> 
> I was reading the Introduction [2] again and I think that the phrases below are the ones that mention data publishers and data consumers. Just to make it clearer, could you please tell me whether you agree with them:
> 
> "This document sets out a series of best practices that will help publishers and consumers face the new challenges and opportunities posed by data on the Web."
disagree. This suggests that the BPs are also written for consumers.
I also think this is a little awkward, since data has been on the web for a long time, and I still don’t really understand why you want to use the word “consumers” here. The charter uses “developers”, as Eric astutely pointed out. You could say “This document sets out a series of practices intended to aid publishers in making their data more discoverable, usable, and trustworthy, and thus enabling developers to readily reuse it.”
[Note that the charter says we aim “to foster trust in the data among developers”, not among journalists or scientists or the interested lay public (other consumers).] That paragraph also refers to the data lifecycle as data’s “life on the web”. I don’t think the data lifecycle all occurs on the web.
> 
> "Best practices cover different aspects related to data publishing and consumption, like data formats, data access, data identification and metadata."
agree
> 
> and just one more question: do you agree that the BP may help to have a common understanding between data publishers and data consumers?
agree, but we are concerned in particular with understanding between publishers and developers.
> 
> Thanks again!
> Bernadette
> 
> 
> [1] http://www.w3.org/TR/dwbp-ucr/#h4_can-req-usage
> [2] http://w3c.github.io/dwbp/bp.html#intro
> 
> 
> 2014-12-16 16:43 GMT-03:00 Annette Greiner <amgreiner@lbl.gov>:
> I think the introduction is not suitable because it says that we are writing BPs for use by consumers of data, but none of our current BPs is written as a BP on which consumers (other than those who are re-publishing, and are therefore publishers) can take action. They do not address consumers as an audience.
> -Annette
> --
> Annette Greiner
> NERSC Data and Analytics Services
> Lawrence Berkeley National Laboratory
> 510-495-2935
> 
> On Dec 16, 2014, at 11:34 AM, Bernadette Farias Lóscio <bfl@cin.ufpe.br> wrote:
> 
>> Hi Annette,
>> 
>> Thank you for your answer! My comments are inline.
>> 
>> I think we need to have non-normative material that matches our normative material. This discussion started up because we have a disconnect there.
>> 
>> The first four sections of the document are non-normative and the idea is to use them to explain our context and to give definitions that are relevant for readers to understand the document. Maybe, instead of having a separate document we should try to improve these sections.
>>  
>> If we want to keep the introduction as is, we would need to change the best practices we are developing, broadening the scope considerably. I think it’s much less work to make the introduction work for the content it’s meant to introduce.
>> 
>> Could you please explain why the introduction is not suitable for the BP that will be developed? I'm sorry, but this is not clear to me.
>> 
>> It is important to note that BP will be developed according to the challenges/requirements identified in the Use Cases Document [1].
>>  
>> I’d be happy to take a stab at rewriting if you like. My feeling is that it doesn’t really need to change all that much, because we do want to still mention the importance of considering usage when you publish. (BTW, I think we should be trying to get publishers to think of putting data on the web as more than merely hosting files and administering the data. In fact, we have a list of things they should be thinking about: the best practices document.)
>> 
>> I agree with you! Data publishers have a really hard job making data available on the Web, and that's why the BP document is being proposed.
>> 
>> kind regards,
>> Bernadette
>> 
>>  
>> [1] http://www.w3.org/TR/dwbp-ucr/
>> 
>> -Annette
>> 
>> --
>> Annette Greiner
>> NERSC Data and Analytics Services
>> Lawrence Berkeley National Laboratory
>> 510-495-2935
>> 
>> On Dec 16, 2014, at 10:26 AM, Bernadette Farias Lóscio <bfl@cin.ufpe.br> wrote:
>> 
>>> Hi all,
>>> 
>>> Thanks for your comments! 
>>> 
>>> I agree with Makx that it could be a good idea to concentrate on the audience of data providers (data publishers). However, if we do this then the whole discourse that was built until now has to be changed because we are always talking about data publication and data usage. For example, the first sentence of the abstract says: "This document provides best practices related to the publication and usage of data on the Web designed to help support a self-sustaining ecosystem".
>>> 
>>> Moreover, the document is about "Data on the Web Best Practices" and not only about "Publishing Data on the Web Best Practices". 
>>> 
>>> As proposed in the charter, the mission of our group includes: "to develop the open data ecosystem, facilitating better communication between developers and publishers;". In this sense, I think that it is also important to tell developers (or data consumers in general) how they can interact with data publishers, i.e., how they can provide feedback to data publishers and also how they can provide information that helps to find out how data has been used.
>>> 
>>> However, before we decide if we're gonna abandon the BP for data consumers, I think it is really important to have an agreement about the role of data publishers and data consumers.
>>> 
>>> From my point of view, a data consumer is someone who wants to use data available on the Web to produce "something" instead of just reading the data. For example, when a developer uses raw data available on the Web to develop an application, then the developer plays the role of a data consumer and not the role of a data publisher.
>>> 
>>> Concerning data publishers, I agree with Eric that "Publishers just focus on hosting and administering their data on the web in an orderly way".
>>> 
>>> kind regards,
>>> Bernadette
>>> 
>>> 
>>> 2014-12-16 8:36 GMT-03:00 Makx Dekkers <mail@makxdekkers.com>:
>>> Eric, Annette, all,
>>> 
>>> To me, it would make sense if we concentrated on the audience of data providers, at least for now. I think this is already a big order.
>>> 
>>> If we also want to cover best practices for the re-users of data (developers, aggregators, mix-and-matchers, brokers, whatever you want to call them), we’ll be spreading a scarce resource (ourselves) even thinner, and run the risk of producing two sets of insufficient quality.
>>> 
>>> Let’s focus on the data providers first and then, when we have a good set of best practices and still have time left, turn our attention to the consumer side of the picture.
>>> 
>>> Makx.
>>> 
>>> 
>>> 2014-12-16 6:29 GMT+01:00 Eric Stephan <ericphb@gmail.com>:
>>> Thanks Annette for sharing your thoughts on this topic in the meeting last week and in this email. In your text the term consumers really jumped out at me. If consumers only has a read-only connotation then I'd rather avoid this term altogether. Actually, consumers was never mentioned originally as part of the working group mission; instead the term "developer" was used.
>>> 
>>> Developers, to me, are technologists building applications and devices that reuse published data, including creating new data that can be published, processing and modifying published data, or strictly reading data in the life span of a running application. Users rely on the tools created by publishers and developers to edit published data and provide feedback. Publishers, to me, just focus on hosting and administering their data on the web in an orderly way. Since the original intent of BP was to "facilitate better communication between developers and publishers," maybe there should be best practices that target publishers and developers, divided into two documents.
>>> 
>>> The closest analogy is that for off-the-shelf data storage systems, two types of documentation are written, one for each audience:
>>> 1) Data administrators who manage the data system
>>> 2) End users (developers) who write applications that interact with the data system
>>> 
>>> Thanks,
>>> 
>>> Eric S
>>>  
>>> 
>>> On Mon, Dec 15, 2014 at 1:08 PM, Annette Greiner <amgreiner@lbl.gov> wrote:
>>> Hi folks,
>>> To pick up the discussion about our audience, I want to set down what I see as our audience for the current BP document. By audience I mean the people we expect to actually sit down and read it, not the people whose interests we need to consider in creating it (those are what I call stakeholders). It’s possible that we all agree but are just thinking of the terms differently.
>>> 
>>> To my mind, our audience includes anyone involved in making data available to consumers on the web. That is publishing data. It includes anyone who collects or collates the data, organizes the data, creates web pages or apps to share the data, re-publishes it in such a way that others can re-use it, or makes decisions relevant to how people do those tasks. They could be developers, lawyers, CIOs, researchers, archivists, designers, almost any job title. What matters, though, is not their job title but what actions they take with respect to the data. The action of consuming it is not what we have been discussing, it isn’t represented in any of the current best practices or in our scoping criteria, and it isn’t called for in the charter’s requirement to create a BP document. Thus far, we are not targeting our BPs to people who are *only* consuming the data and not republishing it.
>>> 
>>> I’ve already talked about the charter and the existing BPs in a previous email, so I’ll just address the scoping criteria here. The first one, being unique to publishing on the web, is obviously about publishing rather than consuming. The second one, encouraging reuse, is also about publishing, just in such a way that someone else can make use of the data. The charter mentions re-use in its mission in list item 2, which calls on us to "provide _guidance_to_publishers_ that will improve consistency in the way data is managed, thus promoting the re-use of data". If a consumer wants to publish something that makes the data truly re-usable, they must include the data itself, which means that they are publishing the data. The third criterion, testability, simply deals with the mechanics of making sure that one is successful in achieving the best practices.
>>> 
>>> It might help to consider an example: your organization publishes data about traffic in Rio. It's made available through an API. A data scientist in Lisbon is interested in the data and makes a visualization based on it that she posts on her blog. The data scientist does not make the data available in any form other than the visualization itself. She has not really enriched your data, because the original data still has no connection to the visualization. She cannot take action on any of the best practices we have identified thus far unless she re-publishes it herself, as data.
>>> 
>>> Your organization could link to the visualization, thereby enriching the data, but the data scientist in Lisbon cannot force it to do that. Our best practice around data enrichment calls on publishers to consider making that link or creating the visualization themselves. If we were writing that same best practice for a consumer audience, it would have to say something like "you should enrich other people's data". So, we would end up telling data enrichers that they should enrich data, which strikes me as tautological. One could go into detail about how to make good visualizations (use good labels, don’t rely on color alone, provide a zero point in your scales, etc.), but that seems to me out of scope. (I teach an entire semester course on visualization, so I could come up with lots of best practices about it, but I don't think we want to go there in the BP document we’ve been working on.)
>>> 
>>> Now suppose the consumer in Lisbon would like to provide feedback. If we, as the publisher, have not provided a mechanism for them to do so, they cannot provide it. Our best practice is about making it possible to provide feedback and then acting on the feedback to improve the published data. A consumer has a role here, but again, there is little point to telling a consumer who wants to give feedback that they should give feedback. I certainly wouldn’t expect a data consumer to wade through a long list of publisher-oriented best practices to be told that they should give feedback whenever they are so inclined.
>>> 
>>> I would support the idea of putting together a separate list of best practices for data consumers if we can think of a way to scope it that works.
>>> 
>>> -Annette
>>> 
>>> 
>>> --
>>> Annette Greiner
>>> NERSC Data and Analytics Services
>>> Lawrence Berkeley National Laboratory
>>> 510-495-2935
>>> 
>>> 
>>> 
>>> 
>>> -- 
>>> --------------------------------------------------------------------------------
>>> Makx Dekkers
>>> mail@makxdekkers.com 
>>> --------------------------------------------------------------------------------
>>> 
>>> 
>>> -- 
>>> Bernadette Farias Lóscio
>>> Centro de Informática
>>> Universidade Federal de Pernambuco - UFPE, Brazil
>>> ----------------------------------------------------------------------------

Received on Tuesday, 16 December 2014 22:47:21 UTC