Re: Fwd: Best Practice 26.docx from Annette Greiner on 2015-12-22 (public-dwbp-wg@w3.org from December 2015)

From: Annette Greiner <amgreiner@lbl.gov>
Date: Mon, 21 Dec 2015 18:13:55 -0800
To: Peter.Winstanley@gov.scot, public-dwbp-wg@w3.org
Message-ID: <5678B1E3.3060407@lbl.gov>
Hi Peter,
Your point about APIs enabling some restriction in what users have 
access to is a good one, and I completely agree with it. That's what I 
am referring to when I talk about subsetting data. I see that as an 
important part of what makes using an API worthwhile. Your text seemed 
to be saying that an API was a way to avoid subsetting, which may have 
just been an ambiguity in the phrasing.

Regarding simplicity of use, I admit that in many circumstances using an 
API can be simpler than using an alternative, but that is not always the 
case, and I don't think it's the true advantage of an API. The true 
value is that it is an intentionally built programming interface. One 
needn't rely on brittle scraping or complex workflows that involve 
downloading and parsing through unnecessary data to force it to 
interface with the rest of one's code.
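
To make that contrast concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the station data, the column names, and the data.example.org URL are hypothetical and do not come from the BP draft. The first function is the file-download route, where all rows are parsed and most are thrown away; the second only builds the request that would ask a subsetting API for the rows of interest.

```python
import csv
import io
from urllib.parse import urlencode

# Hypothetical dataset standing in for a full file download.
FULL_CSV = """station,year,temp_c
A,2014,11.2
A,2015,11.9
B,2014,9.4
B,2015,9.8
"""

def subset_from_download(csv_text, station, year):
    """File-download route: parse every row, then discard what is not needed."""
    rows = csv.DictReader(io.StringIO(csv_text))
    return [r for r in rows if r["station"] == station and r["year"] == str(year)]

def subset_query_url(base_url, station, year):
    """API route: build a request that asks the server for only the subset."""
    return base_url + "?" + urlencode({"station": station, "year": year})

print(subset_from_download(FULL_CSV, "B", 2015))
print(subset_query_url("https://data.example.org/api/observations", "B", 2015))
```

With a four-row file the difference is trivial, but the client-side filter scales with the size of the whole dataset, while the API request scales with the size of the answer.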

My concern with the question of how difficult it is to set up an API is 
that we remain realistic about the effort involved. Most people will be 
comparing the option of setting up an API against the option of simply 
posting datasets for download as files. The latter is definitely easier 
to do, so it would not be accurate to offer building an API as an easier 
option. Admittedly, if the infrastructure is already in place, enabling 
an API in a data management system can be pretty easy, though setting it 
up correctly and documenting things takes a bit of work and an 
understanding of what's going on under the hood. Compared to copying a 
file into a directory, it's a step up.
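
As a rough illustration of that step up, the sketch below shows only the query-handling core of a very small read-only API, in Python. It is a sketch under stated assumptions, not a recommended implementation: the in-memory rows, the parameter names, and the row cap are all hypothetical, and a real deployment would still need an HTTP server, documentation, and monitoring around it. Restricting the accepted parameters and capping the result size is one way to constrain the queries a service will answer.

```python
import json
from urllib.parse import parse_qs

# Hypothetical in-memory data standing in for a real datastore.
ROWS = [
    {"station": "A", "year": 2014, "temp_c": 11.2},
    {"station": "A", "year": 2015, "temp_c": 11.9},
    {"station": "B", "year": 2014, "temp_c": 9.4},
    {"station": "B", "year": 2015, "temp_c": 9.8},
]
ALLOWED_FILTERS = {"station": str, "year": int}  # the only filters accepted
MAX_LIMIT = 2  # hypothetical cap on rows per response, to protect the service

def handle_query(query_string):
    """Return (status, JSON body) for a query string such as 'station=B&limit=1'."""
    # Keep the last value when a parameter is repeated.
    params = {k: v[-1] for k, v in parse_qs(query_string).items()}
    try:
        limit = min(int(params.pop("limit", MAX_LIMIT)), MAX_LIMIT)
        unknown = set(params) - set(ALLOWED_FILTERS)
        if unknown or limit < 1:
            raise ValueError("unsupported parameter or limit")
        # Convert each filter value to its declared type (may raise ValueError).
        wanted = {k: ALLOWED_FILTERS[k](v) for k, v in params.items()}
    except ValueError as err:
        return 400, json.dumps({"error": str(err)})
    rows = [r for r in ROWS if all(r[k] == v for k, v in wanted.items())]
    return 200, json.dumps(rows[:limit])

print(handle_query("station=B&limit=1"))
print(handle_query("sparql=SELECT"))
```

Even at this size there are decisions about validation, limits, and error responses that simply do not arise when a file is copied into a directory.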

cheers,
-Annette




On 12/21/15 4:53 AM, Peter.Winstanley@gov.scot wrote:
> Hi Annette
>
> re: resource-intensive queries: When trying to maintain a quality of service, one might want to prevent access to specific sets of queries that would involve significant table scans or high memory consumption. Using an API as an alternative to, e.g., an open SPARQL endpoint is a way of constraining the query options so as to protect the overall service.
>
> The simplicity angle is an important one insofar as it is part of the democratisation of access to data on the web.  If we can simplify the process of accessing datastores (e.g. through APIs) then a wider range of people will begin to make data-driven applications etc.
>
> The same applies to the 'elementary programming' bit. If we don't let people know that it is not rocket science to provide a simple API to some simple data sets then they may body-swerve a useful additional bit of work. Many people who commission work that gets data onto the web are driven by the need to show a website and not by the need to provide an API. We need to use the BPs paper as an opportunity not only to give guidance on "what" and "why" but also some insight into the challenges of "how" and to help people overcome any inertia preventing adoption.
>
> The re-working was done simply because, in the meeting, this was one of the elements of the document that was identified as being incomplete and in need of some work.
>
> Peter
>
> -----Original Message-----
> From: Annette Greiner [mailto:amgreiner@lbl.gov]
> Sent: 14 December 2015 21:24
> To: public-dwbp-wg@w3.org
> Subject: Re: Fwd: Best Practice 26.docx
>
> Peter,
> Thanks for working to improve this.
>
> While I like the idea of explaining what an API is for those who may be
> less familiar, we should be careful about how we define it. The main
> alternatives to an API for web developers are downloads and scraping,
> which are actually pretty simple but tedious approaches. I think the
> value of an API for web development is not so much a matter of greater
> simplicity but in having actual programmatic access, or hooks into the
> data. The point is that an API is designed to explicitly enable
> programming, whereas reusing without that requires grabbing more than
> you want and munging the data. The last sentence of the first paragraph
> suggests that REST is the only way to make an API, which is not the
> case. Let's leave that argument out of this BP, as it's handled elsewhere.
>
> The second paragraph now reiterates the simplicity concept, which I
> don't think is accurate or particularly helpful. As for protecting
> against resource-intensive subsetting, I'm not sure what you mean. The
> alternatives to using an API are not about subsetting and are not
> particularly resource intensive; subsetting is actually a virtue of
> using an API, because it allows one to download only the data needed
> (something I've been pushing for a BP about for a long time, BTW).
> Regarding other transport protocols than HTTP, I'm not sure what that
> has to do with the intended outcome.
>
> As for the third paragraph, again, I don't think we should get into the
> how-to-implement-REST discussion here. There is another BP for that.
> Also, the suggestion that creating a web API for relational data is
> "elementary programming" whereas RDF "can be provided with more
> sophisticated APIs" strikes me as potentially a bit insulting to devs
> who work with relational data.
>
> I'm curious what the goal of this reworking was. Perhaps we can find
> other ways to address the underlying issues.
> -Annette
>
> On 12/11/15 7:08 AM, Phil Archer wrote:
>> This should be in the mail archive (Peter used an alternative e-mail
>> address which is why it bounced)
>>
>>
>> -------- Forwarded Message --------
>> Subject: Best Practice 26.docx
>> Date: Fri, 11 Dec 2015 14:43:55 +0000
>> From: Peter.Winstanley@gov.scot
>> To: public-dwbp-wg@w3.org
>> CC: phila@w3.org, laufer@globo.com
>>
>>
>>
>> I have tried to make some steps to improve the BP #26 from
>> http://w3c.github.io/dwbp/bp.html#useanAPI
>>
>> Hope it is a helpful move.  If you think the direction is right then
>> let me know and I'll complete.
>>
>> Peter
>>
>>
>> **********************************************************************
>> This e-mail (and any files or other attachments transmitted with it)
>> is intended solely for the attention of the addressee(s). Unauthorised
>> use, disclosure, storage, copying or distribution of any part of this
>> e-mail is not permitted. If you are not the intended recipient please
>> destroy the email, remove any copies from your system and inform the
>> sender immediately by return.
>>
>> Communications with the Scottish Government may be monitored or
>> recorded in order to secure the effective operation of the system and
>> for other lawful purposes. The views or opinions contained within this
>> e-mail may not necessarily reflect those of the Scottish Government.
>>
>>
>> Tha am post-d seo (agus faidhle neo ceanglan còmhla ris) dhan neach
>> neo luchd-ainmichte a-mhàin. Chan eil e ceadaichte a chleachdadh ann
>> an dòigh sam bith, a’ toirt a-steach còraichean, foillseachadh neo
>> sgaoileadh, gun chead. Ma ’s e is gun d’fhuair sibh seo le
>> gun fhiosd’, bu choir cur às dhan phost-d agus lethbhreac sam bith
>> air an t-siostam agaibh, leig fios chun neach a sgaoil am post-d gun
>> dàil.
>>
>> Dh’fhaodadh gum bi teachdaireachd sam bith bho Riaghaltas na h-Alba
>> air a chlàradh neo air a sgrùdadh airson dearbhadh gu bheil an
>> siostam ag obair gu h-èifeachdach neo airson adhbhar laghail
>> eile. Dh’fhaodadh nach eil beachdan anns a’ phost-d seo co-ionann
>> ri beachdan Riaghaltas na h-Alba.
>> **********************************************************************
>>
>>
>>
>> The original of this email was scanned for viruses by the Government
>> Secure Intranet virus scanning service supplied by Vodafone in
>> partnership with Symantec. (CCTM Certificate Number 2009/09/0052.)
>> This email has been certified virus free.
>> Communications via the GSi may be automatically logged, monitored
>> and/or recorded for legal purposes.
>>
>>
>>

-- 
Annette Greiner
NERSC Data and Analytics Services
Lawrence Berkeley National Laboratory
Received on Tuesday, 22 December 2015 02:14:37 UTC