- From: Pieter Colpaert <pieter.colpaert@ugent.be>
- Date: Tue, 22 Dec 2015 10:06:08 +0100
- To: public-dwbp-wg@w3.org
Hi Annette,

Very interesting view, which is very close to the (Linked) Data
Fragments (LDF) axis idea [1,2]. It sets the two extremes, data dumps
vs. query API services, out on an axis. The axis indicates that there
are various options in between these two extremes.

The left-hand side could be seen as "data publishing", where a dataset
can be fragmented into different subsets. The more the dataset is
fragmented, the more expressive we could say the interface becomes, and
the more we're moving to the right on the axis. I believe this way of
thinking is perfect for publishing data for maximized reuse, over
publishing a service for maximum expressiveness.

The right-hand side could be seen as "data services", where an HTTP
service exposes all query functionality. This becomes tedious to keep
highly available, as on the open Web you cannot predict the type of
queries you'll receive. The query interface will be limited in
expressiveness, moving to the left on the axis, to make sure it can
serve specific use cases. I believe this way of thinking is perfect for
data services that need to provide functionality to apps.

When describing best practices for data on the Web, I'd only describe
the left-hand side: more expressive servers by fragmenting datasets.
This way, high availability, maximized reuse and the REST principles
come first.

[1] Standard LDF axis: https://speakerdeck.com/pietercolpaert/using-open-data-to-promote-data-innovation-1?slide=12
[2] LDF axis for transport data: https://speakerdeck.com/pietercolpaert/linked-connections?slide=15

Kind regards,

Pieter

On 22-12-15 03:13, Annette Greiner wrote:
> Hi Peter,
> Your point about APIs enabling some restriction in what users have
> access to is a good one, and I completely agree with it. That's what I
> am referring to when I talk about subsetting data. I see that as an
> important part of what makes using an API worthwhile.
> Your text seemed to be saying that an API was a way to avoid
> subsetting, which may have just been an ambiguity in the phrasing.
>
> Regarding simplicity of use, I admit that in many circumstances using
> an API can be simpler than using an alternative, but that is not
> always the case, and I don't think it's the true advantage of an API.
> The true value is that it is an intentionally built programming
> interface. One needn't rely on brittle scraping or complex workflows
> that involve downloading and parsing through unnecessary data to force
> it to interface with the rest of one's code.
>
> My concern with the question of how difficult it is to set up an API
> is that we remain realistic about the effort involved. Most people
> will be comparing the option of setting up an API against the option
> of simply posting datasets for download as files. The latter is
> definitely easier to do, so it would not be accurate to offer building
> an API as an easier option. Admittedly, if the infrastructure is
> already in place, enabling an API in a data management system can be
> pretty easy, though setting it up correctly and documenting things
> takes a bit of work and an understanding of what's going on under the
> hood. Compared to copying a file into a directory, it's a step up.
>
> cheers,
> -Annette
>
> On 12/21/15 4:53 AM, Peter.Winstanley@gov.scot wrote:
>> Hi Annette
>>
>> re: resource-intensive queries
>> When trying to maintain a quality of
>> service one might want to prevent access to specific sets of queries
>> that would involve significant table scans or high memory
>> consumption, and the use of an API as an alternative to e.g. an open
>> sparql endpoint is a way of constraining the query options so as to
>> protect the overall service
>>
>> The simplicity angle is an important one insofar as it is part of the
>> democratisation of access to data on the web. If we can simplify the
>> process of accessing datastores (e.g.
>> through APIs) then a wider
>> range of people will begin to make data-driven applications etc.
>>
>> The same applies to the 'elementary programming' bit. If we don't
>> let people know that it is not rocket science to provide a simple API
>> to some simple data sets then they may body-swerve a useful
>> additional bit of work. Many people who commission work that gets
>> data onto the web are driven by the need to show a website and not by
>> the need to provide an API. We need to use the BPs paper as an
>> opportunity not only to give guidance on "what" and "why" but also
>> some insight into the challenges of "how" and to help people overcome
>> any inertia preventing adoption.
>>
>> The re-working was undertaken simply because in the meeting it was
>> one of the elements of the document that was identified as being
>> incomplete and in need of some work.
>>
>> Peter
>>
>> -----Original Message-----
>> From: Annette Greiner [mailto:amgreiner@lbl.gov]
>> Sent: 14 December 2015 21:24
>> To: public-dwbp-wg@w3.org
>> Subject: Re: Fwd: Best Practice 26.docx
>>
>> Peter,
>> Thanks for working to improve this.
>>
>> While I like the idea of explaining what an API is for those who may be
>> less familiar, we should be careful about how we define it. The main
>> alternatives to an API for web developers are downloads and scraping,
>> which are actually pretty simple but tedious approaches. I think the
>> value of an API for web development is not so much a matter of greater
>> simplicity but in having actual programmatic access, or hooks into the
>> data. The point is that an API is designed to explicitly enable
>> programming, whereas reusing without that requires grabbing more than
>> you want and munging the data. The last sentence of the first paragraph
>> suggests that REST is the only way to make an API, which is not the
>> case. Let's leave that argument out of this BP, as it's handled
>> elsewhere.
>>
>> The second paragraph now reiterates the simplicity concept, which I
>> don't think is accurate or particularly helpful. As for protecting
>> against resource-intensive subsetting, I'm not sure what you mean. The
>> alternatives to using an API are not about subsetting and are not
>> particularly resource intensive; subsetting is actually a virtue of
>> using an API, because it allows one to download only the data needed
>> (something I've been pushing for a BP about for a long time, BTW).
>> Regarding other transport protocols than HTTP, I'm not sure what that
>> has to do with the intended outcome.
>>
>> As for the third paragraph, again, I don't think we should get into the
>> how-to-implement-REST discussion here. There is another BP for that.
>> Also, the suggestion that creating a web API for relational data is
>> "elementary programming" whereas RDF "can be provided with more
>> sophisticated APIs" strikes me as potentially a bit insulting to devs
>> who work with relational data.
>>
>> I'm curious what the goal of this reworking was. Perhaps we can find
>> other ways to address the underlying issues.
>> -Annette
>>
>> On 12/11/15 7:08 AM, Phil Archer wrote:
>>> This should be in the mail archive (Peter used an alternative e-mail
>>> address which is why it bounced)
>>>
>>> -------- Forwarded Message --------
>>> Subject: Best Practice 26.docx
>>> Date: Fri, 11 Dec 2015 14:43:55 +0000
>>> From: Peter.Winstanley@gov.scot
>>> To: public-dwbp-wg@w3.org
>>> CC: phila@w3.org, laufer@globo.com
>>>
>>> I have tried to make some steps to improve the BP #26 from
>>> http://w3c.github.io/dwbp/bp.html#useanAPI
>>>
>>> Hope it is a helpful move. If you think the direction is right then
>>> let me know and I'll complete.
>>>
>>> Peter

--
+32486747122
Linked Open Transport Data researcher
UGent - MMLab - iMinds
Board of Directors Open Knowledge Belgium
http://openknowledge.be
Open Transport working group coordinator at Open Knowledge International
http://transport.okfn.org
Received on Tuesday, 22 December 2015 09:06:40 UTC