- From: Annette Greiner <amgreiner@lbl.gov>
- Date: Wed, 6 Apr 2016 14:45:47 -0700
- To: public-dwbp-wg@w3.org
- Message-ID: <5705838B.8040503@lbl.gov>
Hi Bernadette,

Please see my comments inline. Thanks for your diligence!
-Annette

On 4/5/16 8:40 PM, Bernadette Farias Lóscio wrote:
>
> Hi all,
>
> I am reviewing the DWBP document and I have some comments/questions
> about the Data Access Section.
>
> @Annette, as you wrote big part of this section, I'd like to kindly
> ask your help with the following comments.
>
> 1. Introduction
>
> I’m not sure if the following paragraph fits in this section:
>
> On a further note, it can be observed that data on the Web is
> essentially about the description of entities identified by a unique,
> Web-based, identifier (an URI). Once the data is dumped and sent to an
> institute specialised in digital preservation the link with the Web is
> broken (dereferencing) but the role of the URI as a unique identifier
> still remains. In order to increase the usability of preserved dataset
> dumps it is relevant to maintain a list of these identifiers.

I agree. I don't think that fits.

>
> 2. BP 19 Provide bulk download
>
> Data or datasets should be available for bulk download? I think the BP
> should refer to datasets instead of data. I think the meaning of bulk
> download should be more clear.

I think "datasets" is fine, as you suggest.

>
> I don’t understand this phrase: “When Web data is distributed across
> many URIs but might logically be organized as one container, accessing
> the data in bulk can be useful." Again, I think the BP should consider
> datasets instead of data.

As I understand it, the idea is that, if you have data that would logically be organized as a dataset but it is spread over multiple endpoints (for example, it's available piecewise through an API or through subsets for download), so that getting a copy of the entire dataset would require multiple requests, that would be a pain in the neck to reassemble as the complete dataset. Since it's referring to the dataset being broken up, "data" makes more sense. Does it help to s/container/dataset/?
>
> I’m not sure if I understood the example. Is one dataset with multiple
> CSV files? or multiple datasets each one with a CSV distribution? The
> bulk download contains one dataset or multiple datasets?

It's probably best to think of it as one dataset with multiple CSV files. The bulk download contains one dataset. But the definition of a dataset is pretty flexible, and one person's dataset is another person's collection or subset, so the term "dataset" can be confusing in this context.

>
> 3. Best Practice 20: Provide Subsets for Large Datasets
>
> In the example, can we use CSV format instead of PDF format?

I was trying to keep it realistic, thinking of what transit agencies really do. I suppose we could use CSV, but it would be less realistic. I think PDF is fine in addition to having an API.

>
> R-Citable is an evidence for this BP?

Having a separate URI for the subset makes the subset citable.

>
> 4. BP 23 Provide data up to date
>
> The description of BP 23 says: “Data must be available in an
> up-to-date manner and the update frequency made explicit. " But the BP
> doesn’t mention how to make the update frequency available. I suggest
> to remove “and the update frequency made explicit" from the description.

Yeah, the update frequency often is not predictable. I do like the idea of reporting the frequency when it is known. If we don't have a recommendation about how to do that, I think we can still suggest that people do it. It looks like DCAT found a way of doing that in machine-readable form [1], though the link resolves to a page that doesn't look very official. If nothing else, one can include a textual statement in the documentation.

>
> 5. BP 25 : Use Web Standards as the foundation of your API"
> Is possible to rewrite the description of the BP to make the text
> smaller? In general, BP descriptions are one or two lines.
>

I agree it's awfully long.
I'd suggest "When designing APIs, use an architectural style that is founded on the technologies of the Web itself." If some people insist that we need to list the technologies, we could say "When designing APIs, use an architectural style that is founded on the technologies of the Web itself, such as URIs, HTTP verbs, HTTP response codes, MIME types, typed HTTP Links, and content negotiation."

> I’m not sure if the example is suitable for this BP. Maybe the example
> needs a better explanation or the BP needs a better example :)

That example shows what makes a hypermedia API a hypermedia API. I would want to keep that but maybe add an example for REST more generally. It's difficult for me to think of a way to show an example of a REST API, though, other than linking to one (possibly https://w3c.github.io/w3c-api/). Or do we want to build and host a little example REST API for the transit agency?

>
> The same for the How to test section: “Check that the service
> avoids using http as a tunnel for calls to custom methods, and check
> that URIs do not contain method names”. I don’t see how this is a test
> about using Web standards.

The way to implement a nonstandard architecture on the web is to hide it within standard calls. Using HTTP as a tunnel for custom methods, rather than using HTTP itself, is symptomatic of treating HTTP as nothing more than a transport mechanism. URIs that contain method names are a dead giveaway that one is inventing new methods rather than using HTTP verbs and URIs.

>
> 6. BP 26: Provide complete documentation for your API
>
> It would be better if the example of this BP should be related with
> the bus stops example.

I agree. Maybe we need to implement an example transit API doc site in Swagger or something. If we want an example as nice as the pet store one, that's not trivial.
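
To make the "URIs do not contain method names" check under BP 25 a bit more concrete, here is a minimal sketch of how one might automate it in Python. It is only a heuristic: the list of suspect verbs and the function name are assumptions made up for illustration, not anything defined in the BP document, and a real check would still need a human to look at the results.

```python
import re
from urllib.parse import urlparse

# Hypothetical list of verbs that often signal an RPC-style method name.
SUSPECT_VERBS = ("get", "create", "update", "delete", "fetch",
                 "set", "list", "add", "remove")

# Match a segment that is a bare verb ("get") or a verb followed by an
# uppercase letter, underscore, or hyphen ("getArrivals", "get_arrivals").
# Ordinary nouns such as "settings" or "address" are not matched.
METHOD_LIKE = re.compile(r"^(?:" + "|".join(SUSPECT_VERBS) + r")(?:[A-Z_\-]|$)")


def method_like_segments(uri):
    """Return the path segments of `uri` that look like method names."""
    path = urlparse(uri).path
    return [seg for seg in path.split("/") if seg and METHOD_LIKE.match(seg)]


if __name__ == "__main__":
    for uri in (
        "http://api.mycitytransit.example.org/arrivals/buses/53/stop/12",
        "http://api.mycitytransit.example.org/getArrivals?bus=53",
    ):
        print(uri, "->", method_like_segments(uri))
```

A resource-style URI yields an empty list, while a path segment such as getArrivals is flagged as a likely custom method tunneled through HTTP.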
>
> I think the following phrases should be on the approach to
> implementation and not on the how to test section: “The quality of
> documentation is also related to usage and feedback from developers.
> Try to get constant feedback from your users about the documentation."

I agree.

>
> 7. BP 27 Avoid Breaking Changes to Your API
>
> The how to test section seems more like an approach to implementation
> than to a test. Is it possible to rewrite?

I disagree. The bit about testing shows how to test that changes to the API do not break it, which is not the same as showing how to implement changes to the API. It is literally how to test it.

>
> It would be great to have an example that also uses the bus stop
> dataset. Maybe the example of BP 27 can be related with the example of
> BP 26.

Maybe we could add something like this:

Suppose the MyCity transit agency's API responds to a request for a certain bus's arrival time at a single station as http://api.mycitytransit.example.org/arrivals/buses/53/stop/12, but the agency decides it wants to make it possible to query for a range of stops at once. Rather than change the form of the request to require a range, like http://api.mycitytransit.example.org/arrivals/buses/53/stop/12-12, the agency can keep the old API call and add a new one for multiple arrivals, like http://api.mycitytransit.example.org/arrivals/buses/53/stops/1-12.

>
> Thanks a lot!
> Bernadette
>
>
> --
> Bernadette Farias Lóscio
> Centro de Informática
> Universidade Federal de Pernambuco - UFPE, Brazil
> ----------------------------------------------------------------------------

[1] https://www.w3.org/TR/vocab-dcat/
"In order to express frequency of update in the example above, we chose to use an instance from the Content-Oriented Guidelines <http://www.w3.org/TR/vocab-data-cube/#dsd-cog> developed as part of the W3C Data Cube Vocabulary efforts."

--
Annette Greiner
NERSC Data and Analytics Services
Lawrence Berkeley National Laboratory
Received on Wednesday, 6 April 2016 21:46:18 UTC