Re: About BP Provide Subsets for Large Datasets

Hi Annette,

I just made the updates to the BP Provide Subsets for Large Datasets. Please
let me know if this is OK with you.

New example:

The MyCity transit agency has been collecting detailed data about passenger
usage for several years. This is a very large dataset, containing values
for numbers of passengers by transit type, route, vehicle, driver, entry
stop, exit stop, transit pass type, entry time, etc. They have found that a
wide variety of stakeholders are interested in downloading various subsets
of the data. The folks who run each transit system want only the data for
their transit mode, the city planners only want the numbers of entries and
exits at each stop, the city budget office wants only the numbers for the
various types of passes sold, and others want still different subsets. The
agency created a Web site where users can select which variables are of
interest to them, set ranges on some variables, and download only the
subset they need.
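
To make the example concrete, here is a minimal sketch of the kind of selection such a site might perform. It is only an illustration: the field names (transit_type, entry_stop, exit_stop, entry_hour) and the function select_subset are hypothetical and do not come from the BP text.

```python
def select_subset(rows, variables, ranges=None):
    """Return only the chosen variables for rows that fall inside the given ranges."""
    ranges = ranges or {}
    subset = []
    for row in rows:
        # Keep the row only if every ranged variable lies within [low, high].
        if all(low <= row[name] <= high for name, (low, high) in ranges.items()):
            subset.append({name: row[name] for name in variables})
    return subset


# Hypothetical sample records; the field names are assumptions for this sketch.
rows = [
    {"transit_type": "bus", "entry_stop": "A", "exit_stop": "C", "entry_hour": 8},
    {"transit_type": "metro", "entry_stop": "B", "exit_stop": "D", "entry_hour": 17},
    {"transit_type": "bus", "entry_stop": "C", "exit_stop": "A", "entry_hour": 22},
]

# A planner asks only for entry and exit stops, for trips starting between 6 and 18.
planner_view = select_subset(rows, ["entry_stop", "exit_stop"], {"entry_hour": (6, 18)})
```

A real service would more likely apply the same selection on the server side, so that only the requested subset is transferred.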

New How to test:

Check that subsets of the dataset can be recovered by making smaller requests
or by downloading smaller units.
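
One possible way to automate this check is sketched below, assuming the subsets are exposed as fixed-size pages. The function fetch_subset is a hypothetical stand-in for one smaller request; a real test would call the publisher's actual interface instead.

```python
def fetch_subset(dataset, offset, limit):
    """Stand-in for one smaller request: at most `limit` records starting at `offset`."""
    return dataset[offset:offset + limit]


def check_subsets(dataset, limit):
    """Check that each smaller request returns a non-empty part of the dataset."""
    for offset in range(0, len(dataset), limit):
        part = fetch_subset(dataset, offset, limit)
        # Each response must be non-empty and no larger than the requested unit.
        if not part or len(part) > limit:
            return False
        # Every returned record must belong to the full dataset.
        if any(record not in dataset for record in part):
            return False
    return True


# Hypothetical sample data; the field names are assumptions for this sketch.
dataset = [{"route": n, "passengers": n * 10} for n in range(1, 8)]
ok = check_subsets(dataset, limit=3)
```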


2016-05-04 12:07 GMT-03:00 Bernadette Farias Lóscio <>:

> Hi Annette,
> We are resolving the final comments on the DWBP [1] doc and I'd like to
> discuss with you the following ones related to BP Provide Subsets for Large
> Datasets:
> comment 76:
> How to test should say something about all the subsets adding up to the
> complete set. Didn't we have a test before that the entire dataset can be
> recovered by making a series of smaller requests? I think we had a note
> that coming up with use cases isn't deterministic enough.
> I made the following proposal based on your comment. What do you think?
> Check that the entire dataset can be recovered by making a series of
> smaller requests.
> comment 41:
> Phil is not sure about including an example of making a set of PDFs
> available and I agree with him. I know that you think that it is not realistic
> to use CSV, but I think it is not good to use PDF.
> If you don't have another example to replace this one, can we use CSV
> instead of PDF?
> Thanks!
> Berna
> [1]
> --
> Bernadette Farias Lóscio
> Centro de Informática
> Universidade Federal de Pernambuco - UFPE, Brazil
> ----------------------------------------------------------------------------

Bernadette Farias Lóscio
Centro de Informática
Universidade Federal de Pernambuco - UFPE, Brazil

Received on Wednesday, 4 May 2016 21:07:09 UTC