Re: Subsetting BP

Annette,

+1

From the UX perspective (web browser + smartphone app), other measures
might be:

 * the number of consumer processing steps saved through subsetting
(compared to collecting the entire dataset)

 * the memory footprint of the subset versus the footprint required by
the entire dataset.
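
As a rough illustration only, the two measures above, together with the
10-second rule of thumb Annette mentions below, could be sketched like
this. The Delivery class, the function names, and all of the sizes, step
counts and the 10 Mbit/s bandwidth are hypothetical values chosen for the
example, not figures from either working group.

```python
from dataclasses import dataclass


@dataclass
class Delivery:
    """Hypothetical description of one way of obtaining data."""
    size_bytes: int        # bytes the client has to hold in memory
    processing_steps: int  # consumer-side steps (download, parse, filter, ...)


def footprint_ratio(subset: Delivery, full: Delivery) -> float:
    """Memory footprint of the subset as a fraction of the full dataset."""
    return subset.size_bytes / full.size_bytes


def steps_saved(subset: Delivery, full: Delivery) -> int:
    """Consumer processing steps avoided by using the subset."""
    return full.processing_steps - subset.processing_steps


def transfer_seconds(size_bytes: int, bandwidth_mbit_s: float) -> float:
    """Idealised transfer time, ignoring latency and protocol overhead."""
    return size_bytes * 8 / (bandwidth_mbit_s * 1_000_000)


# Assumed example values: a 2 GB dataset versus a 5 MB subset.
full = Delivery(size_bytes=2_000_000_000, processing_steps=4)
subset = Delivery(size_bytes=5_000_000, processing_steps=2)

WAIT_BUDGET_S = 10        # the UX rule of thumb from the thread
ASSUMED_MBIT_S = 10.0     # assumed consumer-level bandwidth

print("footprint ratio:", footprint_ratio(subset, full))
print("steps saved:", steps_saved(subset, full))
print("subset within budget:",
      transfer_seconds(subset.size_bytes, ASSUMED_MBIT_S) <= WAIT_BUDGET_S)
print("full dataset within budget:",
      transfer_seconds(full.size_bytes, ASSUMED_MBIT_S) <= WAIT_BUDGET_S)
```

The bandwidth figure is the part that would move over time, which is what
lets a test like this keep making sense as networks get faster.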

Kind regards,

Eric S


On Wed, Mar 23, 2016 at 4:45 PM, Annette Greiner <amgreiner@lbl.gov> wrote:

> I like the idea of a test that updates over time like that. I think a goal
> could be to enable a web app to pull down each bit of data within 10
> seconds on a consumer-level network. Ten seconds is a rule of thumb in UX
> circles for what feels like a reasonable time to wait for an app to
> respond, *if* the user is given an indicator that the app is working. That
> would still make sense over time as networks get faster.
> -Annette
>
>
> On 3/23/16 3:33 PM, Phil Archer wrote:
>
>> Thanks Eric,
>>
>> Newton and Bernadette were able to join us and we had a useful discussion
>> about subsetting. The minutes are at
>> https://www.w3.org/2016/03/23-sdwcov-minutes. My understanding was that,
>> as we discussed in our own call earlier, the difficulty is that it is
>> almost impossible to talk about this in the abstract.
>>
>> Jeremy Tandy said: it makes sense for dwbp to provide some advice -- if
>>     you have data that is too big for a web application then
>>     provide a mechanism to get hold of bits of it
>>     ... eg. using predefined slices or an API
>>     ... test by "here is a massive dataset -- can you work with it
>>     in a browser app?"
>>
>> So my understanding - and it is no more than my understanding which may
>> be inaccurate - is that there is agreement on:
>>
>> - bulk download is a BP, meaning, you should make all the data available
>> for download, probably not in real time, for local processing.
>>
>> - If the dataset is large, it's a good idea to make subsets available,
>> which can be done through an API and/or through defining subsets and giving
>> them identifiers.
>>
>> - What that API looks like, or how to construct those URIs is always
>> going to be specific to the dataset.
>>
>> What is not clear is whether we can create a genuine BP around this.
>>
>> Newton (rightly) asks how you can test it. Jeremy suggested - but it was
>> on the hoof and shouldn't be taken as gospel - that a test might be whether
>> the dataset is processable within a browser. Today's browsers can handle
>> around 40MB without breaking into a sweat - 10 years ago, 1 MB might have
>> caused problems, so the test advances with time nicely.
>>
>> IMHO, what Annette wrote is right (or very close to it), and the single
>> bus route example is a good one; but I know we haven't reached a consensus
>> view.
>>
>> We could readily add in another example and could, perhaps, explicitly
>> talk about spatial coverages, payments data, and statistics as examples of
>> datasets that can be very large but for which many applications only ever
>> want a subset.
>>
>> On your question about regular time slots, no, the time is about to
>> change. The switch to DST in the northern hemisphere and away from it in
>> the south means SDW is about to switch time slots. I can advise when the
>> new time has been decided, but it's likely to be between 6 and 8 am your
>> time.
>>
>> Phil.
>>
>> On 23/03/2016 21:04, Eric Stephan wrote:
>>
>>> Phil,
>>>
>>> I just saw this note; thanks for reaching out. It would have been nice to
>>> participate. If this is a recurring meeting time, I'd like to
>>> participate, especially with the DUV activities winding down.
>>>
>>> Kind regards,
>>>
>>> Eric S
>>>
>>> On Wed, Mar 23, 2016 at 9:13 AM, Phil Archer <phila@w3.org> wrote:
>>>
>>>> Just to let DWBP folks know that the Subsetting BP [1] is on the agenda
>>>> for one of the Spatial Data WG's subgroup calls, which takes place at
>>>> 20:00 UTC today (13:00 for Annette and Eric, 20:00 UK, 21:00 CET).
>>>>
>>>> I dare say that Bill Roberts, chair of that subgroup, would be happy for
>>>> anyone in DWBP to join that call. Details at [2].
>>>>
>>>> Legal disclaimer:
>>>> Please note that the SDW WG is run jointly with the OGC and therefore the
>>>> output will be a joint OGC/W3C specification. In addition to the usual W3C
>>>> rules, (almost exactly) the same rules apply for OGC; it's just handled
>>>> differently. See [3].
>>>>
>>>> Phil.
>>>>
>>>> [1] http://w3c.github.io/dwbp/bp.html#EnableDataSubsetting
>>>> [2] https://www.w3.org/2015/spatial/wiki/Meetings:Coverage-Telecon20160323
>>>> [3] https://www.w3.org/2015/spatial/wiki/Patent_Call
>>>>
>>>> --
>>>>
>>>>
>>>> Phil Archer
>>>> W3C Data Activity Lead
>>>> http://www.w3.org/2013/data/
>>>>
>>>> http://philarcher.org
>>>> +44 (0)7887 767755
>>>> @philarcher1
>>>>
>>>>
>>>>
>>>
>>
> --
> Annette Greiner
> NERSC Data and Analytics Services
> Lawrence Berkeley National Laboratory
>
>
>

Received on Thursday, 24 March 2016 02:31:16 UTC