RE: action-152: "subsetting"

From: Linda van den Brink <l.vandenbrink@geonovum.nl>
Date: Fri, 1 Apr 2016 06:49:47 +0000
To: Kerry Taylor <kerry.taylor@anu.edu.au>, Maik Riechert <m.riechert@reading.ac.uk>, Bill Roberts <bill@swirrl.com>
CC: SDW WG Public List <public-sdw-wg@w3.org>
Message-ID: <13F9BF0BE056DA42BFE5AA6E476CDEFE725C5AD7@GNMSRV01.gnm.local>
I don't care that much, but I like 'extract' better than subsetting.

From: Kerry Taylor [mailto:kerry.taylor@anu.edu.au]
Sent: Friday, 1 April 2016 04:59
To: Maik Riechert; Bill Roberts
CC: SDW WG Public List
Subject: RE: action-152: "subsetting"

Thanks for your comments, Maik and Bill!

I could live with "filter" too, although it carries a notion of dynamic behaviour. It is also a big improvement over "subsetting"!

As neither of you seems particularly concerned, and there is no other comment so far, maybe this is a non-issue.

Does anyone else care? If not, I can close the action and drop my objection to "subsetting".

Kerry

From: Maik Riechert [mailto:m.riechert@reading.ac.uk]
Sent: Thursday, 31 March 2016 8:57 PM
To: Bill Roberts <bill@swirrl.com>; Kerry Taylor <kerry.taylor@anu.edu.au>
Cc: SDW WG Public List <public-sdw-wg@w3.org>
Subject: Re: action-152: "subsetting"

I think "extract" is sometimes more natural when speaking about it (says the German guy...), for example:
You can extract a vertical slice from a 4D grid coverage.
vs
You can subset a 4D grid coverage to a vertical slice.

Personally, I always use "subset", simply because you have to use something and "subset" is not overloaded that much.

Having said that, there are also collections of coverages. And in that case I usually speak of a filtered collection when I select coverages according to some criteria. But this is really just because collection filtering is an established term elsewhere. And of course you could also filter a satellite image so that it only includes the parts within a bounding box (a min max filter on latitude/longitude for example).

So, extract, subset, filter, it's all the same to me really. It's just that in some sentences/contexts one or the other sounds better because it is either more common or more natural. I agree though that "subset" is not common in the webby world, and I would say that "extract" is more associated with file unzipping.

Maik
On 29.03.2016 at 13:22, Bill Roberts wrote:
Hi Kerry

I find the notion of subsets of datasets a reasonable one. I acknowledge that 'subsetting' is a relatively ugly neologism (though there are a lot worse made-up words in use in the world of technology!). But I'd be happy to use your suggested alternative of 'extract' and 'extracting'.

Cheers

Bill



On 29 March 2016 at 13:10, Kerry Taylor <kerry.taylor@anu.edu.au> wrote:
I have an objection to the use of the word "subsetting", prominent in the spatial community and leaking also into other "big data" technology discussions. It seems to have some heritage in the statistical community, too.
I partly dislike it because it is not a word, but also because the notion of a 'subset' feels wrong, as it treats a 'dataset' as an unstructured 'set' of things, whereas this is very rarely the case when "subsetting" is required.
The formal (and widely understood) mathematical notion of sets seems inappropriate.
Normally, the known structure is a very important part of the "subsetting" operation.

I do not think that "subsetting" carries the intended meaning to the audience for whom our writing is directed, at least not to the "webby but not spatial expert" audience. I note that (probably due to our influence) DWBP is now also speaking of 'subsetting'.

I have some suggested alternatives I raise for consideration by the SDW, ordered best-first in my opinion.

Noun       Verb
extract    extracting
snippet    snipping
selection  selecting
snip       snipping


--Kerry
Received on Friday, 1 April 2016 06:50:19 UTC
