RE: ISSUE-36 (Kill Radion?): Should RADion be killed off? [DCAT]

Phil,

Given my definition of data set and your definition of semantic asset, I would say a semantic asset is definitely a data set.

I am not quite clear how the PDF example refers back to whether a semantic asset is a data set.  However, if I create a PDF with the following content - The US unemployment rate for August 2012 is 8.1% - how is that file not both?

The problem could be that what we as humans recognize as content in each case is not immediately retrievable by computer - likely not in the case of a PDF.  But whether some data are machine accessible or not doesn't alter their fundamental character.

Yours,
Dan



-----Original Message-----
From: Phil Archer [mailto:phila@w3.org] 
Sent: Thursday, September 27, 2012 12:29 PM
To: Gillman, Daniel - BLS
Cc: public-gld-wg@w3.org
Subject: Re: ISSUE-36 (Kill Radion?): Should RADion be killed off? [DCAT]



On 27/09/2012 16:56, Gillman, Daniel - BLS wrote:
> Dave et al,
>
> I don't know if this issue of data sets was resolved to everyone's satisfaction, but I can put in my 2 pence:
>
> Any file stored on disk is a data set.  As long as it contains bits and bytes, even those that are to be interpreted as instructions, it is a data set in some sense.  So we can only distinguish data sets by what we want to do with the data inside, and that, I feel, boils down to the operations we can perform on them.  These are provided by the datatype and MIME type for the data in question.
>
> So, a file intended for human reading, say a text file created in MS-Word, will have a simple datatype (string?) that allows for very few operations.  A statistical data set might have a combination of categorical and quantitative data (in columns).  Finally, a bit map of a photograph taken on Mars will be interpretable by a photo reader or some other software.

I can't disagree with you directly, Dan, but your definition of dataset sounds close to what I think of as an Information Resource - basically anything that can be sent over the wire (and please no, let us not descend into another HR14 thread!!)

DCAT defines Dataset as "A collection of data, published or curated by a single source, and available for access or download in one or more formats."

The EU defines a semantic asset thus: "a semantic interoperability asset as highly reusable metadata (e.g. xml schemata, generic data models) and reference data (e.g. code lists, taxonomies, dictionaries, vocabularies) which is used for eGovernment system development."

So to take your angle we should define adms:SemanticAsset as a subclass of dcat:Dataset (which is what Dave said originally). However, if I download a Dataset and find it's a PDF, well, then we're getting into the whole discussion about whether publishing a PDF counts as publishing data.

It's clear that some things are both a Dataset and a Semantic Asset, but there is a distinction between the two. IMO, the real question is whether making that distinction is useful to anyone. If not, RADion goes. If so, well, we should probably keep it.

Phil.



>
> Dan Gillman
> Bureau of Labor Statistics
> Office of Survey Methods Research
> 2 Massachusetts Ave, NE
> Washington, DC 20212 USA
> Tel     +1.202.691.7523
> FAX    +1.202.691.7426
> Email  Gillman.Daniel@BLS.Gov
> -----------------------------------------
> "Whatever it is, I'm against it!
> No matter what it is or who commenced it, I'm against it!"
> ~ Groucho Marx
> ------------------------------------------
>
>
>
> -----Original Message-----
> From: Dave Reynolds [mailto:dave.e.reynolds@gmail.com]
> Sent: Thursday, September 27, 2012 10:00 AM
> To: Phil Archer
> Cc: public-gld-wg@w3.org
> Subject: Re: ISSUE-36 (Kill Radion?): Should RADion be killed off? [DCAT]
>
> On 27/09/12 14:43, Phil Archer wrote:
>>
>>
>> On 27/09/2012 14:27, Dave Reynolds wrote:
>> [..]
>>
>>> OK. What about the other way round. Are all datasets semantic assets?
>>>
>>
>> Well, again, it depends on the dataset. I can think of datasets that
>> could be considered as semantic assets such as NAPTAN and the
>> Companies House data. These are used as reference points within other datasets.
>> But Suffolk County Council spending data for 2012 Q1... ?
>
> Sure it is, if I'm using that to make a decision and want to refer to it as an officially sanctioned asset I can use for that decision making.
>
> There's an underlying issue here that the notion of "semantic asset" is not intrinsic to the thing. What makes something a semantic asset is that some authority wants to declare it as such by putting it in an approved repository. But one person's semantic asset is another person's useless pile of insufficiently structured irrelevant data.  So it's really a relationship, not a class.
>
> That said, it sounds like there are enough differences between what dcat wants to talk about and what adms wants to talk about that radion has a role at least as an articulation of a mapping.
>
> Dave
>
>

-- 


Phil Archer
W3C eGovernment
http://www.w3.org/egov/


http://philarcher.org

+44 (0)7887 767755
@philarcher1

Received on Thursday, 27 September 2012 19:22:34 UTC