Re: Catalog vocab question: Indicating whether "government" dataset has "official" status

Thank you very much to those who have responded (thus far...) to my
question. A couple of notes:

Martin asks:
> why is marking a dataset as 'official' so important out of your opinion - and/or why for DCAT - as there is for sure information about the publisher & a license given within the metadata - this should be enough basic provenance information (for sure it would be better to have comprehensive provenance info here) to find out / evaluate e.g. the quality of a dataset et al....furthermore by mapping / linking the publisher info with a publisher directory of another source that includes 'official status information' could solve the problem

* Having unambiguous URIs for (a) dct:publisher and (b) dct:license
might be a start, *if* we could rely on (a) registries that classify
publishers and (b) licenses expressed as Linked Data, and if those
sources actually contained the answer.
* A use case might be a harvester that is de-referencing catalog URIs
(as if they existed...) and dataset URIs for the purposes of
aggregation.
* With so many ways to record provenance, and many providers *not*
doing so, I'm concerned about relying on inference based on these
records to determine the "official" status of datasets.
* Another way to answer the question, or an additional piece of
information, might be to indicate whether the catalog and/or dataset
is "Authoritative." <http://data.soton.ac.uk> usefully does this; as I
understand it, their working definition is whether the dataset is an
ad hoc creation or a by-product of an official system.

This can get into nit-picking; for example, catalog and dataset
metadata that is scraped (non-authoritative) from a government site
(authoritative, official) because the site doesn't provide the
metadata itself...
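
To make the registry idea concrete, here is a minimal sketch of what a
harvester might do, assuming a publisher registry existed. The registry,
its URIs, its classification labels, and the official_status() helper
are all hypothetical; nothing like this is defined in DCAT.

```python
# Hypothetical registry: publisher URI -> classification.
# The URIs and labels are illustrative assumptions, not real identifiers.
PUBLISHER_REGISTRY = {
    "http://example.org/org/statistics-office": "government",
    "http://example.org/org/community-group": "non-government",
}


def official_status(dataset, registry=PUBLISHER_REGISTRY):
    """Guess a dataset's status from its dct:publisher URI.

    `dataset` is a plain dict of harvested metadata keyed by property name.
    Returns "official", "non-government", or "unknown".
    """
    publisher = dataset.get("dct:publisher")
    if publisher is None:
        # Many providers do not record a publisher at all.
        return "unknown"
    classification = registry.get(publisher)
    if classification is None:
        # The publisher has a URI, but no registry classifies it.
        return "unknown"
    return "official" if classification == "government" else "non-government"


harvested = {"dct:publisher": "http://example.org/org/statistics-office"}
print(official_status(harvested))
```

Note how often the answer would be "unknown" in practice, and that the
lookup says nothing about whether the metadata itself was scraped or
came from the publisher.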

John

David notes:
> Since the UN can publish drafts as well as certified datasets, it seems like this requires at least a classification of organizations that publish linked data and a classification of individual datasets, and perhaps a third being the classification of catalogs themselves although not sure how useful that is unless some aggregators are not trusted...

* So I think this sort of "certification" of at least
publishers/providers would work for particular kinds of certification,
e.g. some office of the UN denoting a country's "official" provider.
One could even imagine how delegation would work.
* David's proposed system of "levels of approval" would work,
especially if the relying service checked where assertions came from.
* To be consistent with the Open World assumption, there also needs to
be a way to express status in a decentralized way. It's up to the
policies governing the relying system how to use the assertions that
it finds...
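
As a rough sketch of the last two points, a relying system could keep
each status assertion together with its source and apply its own
ordered list of trusted sources. The Assertion type, the status labels,
and the example URIs below are assumptions made for illustration only.

```python
from typing import NamedTuple


class Assertion(NamedTuple):
    subject: str      # URI of the dataset (or publisher) being described
    status: str       # e.g. "certified", "draft", "official"
    asserted_by: str  # URI of whoever made the claim


def resolve_status(subject, assertions, trusted_sources):
    """Return the status claimed by the most-preferred trusted source.

    `trusted_sources` is the relying system's own policy: an ordered list
    of source URIs, most preferred first. Assertions from anyone else are
    ignored. Returns None if no trusted source says anything.
    """
    for source in trusted_sources:
        for assertion in assertions:
            if assertion.subject == subject and assertion.asserted_by == source:
                return assertion.status
    return None


found = [
    Assertion("http://example.org/dataset/1", "official",
              "http://example.org/org/blog"),
    Assertion("http://example.org/dataset/1", "draft",
              "http://example.org/org/un-office"),
]
policy = ["http://example.org/org/un-office", "http://example.org/org/blog"]
print(resolve_status("http://example.org/dataset/1", found, policy))
```

Anyone can publish an assertion; two relying systems with different
policies may reach different answers from the same set of assertions.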

John

On Fri, Sep 9, 2011 at 6:33 AM, David Price <dprice@topquadrant.com> wrote:
> Some organizations have more authority than others - so UK Home Office might
> publish linked data, but so might ISO or the UN or TopQuadrant. Some
> datasets have different levels of 'official-dom' - drafts vs. recommended
> for certain uses vs. certified as accurate and complete. Since the UN can
> publish drafts as well as certified datasets, it seems like this requires at
> least a classification of organizations that publish linked data and a
> classification of individual datasets, and perhaps a third being the
> classification of catalogs themselves although not sure how useful that is
> unless some aggregators are not trusted.
>
> At least one use case I've seen is that in some large organizations, when
> starting a new programme they select resources to use based on a preferred
> sequence of authorities and levels of approval (i.e. ISO International
> Standards, and if not available W3C Recommendations, and if not available
> ISO Technical Specifications, and if not available UK government agencies,
> and if not available ...). I know this use case is applicable to
> organizations as diverse as ISO in deciding normative references when making
> standards and in US DOD when approving resources for a new equipment or
> research programme.
>
> Cheers,
> David
>
> On 9/8/2011 10:33 AM, Richard Cyganiak wrote:
>>
>> On 6 Sep 2011, at 21:27, John Erickson wrote:
>>>
>>> Questions have arisen as to how to indicate the "official" status of a
>>> catalog and/or individual dataset. For example, there are a large
>>> number of datasets that are the only source of data for a country but
>>> are "Non-government." No properties in DCAT [1] or our own prototype
>>> [2] express this adequately. This is important because consumers of
>>> catalog metadata must be able to determine whether a source has
>>> official status or not...
>>
>> You use scare quotes around the words “official” and “non-government”.
>>
>> Can you give a better definition of the distinction you're drawing?
>>
>> What's the use case for this?
>>
>> Best,
>> Richard
>>
>
>
> --
> Managing Director and Consultant
> TopQuadrant Limited. Registered in England No. 05614307
> UK +44 7788 561308
> US +1 336-283-0606



-- 
John S. Erickson, Ph.D.
Dir, Web Science Ops, Tetherless World Constellation (RPI)
<http://tw.rpi.edu>
olyerickson@gmail.com
Twitter: @olyerickson
Skype: @olyerickson

Received on Friday, 9 September 2011 18:17:57 UTC