
Fwd: using DCAT for scraped data

From: Phil Archer <phila@w3.org>
Date: Sun, 4 Jun 2017 20:29:57 +0100
To: public-dxwg-comments@w3.org
Cc: Semantic Web IG <semantic-web@w3.org>, Cristiano Longo <longo@dmi.unict.it>
Message-ID: <a5d126ef-79c5-ea7f-eb18-970e8da8247e@w3.org>
Forwarding to the Dataset Exchange WG [1], recently launched, which is 
chartered to work on DCAT. This may be a use case.

Phil

[1] https://www.w3.org/2017/dxwg/

-------- Forwarded Message --------
Subject: using DCAT for scraped data
Resent-Date: Sat, 03 Jun 2017 14:33:25 +0000
Resent-From: semantic-web@w3.org
Date: Sat, 3 Jun 2017 16:32:30 +0200
From: Cristiano Longo <longo@dmi.unict.it>
To: semantic-web@w3.org, Alessio Cimarelli <alessio.cimarelli@gmail.com>

Dear All,

I'm writing from a hackathon at the Open Data Fest 2017
(opendatafest.it) in Sicily. We are building an ontology of Albo POP
(http://albopop.it) using DCAT and its specialization DCAT-AP_IT.
Roughly speaking, an Albo POP is an automated tool that publishes, as an
RSS feed, the notices and announcements of a Public Administration
(usually a municipality) by scraping them from the web site of the
Public Administration itself.

Using DCAT, we model the RSS feed we provide as a distribution, but we
would like to make explicit that the data come from the public
administration. We adopted the following conventions:

a) we reference the notices web page of the public administration using
the source property of the Dublin Core Terms vocabulary, attached to the
dataset;

b) as rights holder, we specify the municipality; and

c) as publisher, we indicate the developer who created the scraper that
converts the notices page to RSS.

An example is attached to this mail.
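
As a rough illustration of points a) to c) (this is not the attached
example), the modelling could be sketched as a handful of triples. The
DCAT and Dublin Core Terms namespaces are the published ones; every
example.org URI, and the helper name to_ntriples, is a hypothetical
placeholder introduced only for this sketch.

```python
# Published namespaces for DCAT, Dublin Core Terms and rdf:type.
DCAT = "http://www.w3.org/ns/dcat#"
DCT = "http://purl.org/dc/terms/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Hypothetical placeholder URIs (assumptions, not taken from the mail).
dataset = "http://example.org/albopop/dataset"
distribution = "http://example.org/albopop/distribution/rss"
feed_url = "http://example.org/albopop/feed.rss"
notices_page = "http://example.org/municipality/notices"
municipality = "http://example.org/municipality"
developer = "http://example.org/developer"

triples = [
    # The scraped notices form a dataset; the RSS feed is its distribution.
    (dataset, RDF_TYPE, DCAT + "Dataset"),
    (dataset, DCAT + "distribution", distribution),
    (distribution, RDF_TYPE, DCAT + "Distribution"),
    (distribution, DCAT + "accessURL", feed_url),
    # a) the notices page of the public administration as dct:source
    (dataset, DCT + "source", notices_page),
    # b) the municipality as dct:rightsHolder
    (dataset, DCT + "rightsHolder", municipality),
    # c) the developer of the scraper as dct:publisher
    (dataset, DCT + "publisher", developer),
]


def to_ntriples(rows):
    """Serialize (subject, predicate, object) URI triples as N-Triples."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in rows)


print(to_ntriples(triples))
```

Here all three properties are attached to the dataset rather than to the
distribution, which is one reading of the description above.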

We would like to know whether this approach can be considered
acceptable. Any suggestions are welcome.

Thanks in advance,

Cristiano Longo






Received on Sunday, 4 June 2017 19:29:50 UTC
