Re: Organizing Use Cases for F2F / Agenda proposal from Riccardo Albertoni on 2017-07-13 (public-dxwg-wg@w3.org from July 2017)

From: Riccardo Albertoni <albertoni@ge.imati.cnr.it>
Date: Thu, 13 Jul 2017 12:26:17 +0200
To: Makx Dekkers <mail@makxdekkers.com>
Cc: kcoyle@kcoyle.net, Jaroslav Pullmann <jaroslav.pullmann@fit.fraunhofer.de>, Dataset Exchange Working Group <public-dxwg-wg@w3.org>
Message-ID: <CAOHhXmSV9pemusD0oeszVqsi68X=Tx5jaMDhd-VQPc9c37ZV1A@mail.gmail.com>
Dear Makx and all,

The following SPARQL query should help with the analysis you are suggesting
(see http://yasgui.org/short/Hy8HtaVS- for a nicer view). For each property,
it counts the distinct DCAT entities (Datasets, Distributions, Catalogs and
CatalogRecords) that use it.

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dcat: <http://www.w3.org/ns/dcat#>
PREFIX dct: <http://purl.org/dc/terms/>

SELECT ?prop (COUNT(DISTINCT ?sub) AS ?numOfDCATEntityUsingTheProperty)
WHERE {
  {
    ?sub a dcat:Dataset ;
         ?prop ?obj .
  } UNION {
    ?sub a dcat:Distribution ;
         ?prop ?obj .
  } UNION {
    ?sub a dcat:Catalog ;
         ?prop ?obj .
  } UNION {
    ?sub a dcat:CatalogRecord ;
         ?prop ?obj .
  }
}
GROUP BY ?prop
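
As a side note, the four UNION branches differ only in the class, so the
same count can probably be written more compactly with a VALUES clause.
This is only a sketch, assuming the endpoint supports SPARQL 1.1; the
variable ?class and the ORDER BY are additions that are not part of the
query above.

```sparql
PREFIX dcat: <http://www.w3.org/ns/dcat#>

# Count, per property, the distinct subjects typed with one of the four
# DCAT classes. ?class is a helper variable introduced for this sketch.
SELECT ?prop (COUNT(DISTINCT ?sub) AS ?numOfDCATEntityUsingTheProperty)
WHERE {
  VALUES ?class { dcat:Dataset dcat:Distribution dcat:Catalog dcat:CatalogRecord }
  ?sub a ?class ;
       ?prop ?obj .
}
GROUP BY ?prop
# Most used properties first (ordering is an addition, not in the original).
ORDER BY DESC(?numOfDCATEntityUsingTheProperty)
```

Because the count is over distinct ?sub, a subject typed with more than one
of the classes is still counted once per property, as in the UNION form.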

For example, running the above query against the European Data Portal
SPARQL endpoint (https://www.europeandataportal.eu/sparql-manager/en/)
returns the following results:

prop,numOfDCATEntityUsingTheProperty
http://purl.org/dc/terms/conformsTo,75482
http://purl.org/dc/terms/provenance,414727
http://www.w3.org/ns/adms#identifier,749805
http://www.w3.org/ns/dcat#themeTaxonomy,79
http://xmlns.com/foaf/0.1/primaryTopic,749805
http://purl.org/dc/terms/temporal,99731
http://www.w3.org/ns/adms#status,749805
http://www.w3.org/ns/dcat#byteSize,13568
http://spdx.org/rdf/terms#checksum,6507
http://purl.org/dc/terms/publisher,309489
http://purl.org/dc/terms/language,418733
http://www.w3.org/ns/dcat#theme,453902
http://purl.org/dc/terms/relation,1
http://www.w3.org/1999/02/22-rdf-syntax-ns#type,2418170
http://purl.org/dc/terms/modified,1367877
http://purl.org/dc/terms/format,105731
http://purl.org/dc/terms/issued,1143662
http://xmlns.com/foaf/0.1/homepage,18
http://xmlns.com/foaf/0.1/page,74839
http://www.w3.org/ns/dcat#mediaType,240863
http://www.w3.org/ns/dcat#accessURL,909707
http://purl.org/dc/terms/accrualPeriodicity,342936
http://www.w3.org/ns/dcat#distribution,367600
http://purl.org/dc/terms/spatial,527782
http://purl.org/dc/terms/rights,67717
http://purl.org/dc/terms/description,1256889
http://www.w3.org/ns/dcat#contactPoint,332807
http://www.w3.org/ns/dcat#keyword,701558
http://www.w3.org/ns/dcat#landingPage,281865
http://www.w3.org/ns/dcat#downloadURL,239767
http://purl.org/dc/terms/title,1457491
http://www.w3.org/ns/dcat#record,77
http://purl.org/dc/terms/license,253977
http://purl.org/dc/terms/identifier,749707
http://www.w3.org/ns/dcat#dataset,75
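
These counts do not say which DCAT class the entities belong to. A possible
variant, only a sketch assuming SPARQL 1.1 support (?class and
?numOfEntities are names introduced here for illustration), groups by class
as well. That may help to judge whether, for example, CatalogRecord or theme
is seldom used.

```sparql
PREFIX dcat: <http://www.w3.org/ns/dcat#>

# Per class and property, count the distinct entities using that property.
SELECT ?class ?prop (COUNT(DISTINCT ?sub) AS ?numOfEntities)
WHERE {
  VALUES ?class { dcat:Dataset dcat:Distribution dcat:Catalog dcat:CatalogRecord }
  ?sub a ?class ;
       ?prop ?obj .
}
GROUP BY ?class ?prop
# Within each class, list the most used properties first.
ORDER BY ?class DESC(?numOfEntities)
```

The rdf:type row for each class then gives the number of entities of that
class, which can serve as a denominator when comparing property usage.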

Cheers,
Riccardo

On 13 July 2017 at 11:11, Makx Dekkers <mail@makxdekkers.com> wrote:

> Jaroslav, Karen,
>
> It is indeed good to consider actual usage of DCAT in real life.
>
> From my point of view, it would be really interesting to have a
> statistical analysis on the use of the various elements of DCAT. There are
> large collections of DCAT data available for analysis -- for example
> https://www.europeandataportal.eu/ makes available descriptions of
> three-quarters of a million datasets in DCAT through their SPARQL endpoint.
> It's just a matter of someone having the time and resources to do such an
> analysis.
>
> For a list of countries and organisations in Europe that have DCAT
> profiles already in operation, see
> https://ec.europa.eu/isa2/solutions/dcat-application-profile-data-portals-europe_en.
>
> Makx.
>
>
> -----Original Message-----
> From: Karen Coyle [mailto:kcoyle@kcoyle.net]
> Sent: 13 July 2017 00:45
> To: Jaroslav Pullmann <jaroslav.pullmann@fit.fraunhofer.de>
> Cc: public-dxwg-wg@w3.org
> Subject: Re: Organizing Use Cases for F2F / Agenda proposal
>
> Thanks, Jaroslav,
>
> You may have seen that Caroline and I have put together an agenda for the
> meeting. Unfortunately I begin traveling tomorrow so she and I will
> probably not meet again before Monday, but we will compare your list with
> ours and we can always make adjustments as we discuss during the F2F. I
> think that the categorizing of the use cases into groups is very useful,
> and seeing them from different points of view also helps.
>
> As for the introductory part that you propose, some of that may be best
> suited to the DCAT 1.1 sub-group, which hopefully will begin meeting soon.
> It does seem logical that a first step would be to survey the uses of DCAT
> and the DCAT-related APs that exist today. I would definitely encourage the
> sub-group to get started on such an analysis as soon as possible. That
> group, of course, can also come back with new use cases and requirements to
> propose to the "plenary" working group.
>
> Let's try to use some of our time (including coffee breaks) to get some
> action going in the sub-groups that are forming.
>
> kc
> p.s. From my Dublin Core perspective I totally agree with your "simpler is
> better" hypothesis. For DC I did some work similar to what you propose:
> * https://kcoyle.blogspot.com/2013/10/who-uses-dublin-core-original-15.html
> * https://kcoyle.blogspot.com/2013/10/who-uses-dublin-core-dcterms.html
> * https://kcoyle.blogspot.com/2013/10/dublin-core-usage-in-lod.html
>
> That last one seems to prove your point.
>
> On 7/12/17 3:17 PM, Jaroslav Pullmann wrote:
> >
> >   Dear Karen, dear all
> >
> >   Within an introductory part we may want to summarize the status quo.
> >   Some questions to look at in session 1) are:
> >
> >      - How is DCAT currently used?
> >      - Do data portals, catalogs etc. use the DCAT core model or rely on a
> >        specific application profile?
> >      - What is the average complexity of DCAT Dataset descriptions?
> >      - Are some partitions of the vocabulary seldom used or not used at all
> >        (e.g. CatalogRecord, theme)? This will inform us about deprecation
> >        candidates.
> >      - What user/service interfaces and search options are available in
> >        order to exploit the DCAT potential?
> >
> >   Would the above support the hypothesis that the main and obvious
> >   benefit of DCAT is its simplicity and brevity where, in the end, only a
> >   subset of the vocabulary is used? If this is the case, does it imply
> >   that we should be conservative in extending the core model and add only
> >   a few well-argued properties while substantially extending guidance on
> >   using the existing ones?
> >
> >  Afterwards I suggest sorting and discussing the UCs from a
> > user/task-oriented perspective, i.e. how the various
> > stakeholders would make use of the DCAT concepts in order to perform
> > their task (describe/publish/find/retrieve a data set etc.).
> >
> >  *Catalog*
> >
> >    Some motivating questions:
> >
> >     - How are the Catalogs found at all, e.g. using Web search engines
> >       that evaluate RDFa metadata?
> >     - How does DCAT support a data publisher in identifying an appropriate
> >       (specialized) Catalog in which to publish her data?
> >     - E.g. are Concept schemes available/used as a means of annotation
> >       and browsing?
> >
> >    Session 2)
> >
> >    - ID40: Discoverability by mainstream search engines
> >    - ID35: Datasets and catalogues
> >    - ID25: Distribution and synchronization of catalog information
> >
> >    - Identification of missing UC with regard to Catalog
> >
> >  *DataSet*
> >
> >    Some motivating questions:
> >
> >     - Is there general guidance on the (minimal) amount of detail for
> >       describing a data set?
> >     - What are the search strategies to look for a data set?
> >     - How do temporal/spatial, keyword or theme annotations help data
> >       consumers to locate relevant data sets given a particular
> >       information need?
> >
> >   Session 3 + 4)
> >
> >    Annotation properties, description and documentation
> >    - ID33: Summarization/Characterization of datasets
> >
> >   Dataset concept analysis
> >   - ID8: Scope or type of dataset with a DCAT description
> >   - ID20: Modelling resources different from datasets
> >   - ID36: Cross-vocabulary relationships (comparison of Dataset
> > concepts)
> >
> >   Dataset (& Distribution) versioning
> >   - ID6: Dataset Versioning Information
> >
> >   Dataset co-relation and organization
> >   - ID32: Relationships between Datasets
> >
> >   Dataset semantics
> >   - ID7: Support associating fine-grained semantics for datasets and
> > resources within a dataset
> >
> >   Session 5 + 6)
> >
> >   Data quality, precision and accuracy
> >   - ID15: Modeling data precision and accuracy
> >   - ID16: Modeling conformance test results on data quality
> >   - ID14: Data quality modeling patterns
> >   - ID23: Data Quality Vocabulary
> >
> >   Scope and context
> >   - ID28: Modeling reference systems
> >   - ID29: Modeling spatial coverage
> >   - ID38: Time-related aspects
> >   - ID27: Modeling temporal coverage
> >
> >   Provenance, actors and obligations
> >    - ID12: Modeling data lineage
> >    - ID13: Modeling agent roles
> >    - ID31: Modeling funding sources
> >
> >    Usage control
> >    - ID17: Data access restrictions
> >
> >    - Identification of missing UC with regard to Dataset
> >
> >  Tuesday
> >
> >   Session 7)
> >
> >   *Distribution*
> >
> >    Some motivating questions:
> >    - Are there distribution patterns evident from 1)?
> >    - How is an interactive Distribution endpoint to be described to
> >      enable human or service-based interaction?
> >    - Should we consider PUB/SUB protocols like MQTT?
> >
> >   - ID1: DCAT packaged distributions
> >
> >   Interactive, dynamic access
> >   - ID6: DCAT Distribution to describe web services
> >   - ID18: Modeling service-based data access
> >   - ID21: Machine actionable link for a mapping client
> >   - ID22: Template link in metadata
> >
> >   - ID34: Relationships between Distributions of a Dataset
> >
> >   - Identification of missing UC with regard to Distribution
> >
> >  Session 8 + 9)
> >
> >  *DCAT Profiles*
> >
> >   - ID10: Requirements for data citation
> >   - ID10: Common requirements for scientific data
> >   - ID24: Harmonising INSPIRE-obligations and DCAT-distribution
> >   - ID37: Europeana profile ecosystem
> >
> >  *Profile querying and negotiation*
> >
> >   - ID2: Specifying media type interpretation beyond the content type
> >   - ID3: Combining multiple types of content sections in a single
> >     response
> >   - ID5: Discover available content profiles
> >   - ID30: Standard APIs for metadata profile negotiation
> >
> >   - Identification of missing UC with regard to Profile handling
> >
> >  Session 10)
> >
> >  *Metamodel*
> >   - ID11: Modeling identifiers and making them actionable
> >   - ID19: Guidance on the use of qualified forms
> >   - ID26: Extension points to 3rd party vocabularies
> >
> >   - Identification of missing UC with regard to meta-modeling,
> >     methodology etc.
> >
> >    Given the overall time frame of 12h I assumed approx. 60 min + 10 min
> >    break per session.
> >
> >      Best regards
> >    Jaroslav
> >
> >
> >
> >
> > On Thursday, July 6, 2017 21:13 CEST, Karen Coyle <kcoyle@kcoyle.net> wrote:
> >
> >> In preparing for the face-to-face, Caroline and I would like to ask
> >> the group, especially the UCR editors, to suggest what they see as
> >> the logical groupings for our discussion sessions. It would be ideal
> >> for us to have this by the end of the working day (European time) on
> >> Tuesday.
> >>
> >> We have eight 90-minute slots that we can make use of. If we assume
> >> that at least part of the first slot will be introductions and
> >> establishing an overall working hypothesis, then we have 7 slots in
> >> which to discuss actual use cases. We may also wish to reserve 30
> >> minutes at the end of the second day to prepare a list of missing use
> >> cases and immediate tasks relating to this deliverable.
> >>
> >> Remember that the primary goal of the F2F meeting is to provide the
> >> UCR editors with the information and decisions that they need to
> >> create a First Public Working Draft of the Use Cases and
> >> Requirements. A FPWD is a "heart-beat" document that is not expected
> >> to be final but that gives the W3C management and community an
> >> indication of the direction of the group, as well as proof that it is
> >> indeed getting its work done. We will expect the UCR to be issued in
> >> additional versions as the work progresses. Our goal for the FPWD is
> >> to meet the August 9 W3C deadline for publishing documents, which
> >> means that the group needs to approve the document before that.
> >>
> >> Also, it would be good to have by the end of the Oxford meeting an
> >> idea of how the DCAT group will proceed once the UCR FPWD is in
> >> place. We should also determine if the work so far informs the
> >> Profile and
> >> Content Negotiation groups, or if we have more to do in gathering use
> >> cases in those areas.
> >> --
> >> Karen Coyle
> >> kcoyle@kcoyle.net http://kcoyle.net
> >> m: 1-510-435-8234 (Signal)
> >> skype: kcoylenet/+1-510-984-3600
> >>
> >
> >
> >
> >
>
> --
> Karen Coyle
> kcoyle@kcoyle.net http://kcoyle.net
> m: 1-510-435-8234 (Signal)
> skype: kcoylenet/+1-510-984-3600
>
>
>


-- 
----------------------------------------------------------------------------
Riccardo Albertoni
Istituto per la Matematica Applicata e Tecnologie Informatiche "Enrico
Magenes"
Consiglio Nazionale delle Ricerche
via de Marini 6 - 16149 GENOVA - ITALIA
tel. +39-010-6475624 - fax +39-010-6475660
e-mail: Riccardo.Albertoni@ge.imati.cnr.it
Skype: callto://riccardoalbertoni/
LinkedIn: http://www.linkedin.com/in/riccardoalbertoni
www: http://www.imati.cnr.it/
http://pers.ge.imati.cnr.it/albertoni/PersonalPage/albertoni.html
FOAF:http://purl.oclc.org/NET/RiccardoAlbertoni/foaf
Received on Thursday, 13 July 2017 10:26:51 UTC