- From: Riccardo Albertoni <albertoni@ge.imati.cnr.it>
- Date: Thu, 13 Jul 2017 12:26:17 +0200
- To: Makx Dekkers <mail@makxdekkers.com>
- Cc: kcoyle@kcoyle.net, Jaroslav Pullmann <jaroslav.pullmann@fit.fraunhofer.de>, Dataset Exchange Working Group <public-dxwg-wg@w3.org>
- Message-ID: <CAOHhXmSV9pemusD0oeszVqsi68X=Tx5jaMDhd-VQPc9c37ZV1A@mail.gmail.com>
Dear Makx, and all,

The following SPARQL query should help with the analysis you are
suggesting (see http://yasgui.org/short/Hy8HtaVS- for a nicer view).
For each property, it counts the distinct DCAT entities that use it.

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dcat: <http://www.w3.org/ns/dcat#>
PREFIX dct: <http://purl.org/dc/terms/>

# One row per property, with the number of distinct subjects using it.
SELECT ?prop (COUNT(DISTINCT ?sub) AS ?numOfDCATEntityUsingTheProperty)
WHERE {
  # A subject is counted if it is typed as any of the four DCAT classes.
  { ?sub a dcat:Dataset; ?prop ?obj. }
  UNION
  { ?sub a dcat:Distribution; ?prop ?obj. }
  UNION
  { ?sub a dcat:Catalog; ?prop ?obj. }
  UNION
  { ?sub a dcat:CatalogRecord; ?prop ?obj. }
}
GROUP BY ?prop

For example, if I run the above query on
https://www.europeandataportal.eu/sparql-manager/en/, it returns the
following results:

prop,numOfDCATEntityUsingTheProperty
http://purl.org/dc/terms/conformsTo,75482
http://purl.org/dc/terms/provenance,414727
http://www.w3.org/ns/adms#identifier,749805
http://www.w3.org/ns/dcat#themeTaxonomy,79
http://xmlns.com/foaf/0.1/primaryTopic,749805
http://purl.org/dc/terms/temporal,99731
http://www.w3.org/ns/adms#status,749805
http://www.w3.org/ns/dcat#byteSize,13568
http://spdx.org/rdf/terms#checksum,6507
http://purl.org/dc/terms/publisher,309489
http://purl.org/dc/terms/language,418733
http://www.w3.org/ns/dcat#theme,453902
http://purl.org/dc/terms/relation,1
http://www.w3.org/1999/02/22-rdf-syntax-ns#type,2418170
http://purl.org/dc/terms/modified,1367877
http://purl.org/dc/terms/format,105731
http://purl.org/dc/terms/issued,1143662
http://xmlns.com/foaf/0.1/homepage,18
http://xmlns.com/foaf/0.1/page,74839
http://www.w3.org/ns/dcat#mediaType,240863
http://www.w3.org/ns/dcat#accessURL,909707
http://purl.org/dc/terms/accrualPeriodicity,342936
http://www.w3.org/ns/dcat#distribution,367600
http://purl.org/dc/terms/spatial,527782
http://purl.org/dc/terms/rights,67717
http://purl.org/dc/terms/description,1256889
http://www.w3.org/ns/dcat#contactPoint,332807
http://www.w3.org/ns/dcat#keyword,701558
http://www.w3.org/ns/dcat#landingPage,281865
http://www.w3.org/ns/dcat#downloadURL,239767
http://purl.org/dc/terms/title,1457491
http://www.w3.org/ns/dcat#record,77
http://purl.org/dc/terms/license,253977
http://purl.org/dc/terms/identifier,749707
http://www.w3.org/ns/dcat#dataset,75

Cheers,
Riccardo

On 13 July 2017 at 11:11, Makx Dekkers <mail@makxdekkers.com> wrote:

> Jaroslav, Karen,
>
> It is indeed good to consider actual usage of DCAT in real life.
>
> From my point of view, it would be really interesting to have a
> statistical analysis on the use of the various elements of DCAT. There are
> large collections of DCAT data available for analysis -- for example
> https://www.europeandataportal.eu/ makes available descriptions of
> three-quarters of a million datasets in DCAT through their SPARQL endpoint.
> It's just a matter of someone having the time and resources to do such an
> analysis.
>
> For a list of countries and organisations in Europe that have DCAT
> profiles already in operation, see
> https://ec.europa.eu/isa2/solutions/dcat-application-profile-data-portals-europe_en.
>
> Makx.
>
> -----Original Message-----
> From: Karen Coyle [mailto:kcoyle@kcoyle.net]
> Sent: 13 July 2017 00:45
> To: Jaroslav Pullmann <jaroslav.pullmann@fit.fraunhofer.de>
> Cc: public-dxwg-wg@w3.org
> Subject: Re: Organizing Use Cases for F2F / Agenda proposal
>
> Thanks, Jaroslav,
>
> You may have seen that Caroline and I have put together an agenda for the
> meeting. Unfortunately I begin traveling tomorrow so she and I will
> probably not meet again before Monday, but we will compare your list with
> ours and we can always make adjustments as we discuss during the F2F. I
> think that the categorizing of the use cases into groups is very useful,
> and seeing them from different points of view also helps.
>
> As for the introductory part that you propose, some of that may be best
> suited to the DCAT 1.1 sub-group, which hopefully will begin meeting soon.
> It does seem logical that a first step would be to survey the uses of DCAT
> and the DCAT-related APs that exist today. I would definitely encourage the
> sub-group to get started on such an analysis as soon as possible. That
> group, of course, can also come back with new use cases and requirements to
> propose to the "plenary" working group.
>
> Let's try to use some of our time (including coffee breaks) to get some
> action going in the sub-groups that are forming.
>
> kc
> p.s. From my Dublin Core perspective I totally agree with your "simpler is
> better" hypothesis. For DC I did some work similar to what you propose:
> * https://kcoyle.blogspot.com/2013/10/who-uses-dublin-core-original-15.html
> * https://kcoyle.blogspot.com/2013/10/who-uses-dublin-core-dcterms.html
> * https://kcoyle.blogspot.com/2013/10/dublin-core-usage-in-lod.html
>
> That last one seems to prove your point.
>
> On 7/12/17 3:17 PM, Jaroslav Pullmann wrote:
> >
> > Dear Karen, dear all
> >
> > within an introductory part we may want to summarize the Status quo.
> > Some questions to look at in a session 1) were:
> >
> > - How is DCAT currently used?
> > - Do data portals, catalogs etc. use DCAT core model or rely on a
> >   specific application profile?
> > - What is the average complexity of DCAT Dataset descriptions?
> > - Are some partitions of the vocabulary seldom used or not at all
> >   (e.g. CatalogRecord, theme)? - This will inform about deprecation
> >   candidates
> > - What user/service interfaces and search options are available in
> >   order to exploit the DCAT potential?
> >
> > Would the above support a hypothesis, that the main and obvious
> > benefit of DCAT is its simplicity and brevity where, in the end, only
> > a subset of the vocabulary is used? If this is the case, does it imply
> > that we should be conservative in extending the core model and add
> > only a few, well-argued properties while substantially extending
> > guidance on using the existing ones?
> >
> > Afterwards I suggest to sort and discuss the UCs according to a
> > user/task-oriented perspective, i.e. how would the various
> > stakeholders make use of the DCAT concepts in order to perform their
> > task (describe/publish/find/retrieve a data set etc.)
> >
> > *Catalog*
> >
> > Some motivating questions:
> >
> > - How are the Catalogs found at all, e.g. using Web search engines
> >   that evaluate RDFa metadata?
> > - How does DCAT support a data publisher to identify an appropriate
> >   (specialized) Catalog to publish her data?
> > - E.g. are Concept schemes available/used as a means of annotation
> >   and browsing?
> >
> > Session 2)
> >
> > - ID40: Discoverability by mainstream search engines
> > - ID35: Datasets and catalogues
> > - ID25: Distribution and synchronization of catalog information
> >
> > - Identification of missing UC with regard to Catalog
> >
> > *DataSet*
> >
> > Some motivating questions:
> >
> > - Is there a general guidance on (minimal) amount of detail for
> >   describing a data set?
> > - What are the search strategies to look for a data set?
> > - How does temporal/spatial, keyword or theme annotation help data
> >   consumers to localise relevant data sets given a particular
> >   information need?
> >
> > Session 3 + 4)
> >
> > Annotation properties, description and documentation
> > - ID33: Summarization/Characterization of datasets
> >
> > Dataset concept analysis
> > - ID8: Scope or type of dataset with a DCAT description
> > - ID20: Modelling resources different from datasets
> > - ID36: Cross-vocabulary relationships (comparison of Dataset
> >   concepts)
> >
> > Dataset (& Distribution) versioning
> > - ID6: Dataset Versioning Information
> >
> > Dataset co-relation and organization
> > - ID32: Relationships between Datasets
> >
> > Dataset semantics
> > - ID7: Support associating fine-grained semantics for datasets and
> >   resources within a dataset
> >
> > Session 5 + 6)
> >
> > Data quality, precision and accuracy
> > - ID15: Modeling data precision and accuracy
> > - ID16: Modeling conformance test results on data quality
> > - ID14: Data quality modeling patterns
> > - ID23: Data Quality Vocabulary
> >
> > Scope and context
> > - ID28: Modeling reference systems
> > - ID29: Modeling spatial coverage
> > - ID38: Time-related aspects
> > - ID27: Modeling temporal coverage
> >
> > Provenance, actors and obligations
> > - ID12: Modeling data lineage
> > - ID13: Modeling agent roles
> > - ID31: Modeling funding sources
> >
> > Usage control
> > - ID17: Data access restrictions
> >
> > - Identification of missing UC with regard to Dataset
> >
> > Tuesday
> >
> > Session 7)
> >
> > *Distribution*
> >
> > Some motivating questions:
> > - Are there distribution patterns evident from 1)?
> > - How is an interactive Distribution endpoint to be described to
> >   enable human or service-based interaction?
> > - Should we consider PUB/SUB protocols like MQTT?
> >
> > - ID1: DCAT packaged distributions
> >
> > Interactive, dynamic access
> > - ID6: DCAT Distribution to describe web services
> > - ID18: Modeling service-based data access
> > - ID21: Machine actionable link for a mapping client
> > - ID22: Template link in metadata
> >
> > - ID34: Relationships between Distributions of a Dataset
> >
> > - Identification of missing UC with regard to Distribution
> >
> > Session 8 + 9)
> >
> > *DCAT Profiles*
> >
> > - ID10: Requirements for data citation
> > - ID10: Common requirements for scientific data
> > - ID24: Harmonising INSPIRE-obligations and DCAT-distribution
> > - ID37: Europeana profile ecosystem
> >
> > *Profile querying and negotiation*
> >
> > - ID2: Specifying media type interpretation beyond the content type
> > - ID3: Combining multiple types of content sections in a single
> >   response
> > - ID5: Discover available content profiles
> > - ID30: Standard APIs for metadata profile negotiation
> >
> > - Identification of missing UC with regard to Profile handling
> >
> > Session 10)
> >
> > *Metamodel*
> > - ID11: Modeling identifiers and making them actionable
> > - ID19: Guidance on the use of qualified forms
> > - ID26: Extension points to 3rd party vocabularies
> >
> > - Identification of missing UC with regard to meta-modeling,
> >   methodology etc.
> >
> > Given the overall time frame of 12h I assumed approx. 60min + 10 min
> > break per session.
> >
> > Best regards
> > Jaroslav
> >
> > On Thursday, July 6, 2017 21:13 CEST, Karen Coyle <kcoyle@kcoyle.net>
> > wrote:
> >
> >> In preparing for the face-to-face, Caroline and I would like to ask
> >> the group, especially the UCR editors, to suggest what they see as
> >> the logical groupings for our discussion sessions. It would be ideal
> >> for us to have this by the end of the working day (European time) on
> >> Tuesday.
> >>
> >> We have eight 90-minute slots that we can make use of. If we assume
> >> that at least part of the first slot will be introductions and
> >> establishing an overall working hypothesis, then we have 7 slots in
> >> which to discuss actual use cases. We may also wish to reserve 30
> >> minutes at the end of the second day to prepare a list of missing use
> >> cases and immediate tasks relating to this deliverable.
> >>
> >> Remember that the primary goal of the F2F meeting is to provide the
> >> UCR editors with the information and decisions that they need to
> >> create a First Public Working Draft of the Use Cases and
> >> Requirements. A FPWD is a "heart-beat" document that is not expected
> >> to be final but that gives the W3C management and community an
> >> indication of the direction of the group, as well as proof that it is
> >> indeed getting its work done. We will expect the UCR to be issued in
> >> additional versions as the work progresses. Our goal for the FPWD is
> >> to meet the August 9 W3C deadline for publishing documents, which
> >> means that the group needs to approve the document before that.
> >>
> >> Also, it would be good to have by the end of the Oxford meeting an
> >> idea of how the DCAT group will proceed once the UCR FPWD is in
> >> place. We should also determine if the work so far informs the
> >> Profile and Content Negotiation groups, or if we have more to do in
> >> gathering use cases in those areas.
> >> --
> >> Karen Coyle
> >> kcoyle@kcoyle.net http://kcoyle.net
> >> m: 1-510-435-8234 (Signal)
> >> skype: kcoylenet/+1-510-984-3600
>
> --
> Karen Coyle
> kcoyle@kcoyle.net http://kcoyle.net
> m: 1-510-435-8234 (Signal)
> skype: kcoylenet/+1-510-984-3600

--
----------------------------------------------------------------------------
Riccardo Albertoni
Istituto per la Matematica Applicata e Tecnologie Informatiche "Enrico Magenes"
Consiglio Nazionale delle Ricerche
via de Marini 6 - 16149 GENOVA - ITALIA
tel. +39-010-6475624 - fax +39-010-6475660
e-mail: Riccardo.Albertoni@ge.imati.cnr.it
Skype: callto://riccardoalbertoni/
LinkedIn: http://www.linkedin.com/in/riccardoalbertoni
www: http://www.imati.cnr.it/
http://pers.ge.imati.cnr.it/albertoni/PersonalPage/albertoni.html
FOAF: http://purl.oclc.org/NET/RiccardoAlbertoni/foaf
Received on Thursday, 13 July 2017 10:26:51 UTC
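
As a possible follow-up to the counts reported in this message, the sketch
below ranks them to surface the least-used properties, which bears on the
question in the quoted thread about seldom-used parts of the vocabulary. It
is only an illustration: the rows are copied from the results in the
message, while the name rank_properties and the other identifiers are
assumptions introduced here, not part of any tool mentioned in the thread.

```python
import csv
import io

# Rows copied from the results listed in the message (property IRI, count).
RESULTS_CSV = """\
prop,numOfDCATEntityUsingTheProperty
http://purl.org/dc/terms/conformsTo,75482
http://purl.org/dc/terms/provenance,414727
http://www.w3.org/ns/adms#identifier,749805
http://www.w3.org/ns/dcat#themeTaxonomy,79
http://xmlns.com/foaf/0.1/primaryTopic,749805
http://purl.org/dc/terms/temporal,99731
http://www.w3.org/ns/adms#status,749805
http://www.w3.org/ns/dcat#byteSize,13568
http://spdx.org/rdf/terms#checksum,6507
http://purl.org/dc/terms/publisher,309489
http://purl.org/dc/terms/language,418733
http://www.w3.org/ns/dcat#theme,453902
http://purl.org/dc/terms/relation,1
http://www.w3.org/1999/02/22-rdf-syntax-ns#type,2418170
http://purl.org/dc/terms/modified,1367877
http://purl.org/dc/terms/format,105731
http://purl.org/dc/terms/issued,1143662
http://xmlns.com/foaf/0.1/homepage,18
http://xmlns.com/foaf/0.1/page,74839
http://www.w3.org/ns/dcat#mediaType,240863
http://www.w3.org/ns/dcat#accessURL,909707
http://purl.org/dc/terms/accrualPeriodicity,342936
http://www.w3.org/ns/dcat#distribution,367600
http://purl.org/dc/terms/spatial,527782
http://purl.org/dc/terms/rights,67717
http://purl.org/dc/terms/description,1256889
http://www.w3.org/ns/dcat#contactPoint,332807
http://www.w3.org/ns/dcat#keyword,701558
http://www.w3.org/ns/dcat#landingPage,281865
http://www.w3.org/ns/dcat#downloadURL,239767
http://purl.org/dc/terms/title,1457491
http://www.w3.org/ns/dcat#record,77
http://purl.org/dc/terms/license,253977
http://purl.org/dc/terms/identifier,749707
http://www.w3.org/ns/dcat#dataset,75
"""


def rank_properties(csv_text):
    """Return (property IRI, count) pairs sorted by ascending count.

    Hypothetical helper for illustration only.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = [
        (row["prop"], int(row["numOfDCATEntityUsingTheProperty"]))
        for row in reader
    ]
    return sorted(rows, key=lambda pair: pair[1])


ranked = rank_properties(RESULTS_CSV)

# Print the five least-used properties, count first.
for prop, count in ranked[:5]:
    print(f"{count:>9}  {prop}")
```

Low counts need careful reading. For instance, dcat:dataset, dcat:record
and dcat:themeTaxonomy are properties of dcat:Catalog, so their small
numbers may simply reflect how few catalogs the endpoint holds rather than
disuse of those terms.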