Re: Organizing Use Cases for F2F / Agenda proposal from Karen Coyle on 2017-07-12 (public-dxwg-wg@w3.org from July 2017)

From: Karen Coyle <kcoyle@kcoyle.net>
Date: Wed, 12 Jul 2017 15:44:44 -0700
To: Jaroslav Pullmann <jaroslav.pullmann@fit.fraunhofer.de>
Cc: "public-dxwg-wg@w3.org" <public-dxwg-wg@w3.org>
Message-ID: <8cc8ca81-8b25-e546-c319-90548455c46d@kcoyle.net>
Thanks, Jaroslav,

You may have seen that Caroline and I have put together an agenda for
the meeting. Unfortunately I begin traveling tomorrow so she and I will
probably not meet again before Monday, but we will compare your list
with ours and we can always make adjustments as we discuss during the
F2F. I think that the categorizing of the use cases into groups is very
useful, and seeing them from different points of view also helps.

As for the introductory part that you propose, some of that may be best
suited to the DCAT 1.1 sub-group, which hopefully will begin meeting
soon. It does seem logical that a first step would be to survey the uses
of DCAT and the DCAT-related APs that exist today. I would definitely
encourage the sub-group to get started on such an analysis as soon as
possible. That group, of course, can also come back with new use cases
and requirements to propose to the "plenary" working group.

Let's try to use some of our time (including coffee breaks) to get some
action going in the sub-groups that are forming.

kc
p.s. From my Dublin Core perspective I totally agree with your "simpler
is better" hypothesis. For DC I did some work similar to what you propose:
* https://kcoyle.blogspot.com/2013/10/who-uses-dublin-core-original-15.html
* https://kcoyle.blogspot.com/2013/10/who-uses-dublin-core-dcterms.html
* https://kcoyle.blogspot.com/2013/10/dublin-core-usage-in-lod.html

That last one seems to prove your point.

On 7/12/17 3:17 PM, Jaroslav Pullmann wrote:
> 
>   Dear Karen, dear all
> 
>   within an introductory part we may want to summarize the status quo. Some questions to look at in session 1) would be:
>     
>      - How is DCAT currently used?
>      - Do data portals, catalogs etc. use DCAT core model or rely on a specific application profile?
>      - What is the average complexity of DCAT Dataset descriptions?
>      - Are some parts of the vocabulary seldom used or not used at all (e.g. CatalogRecord, theme)? - This would help identify deprecation candidates
>      - What user/service interfaces and search options are available in order to exploit the DCAT potential?
> 
>   Would the above support the hypothesis that the main and obvious benefit of DCAT is its simplicity
>   and brevity, where in the end only a subset of the vocabulary is used? If this is the case, does
>   it imply that we should be conservative in extending the core model, adding only a few well-argued
>   properties while substantially extending guidance on using the existing ones?
> 
>  Afterwards I suggest sorting and discussing the UCs from a user/task-oriented perspective, i.e. how would the various 
>  stakeholders make use of the DCAT concepts in order to perform their task (describe/publish/find/retrieve a data set etc.)
> 
>  *Catalog* 
> 
>    Some motivating questions:
> 
>     - How are the Catalogs found at all, e.g. using Web search engines that evaluate RDFa metadata?
>     - How does DCAT support a data publisher in identifying an appropriate (specialized) Catalog in which to publish her data?
>     - E.g. are Concept schemes available/used as a means of annotation and browsing?
> 
>    Session 2)  
>   
>    - ID40: Discoverability by mainstream search engines
>    - ID35: Datasets and catalogues
>    - ID25: Distribution and synchronization of catalog information
> 
>    - Identification of missing UC with regard to Catalog
> 
>  *DataSet*
> 
>    Some motivating questions:
> 
>     - Is there general guidance on the (minimal) amount of detail for describing a data set?
>     - What are the search strategies to look for a data set?
>     - How does temporal/spatial, keyword or theme annotation help data consumers to localise relevant data sets given a particular information need?
> 
>   Session 3 + 4)
> 
>    Annotation properties, description and documentation
>    - ID33: Summarization/Characterization of datasets
> 
>   Dataset concept analysis
>   - ID8: Scope or type of dataset with a DCAT description
>   - ID20: Modelling resources different from datasets
>   - ID36: Cross-vocabulary relationships (comparison of Dataset concepts)
> 
>   Dataset (& Distribution) versioning 
>   - ID6: Dataset Versioning Information
> 
>   Dataset co-relation and organization
>   - ID32: Relationships between Datasets
> 
>   Dataset semantics
>   - ID7: Support associating fine-grained semantics for datasets and resources within a dataset
> 
>   Session 5 + 6) 
> 
>   Data quality, precision and accuracy
>   - ID15: Modeling data precision and accuracy
>   - ID16: Modeling conformance test results on data quality
>   - ID14: Data quality modeling patterns
>   - ID23: Data Quality Vocabulary
> 
>   Scope and context
>   - ID28: Modeling reference systems
>   - ID29: Modeling spatial coverage
>   - ID38: Time-related aspects
>   - ID27: Modeling temporal coverage
>   
>   Provenance, actors and obligations
>    - ID12: Modeling data lineage
>    - ID13: Modeling agent roles
>    - ID31: Modeling funding sources
>  
>    Usage control
>    - ID17: Data access restrictions
>   
>    - Identification of missing UC with regard to Dataset
>     
>  Tuesday
> 
>   Session 7)
> 
>   *Distribution*
>     
>    Some motivating questions:
>    - Are there distribution patterns evident from 1)?
>    - How is an interactive Distribution endpoint to be described to enable human or service-based interaction?
>    - Should we consider PUB/SUB protocols like MQTT?
> 
>   - ID1: DCAT packaged distributions
> 
>   Interactive, dynamic access
>   - ID6: DCAT Distribution to describe web services
>   - ID18: Modeling service-based data access
>   - ID21: Machine actionable link for a mapping client
>   - ID22: Template link in metadata
> 
>   - ID34: Relationships between Distributions of a Dataset
> 
>   - Identification of missing UC with regard to Distribution
> 
>  Session 8 + 9)
> 
>  *DCAT Profiles*
> 
>   - ID10: Requirements for data citation
>   - ID10: Common requirements for scientific data
>   - ID24: Harmonising INSPIRE-obligations and DCAT-distribution
>   - ID37: Europeana profile ecosystem
> 
>  *Profile querying and negotiation*
> 
>   - ID2: Specifying media type interpretation beyond the content type
>   - ID3: Combining multiple types of content sections in a single response
>   - ID5: Discover available content profiles
>   - ID30: Standard APIs for metadata profile negotiation
> 
>   - Identification of missing UC with regard to Profile handling
> 
>  Session 10)
> 
>  *Metamodel*
>   - ID11: Modeling identifiers and making them actionable
>   - ID19: Guidance on the use of qualified forms
>   - ID26: Extension points to 3rd party vocabularies
> 
>   - Identification of missing UC with regard to meta-modeling, methodology etc.  
> 
>    Given the overall time frame of 12 h, I assumed approx. 60 min + 10 min break per session.
> 
>      Best regards 
>    Jaroslav
> 
> 
> 
>  
> On Thursday, July 6, 2017 21:13 CEST, Karen Coyle <kcoyle@kcoyle.net> wrote: 
>  
>> In preparing for the face-to-face, Caroline and I would like to ask the
>> group, especially the UCR editors, to suggest what they see as the
>> logical groupings for our discussion sessions. It would be ideal for us
>> to have this by the end of the working day (European time) on Tuesday.
>>
>> We have eight 90-minute slots that we can make use of. If we assume that
>> at least part of the first slot will be introductions and establishing
>> an overall working hypothesis, then we have 7 slots in which to discuss
>> actual use cases. We may also wish to reserve 30 minutes at the end of
>> the second day to prepare a list of missing use cases and immediate
>> tasks relating to this deliverable.
>>
>> Remember that the primary goal of the F2F meeting is to provide the UCR
>> editors with the information and decisions that they need to create a
>> First Public Working Draft of the Use Cases and Requirements. A FPWD is
>> a "heart-beat" document that is not expected to be final but that gives
>> the W3C management and community an indication of the direction of the
>> group, as well as proof that it is indeed getting its work done. We will
>> expect the UCR to be issued in additional versions as the work
>> progresses. Our goal for the FPWD is to meet the August 9 W3C deadline
>> for publishing documents, which means that the group needs to approve
>> the document before that.
>>
>> Also, it would be good to have by the end of the Oxford meeting an idea
>> of how the DCAT group will proceed once the UCR FPWD is in place. We
>> should also determine if the work so far informs the Profile and Content
>> Negotiation groups, or if we have more to do in gathering use cases in
>> those areas.
>> -- 
>> Karen Coyle
>> kcoyle@kcoyle.net http://kcoyle.net
>> m: 1-510-435-8234 (Signal)
>> skype: kcoylenet/+1-510-984-3600
>>
>  
>  
>  
> 

-- 
Karen Coyle
kcoyle@kcoyle.net http://kcoyle.net
m: 1-510-435-8234 (Signal)
skype: kcoylenet/+1-510-984-3600
Received on Wednesday, 12 July 2017 22:45:17 UTC