Re: How to relate an entity to a dataset?

Thanks all for your inputs. The question is indeed tougher than I expected.

Let me try to summarize the different options, following up on the use 
case of a web page providing data about a taxon.

Regarding whether there is "a need for a separate DataRecord type, or whether 
we accept that everything we mark up is a web resource and therefore 
essentially a record by default", perhaps the first question we may want 
to answer is this one: what is the meaning of the top-level object in 
the JSON-LD markup? (if we use JSON-LD).
In my view, it may be (1) what the web page describes (a taxon), or (2) 
what the web page provides (some data describing a taxon).

Case (1): I can use bioschemas.org/datasetPartOf to relate the Taxon to 
the dataset. schema.org/isPartOf and isBasedOn are not appropriate here 
as they would entail that a Taxon is a CreativeWork, which is highly 
debatable. So my markup would denote something like:
     Taxon --datasetPartOf--> Dataset
This modeling is definitely awkward: a taxon can certainly not be 
reduced to a "part of a dataset". It is a biological concept about which 
datasets provide data.
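
To make case (1) concrete, here is a minimal sketch of what the markup might look like, built as a Python dict and serialized to JSON-LD. The example.org identifiers, the species name and the "bs" prefix for Bioschemas terms are assumptions made for illustration; only the Taxon type and the datasetPartOf property come from the discussion.

```python
import json

# Hypothetical identifiers, for illustration only.
TAXON_URL = "http://example.org/taxon/60585"
REGISTRY_URL = "http://example.org/registry"

# Case (1): the top-level object is the taxon itself.
case1 = {
    "@context": {
        "@vocab": "http://schema.org/",
        "bs": "http://bioschemas.org/",  # assumed prefix for Bioschemas terms
    },
    "@type": "bs:Taxon",
    "@id": TAXON_URL,
    "name": "Delphinus delphis",
    # The taxon is stated to be "part of" the dataset, which is the awkward bit.
    "bs:datasetPartOf": {"@type": "Dataset", "@id": REGISTRY_URL},
}

print(json.dumps(case1, indent=2))
```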

Case (2):
     DataRecord --mainEntity--> Taxon
     DataRecord --isPartOf--> Dataset
Here I can use schema.org/isPartOf since DataRecord and Dataset are both 
subtypes of CreativeWork.
The DataRecord makes sense here since the web page represents some piece 
of data (record) that is about a taxon and that comes from a taxonomic 
registry (dataset).
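
As a rough sketch of case (2), the markup could look as follows when built as a Python dict and serialized to JSON-LD. The example.org identifiers, the species name and the "bs" prefix are assumptions for illustration; the types and the mainEntity/isPartOf properties are the ones named above.

```python
import json

# Hypothetical identifiers, for illustration only.
RECORD_URL = "http://example.org/registry/record/60585"
TAXON_URL = "http://example.org/taxon/60585"
REGISTRY_URL = "http://example.org/registry"

# Case (2): the top-level object is the record that the page provides.
case2 = {
    "@context": {
        "@vocab": "http://schema.org/",
        "bs": "http://bioschemas.org/",  # assumed prefix for Bioschemas terms
    },
    "@type": "bs:DataRecord",
    "@id": RECORD_URL,
    # What the record is about.
    "mainEntity": {
        "@type": "bs:Taxon",
        "@id": TAXON_URL,
        "name": "Delphinus delphis",
    },
    # Where the record comes from; both ends are CreativeWork subtypes.
    "isPartOf": {"@type": "Dataset", "@id": REGISTRY_URL},
}

print(json.dumps(case2, indent=2))
```

Note that the taxon keeps its own identifier here, so other records or datasets could point at the same taxon without claiming it as a part.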

More generally, the need for a DataRecord may depend on the type of 
resources described. Do we have use cases that would more naturally fit 
in the "direct" model?

Note that a 3rd proposition (cf. Leyla's email) is to denote the WebPage 
and link it to the resource described with schema:mainEntity. In that 
case the JSON-LD top-level object would describe the web resource itself 
that, in turn, would be related to what it describes or provides.
This could actually apply to all marked up web pages, whatever the 
content. It's possibly more correct semantically, but this is hardly the 
way schema.org is used today, I guess.
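
A possible sketch of this third option, again as a Python dict serialized to JSON-LD: a small helper (wrap_in_webpage, a hypothetical name) wraps any described entity in a WebPage object. The example.org URLs and the "bs" prefix are assumptions for illustration.

```python
import json


def wrap_in_webpage(page_url, entity):
    """Return JSON-LD whose top-level object is the page, not the entity."""
    return {
        "@context": {
            "@vocab": "http://schema.org/",
            "bs": "http://bioschemas.org/",  # assumed prefix for Bioschemas terms
        },
        "@type": "WebPage",
        "@id": page_url,
        "mainEntity": entity,
    }


# Hypothetical taxon and page URLs, for illustration only.
taxon = {"@type": "bs:Taxon", "@id": "http://example.org/taxon/60585"}
page = wrap_in_webpage("http://example.org/taxon/60585.html", taxon)

print(json.dumps(page, indent=2))
```

The same helper would work unchanged if the entity were a DataRecord instead of a Taxon, which is why this option could apply to any marked-up page.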

Franck.

On 18/02/2019 at 11:44, Gray, Alasdair J G wrote:
> Hi Franck
>
> This is a really good question and one that we as a community need to 
> come up with an answer for.
>
> The Bioschemas data group came up with the proposal for having a 
> DataRecord profile over the schema.org/Dataset type.
> http://bioschemas.org/specifications/DataRecord/
>
> However, discussions have led to this approach not being favoured 
> since it would distort searches for datasets
> https://lists.w3.org/Archives/Public/public-bioschemas/2018Oct/0002.html
> https://github.com/BioSchemas/specifications/issues/217
>
> Based on this, the data group then proposed a DataRecord type
> http://bioschemas.org/types/DataRecord/
>
> This has led to lots of discussions on whether the resources we are 
> marking up are representations of the concept themselves or the data 
> record about the concept. Currently no conclusion has come of this 
> discussion.
>
> Looking at the examples for the DataRecord proposal, they proposed to 
> use schema.org/mainEntityOfPage to connect the data record to the 
> concept that it was discussing.
> https://github.com/BioSchemas/specifications/tree/master/DataRecord/examples/
>
> In the examples, there are a variety of different ways of connecting 
> the record to the dataset including schema.org/isBasedOn (UniProt 
> example), schema.org/isPartOf (Biosamples and PDBe examples), and 
> bioschemas.org/datasetPartOf (Biosamples example). Both of the 
> schema.org properties would require that all concepts are of type 
> CreativeWork which is not compatible with the proposal to extend 
> schema.org with life sciences types which extend from 
> schema.org/Thing.
> http://bio.sdo-bioschemas-227516.appspot.com/
>
> It is clear that there is a need to connect the markup of a resource 
> to the dataset that contains the resource. The main focus of the 
> debate has been on whether there is a need for a separate DataRecord 
> type, or whether you accept that everything we mark up is a web 
> resource and therefore essentially a record by default. Either way, we 
> as a community need to agree on a common way to connect some markup 
> about a concept to the markup of a dataset containing that concept. 
> However, the answer to this is likely to depend on whether we need to 
> have a DataRecord or not.
>
> I would invite others to give their opinions on this and would like us 
> to consider finally closing this issue of whether there is a need for 
> a DataRecord type or not.
>
> Best regards
>
> Alasdair
>
>> On 14 Feb 2019, at 20:53, Franck Michel <fmichel@i3s.unice.fr> wrote:
>>
>> Dear all,
>>
>> In the biodiversity group we have defined a Taxon type. Taxa are 
>> usually part of taxonomic registries. So I'm wondering how to mark up 
>> a web page describing or referring to a given taxon while denoting 
>> that this taxon is part of a taxonomic registry.
>>
>> Looking further in Bioschemas types, I see several ways of doing 
>> this, and I guess this question applies equally to other types.
>>
>> We can think of a taxonomic registry as a schema.org/Dataset. In this 
>> context, each taxon would be a DataRecord, but how do we relate a 
>> DataRecord to its Dataset?
>>
>> We can also think of a taxonomic registry as a DataCatalog wherein taxa 
>> would be Datasets (property includedInDataCatalog).
>>
>> Could the datasets group shed light on this? Also, how have the 
>> other groups coped with this?
>>
>> Thx,
>>    Franck.
>>
>> --
>>
>> Franck MICHEL - CNRS research engineer
>> Université Côte d’Azur, CNRS, Inria
>> I3S laboratory (UMR 7271)
>> franck.michel@cnrs.fr - +33 (0)4 8915 4277
>>
>
> --
> Alasdair J G Gray
> Associate Professor in Computer Science,
> School of Mathematical and Computer Sciences
> Heriot-Watt University, Edinburgh, UK.
>
> Email: A.J.G.Gray@hw.ac.uk <mailto:A.J.G.Gray@hw.ac.uk>
> Web: http://www.macs.hw.ac.uk/~ajg33
> ORCID: http://orcid.org/0000-0002-5711-4872
> Office: Earl Mountbatten Building 1.39
> Twitter: @gray_alasdair
>
> To arrange a meeting: http://doodle.com/ajggray
>

Received on Monday, 18 February 2019 14:53:46 UTC