Re: DataRecord and Dataset Search

From: ljgarcia <ljgarcia@ebi.ac.uk>
Date: Mon, 10 Sep 2018 13:35:58 +0100
To: "Gray, Alasdair J G" <A.J.G.Gray@hw.ac.uk>
Cc: Dan Brickley <danbri@google.com>, public-bioschemas@w3.org, Natasha Noy <noy@google.com>, Vicki Tardif Holland <vtardif@google.com>
Message-ID: <9ab4d21deb0682603c008a0f96afc8ad@ebi.ac.uk>

Hi Alasdair,

It sounds to me like you have covered it all. Maybe just some more
information about how we link sdo:Dataset, bs:DataRecord and
bs:BioChemEntity: sdo:Dataset sdo:hasPart bs:DataRecord (DataRecord
actually extends Dataset) and then bs:DataRecord sdo:isPartOf
sdo:Dataset. A bs:DataRecord has sdo:mainEntity bs:BioChemEntity and
then a bs:BioChemEntity is sdo:mainEntityOfPage of a bs:DataRecord.
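
As a rough sketch only, the links described above could be written out as
JSON-LD built from Python. The namespace IRI used for the bs prefix and all
example.org identifiers are assumptions made up for illustration; they are
not taken from the Bioschemas specification.

```python
import json

# Assumption: placeholder namespace IRI for the "bs" prefix; the real one may differ.
CONTEXT = {"sdo": "http://schema.org/", "bs": "http://bioschemas.org/"}

# Hypothetical identifiers, for illustration only.
DATASET_ID = "https://example.org/dataset"
RECORD_ID = "https://example.org/dataset/record/1"
ENTITY_ID = "https://example.org/dataset/record/1#entity"

# The dataset points down to its record ...
dataset = {
    "@id": DATASET_ID,
    "@type": "sdo:Dataset",
    "sdo:hasPart": {"@id": RECORD_ID},
}

# ... the record points back up to the dataset and across to its entity ...
record = {
    "@id": RECORD_ID,
    "@type": "bs:DataRecord",
    "sdo:isPartOf": {"@id": DATASET_ID},
    "sdo:mainEntity": {"@id": ENTITY_ID},
}

# ... and the entity points back to the record that describes it.
entity = {
    "@id": ENTITY_ID,
    "@type": "bs:BioChemEntity",
    "sdo:mainEntityOfPage": {"@id": RECORD_ID},
}

graph = {"@context": CONTEXT, "@graph": [dataset, record, entity]}
print(json.dumps(graph, indent=2))
```

Each link is stated in both directions here only to mirror the description;
a publisher would probably not need to assert every inverse.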

DataRecord includes two additional properties:
* sdo:additionalProperty, because we want everybody to be able to add
non-named properties as needed
* bs:seeAlso, so there can be links to related data records in other
datasets; this one is very important in Life Sciences.
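
To illustrate the two properties above, here is a minimal sketch of a
record carrying one extra property and one cross-dataset link. The helper
name make_record, the property name "curationStatus", the bs namespace IRI
and the example.org URLs are all hypothetical; sdo:PropertyValue is the type
schema.org expects as the value of additionalProperty.

```python
import json


def make_record(record_id, extra_properties, see_also):
    """Build a JSON-LD dict for a bs:DataRecord (hypothetical helper)."""
    return {
        # Assumption: placeholder namespace IRI for the "bs" prefix.
        "@context": {"sdo": "http://schema.org/", "bs": "http://bioschemas.org/"},
        "@id": record_id,
        "@type": "bs:DataRecord",
        # One sdo:PropertyValue per ad hoc name/value pair.
        "sdo:additionalProperty": [
            {"@type": "sdo:PropertyValue", "sdo:name": name, "sdo:value": value}
            for name, value in extra_properties.items()
        ],
        # Links to related records held in other datasets.
        "bs:seeAlso": [{"@id": url} for url in see_also],
    }


record = make_record(
    "https://example.org/dataset-a/record/1",
    {"curationStatus": "reviewed"},
    ["https://example.org/dataset-b/record/42"],
)
print(json.dumps(record, indent=2))
```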

Note: I am using sdo for schema.org and bs for bioschemas, although 
bioschemas types along with their properties should go to schema.org at 
some point (hopefully soon).

Regards,

On 2018-09-09 19:03, Gray, Alasdair J G wrote:
> Hi Dan
> 
> In the life sciences datasets, the individual records tend to get
> their own web page, i.e. each concept in the database would have its
> own page. The idea for the DataRecord is to be able to declare that the
> page about a concept is part of a Dataset.
> 
> I believe the approach is agnostic to the underlying storage, i.e. the
> page could be generated from a relational database which pulls data
> about the concept from multiple tables, a triplestore, or some other
> form of database. It is more about the granularity of this being about
> a single concept, e.g. row in a relational database with its foreign
> keys.
> 
> Leyla, Rafa, Susanna, what do you think? Have I characterised this
> correctly or are there things in Dan’s email that I am missing?
> 
> Alasdair
> 
>> On 7 Sep 2018, at 18:12, Dan Brickley <danbri@google.com> wrote:
>> 
>> (+Natasha Noy, +Vicki Tardif Holland)
>> 
>> On Fri, 7 Sep 2018 at 15:54, Gray, Alasdair J G
>> <A.J.G.Gray@hw.ac.uk> wrote:
>> 
>>> Hi Dan,
>>> 
>>> Great to see the announcement this week about the Google Dataset
>>> search. Here is a link to a blog post for anyone who has not seen
>>> it yet
>>> 
>>> https://www.blog.google/products/search/making-it-easier-discover-datasets/
>>> 
>>> 
>>> Within Bioschemas, we have been building up a profile usage of
>>> DataCatalog containing Dataset(s) which themselves contain
>>> DataRecords. A DataRecord is something that we would be proposing
>>> as an addition to schema.org [1]. The idea is that a DataRecord is
>>> contained within a Dataset and would specify the types of entity
>>> that the record is about, e.g. Protein.
>>> http://bioschemas.org/types/DataRecord/specification/
>>> 
>>> We would like to understand whether DataRecord is an idea to which
>>> the schema.org [1] community would be receptive. An alternative
>>> approach would be to use Dataset for both records within a Dataset
>>> and the Dataset itself.
>> 
>> It is certainly a direction worth exploring and discussing.
>> 
>> One issue to think through (and I think I raised this at a
>> bioschemas f2f last year) is that "Dataset" is a very broad notion.
>> Some but not all datasets are tabular for example. And tabular (e.g.
>> csv, sql) structures have non-trivial mappings to "entity"-oriented
>> and "record"-oriented representations. Other formats will have
>> different (and possibly simpler) ideas about "records". Thinking
>> about tabular first, there are complex mapping languages like D2RQ
>> or R2RML (https://www.w3.org/TR/r2rml/), and the RDF graph they
>> generate differs from a rows-as-records view. How would your draft
>> design deal with multi-table datasets?
>> 
>> Nearby in this world are specs like W3C CSVW, Data Cube, ... lots of
>> overlaps. It would be great to work through some examples in
>> detail...
>> 
>> Dan
>> 
>>> Thanks
>>> 
>>> Alasdair
>>> 
>>> --
>>> Alasdair J G Gray
>>> Associate Professor in Computer Science,
>>> School of Mathematical and Computer Sciences
>>> Heriot-Watt University, Edinburgh, UK.
>>> 
>>> Email: A.J.G.Gray@hw.ac.uk
>>> Web: http://www.macs.hw.ac.uk/~ajg33
>>> ORCID: http://orcid.org/0000-0002-5711-4872
>>> Office: Earl Mountbatten Building 1.39
>>> Twitter: @gray_alasdair
>>> 
> 
> Links:
> ------
> [1] http://schema.org/
Received on Monday, 10 September 2018 12:36:26 UTC
