Re: DataCatalog/Dataset, BioChemEntity and profiles

From: Justin Clark-Casey <jc955@cam.ac.uk>
Date: Mon, 5 Feb 2018 14:47:42 +0000
To: public-bioschemas@w3.org
Message-ID: <5c679e7d-bf1b-6c73-fd99-e9f724169978@cam.ac.uk>
On 02/02/18 15:56, LJ Garcia Castro wrote:
> Hello Sarala, all,
> I am going through our UniProt examples and I did not find a way to say that UniProt is mainly about a BioChemEntity profiled as Protein but also includes mentions of a BioChemEntity profiled as Gene.
> We have the keywords, and there in plain text I can have something like "protein, protein annotation" and so on.
> Now, thinking about crawling and so on, if we do not clearly state what kind of BioChemEntity a resource supports, how are we going to get all resources providing Protein or Sample? This applies to both DataCatalog and Dataset.
>
> In Record (not the type but the profile), we recommend using mainEntity as the way to link to the BioChemEntity. We could use mainEntity to specify the type of the main entity supported by a resource, or we could suggest a new property, mainEntityType. We would still have to find a way to list the secondary entities (if we see that as useful/desirable as well).

For crawling, I think it would be useful to have the DataCatalog/Dataset indicate the BioChemEntity types that it indexes, using mainEntity.additionalType or similar. If this information is present, it will be easier for a search engine to present relevant datasets to a user. The fallback is to work it out from all the individual BioChemEntity records, but this requires a full crawl.
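
As a rough illustration of the idea, here is a minimal Python sketch of how a crawler might read such a declaration from a catalog's JSON-LD without visiting every record. Everything specific in it is an assumption for illustration only: the markup shape (a single mainEntity object carrying additionalType), the example.org type URIs, and the helper names indexed_entity_types and catalogs_for are hypothetical and not an agreed Bioschemas profile.

```python
# Hypothetical DataCatalog markup: the use of mainEntity.additionalType and
# the example.org type URIs are placeholders, not an agreed Bioschemas profile.
CATALOG = {
    "@context": "http://schema.org",
    "@type": "DataCatalog",
    "name": "UniProt",
    "mainEntity": {
        "@type": "BioChemEntity",
        "additionalType": [
            "http://example.org/Protein",
            "http://example.org/Gene",
        ],
    },
}


def indexed_entity_types(doc):
    """Return the additionalType values declared on mainEntity as a list."""
    main = doc.get("mainEntity")
    if not isinstance(main, dict):
        return []  # nothing declared, so a full crawl would be the fallback
    types = main.get("additionalType", [])
    # JSON-LD allows either a single string or a list of strings here.
    return [types] if isinstance(types, str) else list(types)


def catalogs_for(docs, wanted_type):
    """Return the names of catalogs that declare the wanted entity type."""
    return [d.get("name") for d in docs if wanted_type in indexed_entity_types(d)]


if __name__ == "__main__":
    print(indexed_entity_types(CATALOG))
    print(catalogs_for([CATALOG], "http://example.org/Protein"))
```

With a declaration like this, a search engine only needs the catalog-level markup to decide whether a resource is relevant to a query for a given entity type.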

On another note, is there an up-to-date graph showing BioChemEntity and its properties? The older one for PhysicalEntity was very useful to me.

Also, is there a final document/set of examples showing how to use the ontology extension mechanism for specifying BioChemEntity properties instead of the previous additionalProperty approach?


Justin Clark-Casey
Research Software Engineer, InterMine life sciences data integration, U of Cambridge
http://twitter.com/justincc http://justincc.org
Received on Monday, 5 February 2018 14:48:09 UTC
