Re: DataCatalog/Dataset, BioChemEntity and profiles

On 02/02/18 15:56, LJ Garcia Castro wrote:
> Hello Sarala, all,
> 
> I am going through our UniProt examples and I did not find a way to say that UniProt is mainly about a BioChemEntity profiled as Protein but also includes 
> mentions of a BioChemEntity profiled as Gene.
> 
> We have the keywords, and there in plain text I can have something like "protein, protein annotation" and so on.
> 
> Now, thinking about crawling and so, if we do not clearly state what is the kind of BioChemEntity a resource is supporting, how are we going to get all 
> resources providing Protein or Sample? This applies for both DataCatalog and Dataset.
> 
> In Record, not the type but the profile, we recommend using mainEntity as the way to link to the BioChemEntity. We could use mainEntity to specify the type of 
> the main entity supported by a resource or we could suggest a new property mainEntityType. We still would have to find a way to list the secondary entities (if 
> we see that is useful/desirable as well).

For crawling, I think it would be useful for the DataCatalog/Dataset to indicate the BioChemEntity types that it indexes, using mainEntity.additionalType or 
similar.  If this information is present, it will be easier for a search engine to present relevant datasets to a user.  The fallback is to work the types out 
from the individual BioChemEntity records, but this requires a full crawl.
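
To make the idea concrete, here is a rough sketch of how a crawler might read the indexed types from a DataCatalog's markup without visiting any records.  It assumes the mainEntity.additionalType convention suggested above, which is only a proposal; the example.org profile URLs and the helper names (as_list, indexed_entity_types) are made up for illustration.

```python
import json

# Hypothetical JSON-LD for a DataCatalog. The example.org profile URLs and the
# use of mainEntity.additionalType are assumptions, not an agreed convention.
CATALOG_JSONLD = """
{
  "@context": "http://schema.org",
  "@type": "DataCatalog",
  "name": "UniProt",
  "mainEntity": [
    {"@type": "BioChemEntity",
     "additionalType": "http://example.org/profiles/Protein"},
    {"@type": "BioChemEntity",
     "additionalType": "http://example.org/profiles/Gene"}
  ]
}
"""


def as_list(value):
    """JSON-LD allows a single value or a list; normalise to a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def indexed_entity_types(markup):
    """Return the additionalType values declared under mainEntity, in order."""
    types = []
    for entity in as_list(markup.get("mainEntity")):
        if not isinstance(entity, dict):
            continue  # e.g. a bare URL reference; nothing to inspect here
        for entity_type in as_list(entity.get("additionalType")):
            if entity_type not in types:
                types.append(entity_type)
    return types


catalog = json.loads(CATALOG_JSONLD)
entity_types = indexed_entity_types(catalog)
print(entity_types)
```

With markup like this, a search engine could select all catalogs declaring a Protein type from the catalog pages alone.  The sketch does not distinguish the main entity from secondary ones; that would still need a convention of its own.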

On another note, is there an up-to-date graph showing BioChemEntity and its properties?  The older one for PhysicalEntity was very useful to me.

Also, is there a final document/set of examples showing how to use the ontology extension mechanism for specifying BioChemEntity properties instead of the 
previous additionalProperty approach?

Regards,

--
Justin Clark-Casey
Research Software Engineer, InterMine life sciences data integration, U of Cambridge
http://twitter.com/justincc http://justincc.org

Received on Monday, 5 February 2018 14:48:09 UTC