- From: Leyla Garcia <ljgarcia@ebi.ac.uk>
- Date: Fri, 22 Sep 2017 16:39:30 +0100
- To: "public-bioschemas@w3.org" <public-bioschemas@w3.org>
- Message-ID: <0d8b6f3c-2d09-8a75-10a8-1bccaf1a7e2d@ebi.ac.uk>
Dear all,
We presented our BiologicalEntity idea in a poster at ISMB last July
and got mixed feedback. Although the idea was a good one, the
specification had some problems. In particular, it was not flexible
enough and the properties seemed somewhat arbitrary. Flexible and
extensible schemata are key for Bioschemas, so during the past week four
of us have been working on changes to BiologicalEntity. We have already
shared these changes with attendees of the BioHackathon and with this
mailing list, and we have received some good comments. In this email I
expand on what we already shared (meaning this is a long mail).
* BiologicalEntity no longer exists; it has been replaced by
PhysicalEntity and Record
* PhysicalEntity follows the flexibility initially proposed by the
Samples group by reusing the additionalProperty property
* LabProtocol is simplified thanks to a change proposed for CreativeWork
* PhysicalEntity and Record would be customized by profiles such as
Protein, Sample and so on. Properties such as additionalProperty,
isContainedIn, contains are expected to be customized by profiles
The graph below shows the key points but if you want more detail you can
keep reading.
*Summary of PhysicalEntity*
* PhysicalEntity extends Thing and reuses
** additionalType to specify whether it is a protein,
sample, phenotype, etc. Ontology terms should be used to point to the
corresponding concept (minimum)
** identifier (minimum)
** mainEntityOfPage to link to the corresponding Record on a Dataset
** sameAs to point to any webpage defining this entity
** url to point to the official webpage
** alternateName, description, image are used as described by
schema.org.
* PhysicalEntity has its own properties
** additionalProperty so any other property can be added. The
additionalType of the property should be used to specify the nature of
the property, name/description should be used as a label for the
property, and value should point to the actual value (range) of this
property
** isContainedIn,
** contains,
** location
** hasRepresentation to point to representations other than a
Record or an image, for instance it could be a text corresponding to a
sequence
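As a rough illustration of the additionalProperty pattern described
above, here is a minimal Python sketch. It assumes the property is
serialized as a schema.org PropertyValue and that "minimum" means the
property must be present; the helper names, the placeholder identifier
and the placeholder label are hypothetical and not part of the proposal.

```python
# Properties the summary above marks as "minimum" for a PhysicalEntity.
MINIMUM_PROPERTIES = ("additionalType", "identifier")


def make_additional_property(name, value, property_type=None):
    """Build one additionalProperty entry (assumed to be a PropertyValue).

    name          -- label for the property
    value         -- the actual value of the property
    property_type -- optional ontology term describing the property's nature
    """
    prop = {"@type": "PropertyValue", "name": name, "value": value}
    if property_type is not None:
        prop["additionalType"] = property_type
    return prop


def missing_minimum(entity):
    """Return the minimum properties that are absent or empty."""
    return [p for p in MINIMUM_PROPERTIES if not entity.get(p)]


# Hypothetical entity: it has an identifier but no additionalType yet.
entity = {
    "@type": "PhysicalEntity",  # proposed type, not in schema.org itself
    "identifier": "example-001",  # placeholder identifier
    "additionalProperty": [
        make_additional_property("some label", "some value"),
    ],
}

print(missing_minimum(entity))
```

A profile such as Protein could extend MINIMUM_PROPERTIES with its own
required additionalProperty entries.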
*Summary of Record*
* Record extends from Dataset and reuses
** distribution so we can point to a downloadable version of the Record
* Record has its own properties
** additionalProperty that follows the same guidelines as for
PhysicalEntity
** seeAlso to link to any related Thing whenever the relation is
not so clear but we know it exists (usually cross-references)
*Example of PhysicalEntity customization done for the Protein case*
* additionalType. minimum, many. Recommended type will probably be
"http://semanticscience.org/resource/SIO_010043"
* alternateName. optional. For UniProt it would look like ["ABL", "ABL1"]
* description. recommended. For UniProt it would be the protein function
* identifier. minimum. For UniProt it would look like "P00519"
* image. optional. Probably not used yet by UniProt
* mainEntityOfPage. optional. Probably not used by UniProt, we would
link from the Record to the PhysicalEntity as it works better for us
* name. recommended. For UniProt it would look like "Tyrosine-protein
kinase ABL1"
* sameAs. optional.
* url. recommended. For UniProt it would look like
"http://www.uniprot.org/uniprot/P00519"
* additionalProperty. optional
* additionalProperty/disease-association. recommended.
** additionalType for property probably
"http://semanticscience.org/resource/SIO_000983",
** name for property "disease association",
** value types StructuredValue and MedicalCondition,
** additionalType for value probably
"http://semanticscience.org/resource/SIO_010299"
** the rest of the properties depend on what the source can
actually provide, for instance disease name, disease url, medical code, etc.
* additionalProperty/transcribed-gene. minimum.
** additionalType for property probably
"http://semanticscience.org/resource/SIO_010081",
** name for property "gene",
** value types StructuredValue and PhysicalEntity,
** additionalType for value probably
"http://semanticscience.org/resource/SIO_010035"
** the rest of the properties depend on what the source can
actually provide, for instance gene name
* isContainedIn. optional
* isContainedIn/organism. minimum.
** type would be PhysicalEntity
** additionalType would probably be
"http://semanticscience.org/resource/SIO_010000"
** identifier would be taxon ID
** url could be a link to NCBI taxon
** sameAs could be a link to UniProt taxonomy
* location. optional. Probably not used for proteins but for protein
annotations it could be a FALDO position
* hasRepresentation. optional. For instance the protein sequence
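To make the Protein profile above more concrete, the following Python
sketch builds one possible JSON-LD serialization for P00519. It is only
an illustration under assumptions: the "http://schema.org" context, the
use of PropertyValue for additionalProperty, the gene name "ABL1" (taken
from the alternateName example) and the taxon ID "9606" (human) are not
fixed by this proposal, and PhysicalEntity is the proposed type rather
than an existing schema.org type.

```python
import json

SIO = "http://semanticscience.org/resource/"

protein = {
    "@context": "http://schema.org",  # assumed context
    "@type": "PhysicalEntity",  # proposed type
    "additionalType": SIO + "SIO_010043",  # minimum
    "identifier": "P00519",  # minimum
    "name": "Tyrosine-protein kinase ABL1",  # recommended
    "alternateName": ["ABL", "ABL1"],  # optional
    "url": "http://www.uniprot.org/uniprot/P00519",  # recommended
    "additionalProperty": [
        {
            # additionalProperty/transcribed-gene (minimum)
            "@type": "PropertyValue",  # assumed serialization
            "additionalType": SIO + "SIO_010081",
            "name": "gene",
            "value": {
                "@type": ["StructuredValue", "PhysicalEntity"],
                "additionalType": SIO + "SIO_010035",
                "name": "ABL1",  # assumed gene name
            },
        }
    ],
    "isContainedIn": {
        # isContainedIn/organism (minimum)
        "@type": "PhysicalEntity",
        "additionalType": SIO + "SIO_010000",
        "identifier": "9606",  # assumed taxon ID for human
    },
}

protein_jsonld = json.dumps(protein, indent=2)
print(protein_jsonld)
```

The disease-association property would follow the same shape as the gene
property, with the disease-related additionalType values listed above.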
*Example of Record customization done for the Protein case*
* distribution. optional. For UniProt links to FASTA, text, XML, RDF files
* additionalType. optional. For UniProt probably
"http://purl.uniprot.org/core/Protein"
* seeAlso. optional
* identifier. minimum. For UniProt it would be like "P00519"
* url. recommended/optional? For UniProt it would look like
"http://www.uniprot.org/uniprot/P00519"
* mainEntity. recommended. For UniProt, all the PhysicalEntity/Protein
information will go here
* citation, dateCreated, dateModified, datePublished, hasPart,
isBasedOn, isBasisFor, isPartOf, keywords, license. optional. Used as
needed and depending on the information actually provided by the Dataset
containing this record.
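The Record side of the Protein case could be sketched in Python as
follows. Again this is a hedged illustration, not the agreed markup: the
make_record helper is hypothetical, the FASTA download URL is an assumed
pattern derived from the entry url, and the nested entity is reduced to
its minimum properties for brevity.

```python
import json


def make_record(accession, entity):
    """Wrap a PhysicalEntity in a Record, linking via mainEntity."""
    entry_url = "http://www.uniprot.org/uniprot/" + accession
    return {
        "@context": "http://schema.org",  # assumed context
        "@type": "Record",  # proposed type extending Dataset
        "additionalType": "http://purl.uniprot.org/core/Protein",
        "identifier": accession,  # minimum
        "url": entry_url,
        "distribution": [
            {
                "@type": "DataDownload",
                "name": "FASTA",
                # Assumed URL pattern, for illustration only.
                "contentUrl": entry_url + ".fasta",
            }
        ],
        # All PhysicalEntity/Protein information goes here.
        "mainEntity": entity,
    }


protein = {
    "@type": "PhysicalEntity",
    "additionalType": "http://semanticscience.org/resource/SIO_010043",
    "identifier": "P00519",
}

record = make_record("P00519", protein)
print(json.dumps(record, indent=2))
```

Linking from the Record to the PhysicalEntity through mainEntity matches
the direction preferred for UniProt above; mainEntityOfPage would be the
inverse link.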
Regards,
Received on Friday, 22 September 2017 15:39:56 UTC