- From: Leyla Garcia <ljgarcia@ebi.ac.uk>
- Date: Mon, 23 Oct 2017 16:52:16 +0100
- To: "public-bioschemas@w3.org" <public-bioschemas@w3.org>
- Message-ID: <d0e40ff4-682c-6108-0823-a71c99b559db@ebi.ac.uk>
Hi all,
I attach here a ShEx schema to validate a protein entity. This is what I
want a protein to have:
* preferredLabel: "Protein"
* additionalType: at least one URL
* identifier: at least one Text or URL or PropertyValue
* name: at least one Text
* isContainedIn: at least one URL or BioChemEntity
* additionalProperty: at least one for the transcribed gene
* additionalProperty: zero or more of any other kind
A compliant protein is available in ProteinEntity.ttl. So far, with ShEx
I have managed to require:
* preferredLabel: "Protein"
* additionalType: at least one IRI
* identifier: at least one Text or URL or PropertyValue or string or IRI
* name: at least one Text or string
* isContainedIn: at least one IRI or blank node
* additionalProperty: at least one of transcribed gene or IRI or blank node
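For readability, the schema encoded in link [1] decodes to roughly the following. The indentation is reconstructed and may differ from the attached schema.shex; the my: prefix and shape names are the ones used in that link.

```shex
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX schema: <http://schema.org/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX my: <http://my.example/#>

schema:BioChemEntity {
  schema:preferredLabel ["Protein"] ;
  schema:additionalType IRI+ ;

  (
    schema:identifier xsd:string |
    schema:identifier IRI |
    schema:identifier schema:PropertyValue |
    schema:identifier schema:Text |
    schema:identifier schema:URL
  )+ ;
  (
    schema:name xsd:string |
    schema:name schema:Text
  )+ ;

  (
    schema:isContainedIn IRI |
    schema:isContainedIn BNODE
  )+ ;

  (
    schema:additionalProperty @my:TranscribedFromGene |
    schema:additionalProperty IRI |
    schema:additionalProperty BNODE
  )+
}

my:TranscribedFromGene {
  schema:additionalType IRI+ ;
  schema:name ["gene"] ;
  schema:value @my:Gene
}

my:Gene {
  schema:preferredLabel ["Gene"] ;
  schema:additionalType IRI+ ;
}
```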
Some constraints are still missing and I would appreciate any help with
them. Not sure if they can be expressed in ShEx:
* Any object of isContainedIn should be a BioChemEntity (but if all I
have is an IRI, could/should I add that restriction?)
* At least one additionalProperty for transcribed gene is mandatory, any
other is optional. Something like
(
( schema:additionalProperty @my:TranscribedFromGene)+ |
( schema:additionalProperty IRI |
schema:additionalProperty BNODE)*
)
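One possible way to express both missing constraints is sketched below. It is an untested assumption, not a confirmed solution: my:Protein and my:ContainerEntity are hypothetical shape names introduced only for illustration, while my:TranscribedFromGene is the shape from the schema in [1]. The sketch relies on ShEx allowing the same predicate in two triple constraints with different cardinalities, and on NONLITERAL standing for "IRI or blank node".

```shex
PREFIX schema: <http://schema.org/>
PREFIX my: <http://my.example/#>

# Hypothetical fragment of the protein shape, showing only the two open points.
my:Protein {
  # Every object of isContainedIn must itself conform to my:ContainerEntity.
  schema:isContainedIn @my:ContainerEntity + ;

  # At least one additionalProperty must describe the transcribed gene ...
  schema:additionalProperty @my:TranscribedFromGene + ;
  # ... and any number of others may be arbitrary IRIs or blank nodes.
  schema:additionalProperty NONLITERAL *
}

# Hypothetical shape: the node is typed schema:BioChemEntity.
# EXTRA a tolerates additional rdf:type values on the same node.
my:ContainerEntity EXTRA a {
  a [schema:BioChemEntity]
}
```

Note that the isContainedIn check can only succeed if the triples describing the referenced node are present in the graph being validated. If all that is available is a bare IRI with no description, this constraint would fail, so it may be better left out in that case.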
By the way, what I have so far is valid according to
http://rawgit.com/shexSpec/shex.js/master/doc/shex-simple.html. I just
tried it for one node [1] (sorry, could not make it shorter).
Regards,
[1]
http://rawgit.com/shexSpec/shex.js/master/doc/shex-simple.html?schema=PREFIX
rdf%3A
<http%3A%2F%2Fwww.w3.org%2F1999%2F02%2F22-rdf-syntax-ns%23>%0APREFIX
schema%3A <http%3A%2F%2Fschema.org%2F>%0APREFIX xsd%3A
<http%3A%2F%2Fwww.w3.org%2F2001%2FXMLSchema%23>%0APREFIX my%3A
<http%3A%2F%2Fmy.example%2F%23>%0A%0Aschema%3ABioChemEntity {%0A
schema%3ApreferredLabel ["Protein"] %3B%0A schema%3AadditionalType
IRI%2B %3B%0A%0A (%0A schema%3Aidentifier xsd%3Astring |%0A
schema%3Aidentifier IRI |%0A schema%3Aidentifier schema%3APropertyValue
|%0A schema%3Aidentifier schema%3AText |%0A schema%3Aidentifier
schema%3AURL%0A )%2B %3B%0A (%0A schema%3Aname xsd%3Astring |%0A
schema%3Aname schema%3AText%0A )%2B %3B%0A%0A (%0A
schema%3AisContainedIn IRI |%0A schema%3AisContainedIn BNODE%0A )%2B
%3B%0A%0A (%0A schema%3AadditionalProperty %40my%3ATranscribedFromGene
|%0A schema%3AadditionalProperty IRI |%0A schema%3AadditionalProperty
BNODE%0A )%2B%0A}%0A%0Amy%3ATranscribedFromGene {%0A
schema%3AadditionalType IRI%2B %3B%0A schema%3Aname ["gene"] %3B%0A
schema%3Avalue %40my%3AGene%0A}%0A%0Amy%3AGene {%0A
schema%3ApreferredLabel ["Gene"] %3B%0A schema%3AadditionalType IRI%2B
%3B%0A}&data=%40prefix rdf%3A
<http%3A%2F%2Fwww.w3.org%2F1999%2F02%2F22-rdf-syntax-ns%23>
.%0A%40prefix schema%3A <http%3A%2F%2Fschema.org%2F> .%0A%40prefix
xsd%3A <http%3A%2F%2Fwww.w3.org%2F2001%2FXMLSchema%23>
.%0A%0A<http%3A%2F%2Fwww.uniprot.org%2Funiprot%2FP00519>%0A a
schema%3ABioChemEntity %3B%0A schema%3ApreferredLabel "Protein" %3B%0A
schema%3AadditionalType
<http%3A%2F%2Fsemanticscience.org%2Fresource%2FSIO_010043> %3B%0A%0A
schema%3Aidentifier "P00519" %3B%0A schema%3Aname "ABL1" %3B%0A%0A
schema%3AisContainedIn <http%3A%2F%2Fwww.identifiers.org%2Ftaxon%3A9606>
%3B%0A%0A schema%3AadditionalProperty [%0A a schema%3APropertyValue
%3B%0A schema%3AadditionalType
<http%3A%2F%2Fsemanticscience.org%2Fresource%2FSIO_010081> %3B%0A
schema%3Aname "gene" %3B%0A schema%3Avalue [%0A a
schema%3AStructuredValue%2C schema%3ABioChemEntity %3B%0A
schema%3ApreferredLabel "Gene" %3B%0A schema%3AadditionalType
<http%3A%2F%2Fsemanticscience.org%2Fresource%2FSIO_010035> %3B%0A
schema%3Aidentifier "ABL1" %3B%0A schema%3Aname "ABL1"%0A ]%0A ]%2C [%0A
a schema%3APropertyValue %3B%0A schema%3AadditionalType
<http%3A%2F%2Fsemanticscience.org%2Fresource%2FSIO_000983> %3B%0A
schema%3Aname "disease association" %3B%0A schema%3Avalue [%0A a
schema%3AStructuredValue%2C schema%3AMedicalCondition %3B%0A
schema%3AadditionalType
<http%3A%2F%2Fsemanticscience.org%2Fresource%2FSIO_010299> %3B%0A
schema%3Acode [%0A a schema%3AMedicalCode %3B%0A schema%3Acode "608232"
%3B%0A schema%3AcodingSystem "OMIM"%0A ] %3B%0A schema%3Aname
"Leukemia%2C chronic myeloid (CML)" %3B%0A schema%3AsameAs
<http%3A%2F%2Fwww.uniprot.org%2Fdiseases%2FDI-03735>%0A ]%0A
]%0A.%0A%0A<http%3A%2F%2Fwww.identifiers.org%2Ftaxon%3A9606>%0A a
schema%3ABioChemEntity %3B%0A schema%3Aidentifier "9606" %3B%0A
schema%3Aname "Homo sapiens" %3B%0A schema%3AsameAs
<http%3A%2F%2Fpurl.uniprot.org%2Ftaxonomy%2F9606> %3B%0A schema%3Aurl
<http%3A%2F%2Fpurl.bioontology.org%2Fontology%2FNCBITAXON%2F9606>
.&shape-map=<http%3A%2F%2Fwww.uniprot.org%2Funiprot%2FP00519>%40schema%3ABioChemEntity&interface=human&regexpEngine=threaded-val-nerr
Attachments
- text/plain attachment: schema.shex
- text/plain attachment: ProteinEntity.ttl
Received on Monday, 23 October 2017 15:52:46 UTC