- From: Leyla Garcia <ljgarcia@ebi.ac.uk>
- Date: Mon, 23 Oct 2017 16:52:16 +0100
- To: "public-bioschemas@w3.org" <public-bioschemas@w3.org>
- Message-ID: <d0e40ff4-682c-6108-0823-a71c99b559db@ebi.ac.uk>
Hi all,

I attach here a ShEx schema to validate a protein entity. This is what I want a protein to have:

* preferredLabel: "Protein"
* additionalType: at least one URL
* identifier: at least one Text or URL or PropertyValue
* name: at least one Text
* isContainedIn: at least one URL or BioChemEntity
* additionalProperty: at least one for the transcribed gene
* additionalProperty: zero or more of any other kind

A compliant protein is available in ProteinEntity.ttl.

So far, with ShEx I have managed to require:

* preferredLabel: "Protein"
* additionalType: at least one IRI
* identifier: at least one Text or URL or PropertyValue or string or IRI
* name: at least one Text or string
* isContainedIn: at least one IRI or blank node
* additionalProperty: at least one of transcribed gene or IRI or blank node

Some constraints are still missing and I would appreciate any help with them. I am not sure whether they can be expressed in ShEx:

* Any object of isContainedIn should be a BioChemEntity (but if all I have is an IRI, could/should I add that restriction?)
* At least one additionalProperty for the transcribed gene is mandatory; any other is optional. Something like:

  ( ( schema:additionalProperty @my:TranscribedFromGene)+ | ( schema:additionalProperty IRI | schema:additionalProperty BNODE)* )

By the way, what I have so far is valid according to http://rawgit.com/shexSpec/shex.js/master/doc/shex-simple.html. I have only tried it for one node [1] (sorry, I could not make it shorter).
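For discussion, here is a minimal, untested sketch of how the two missing constraints might be written in ShExC. It relies on ShEx allowing the same predicate in several triple constraints of one shape, with the validator assigning each triple to one of them. The shape names my:ProteinSketch and my:BioChemEntityRef are hypothetical; my:TranscribedFromGene and my:Gene are copied from the schema in [1]. Note that the isContainedIn check can only pass when the rdf:type triple of the referenced node is present in the graph being validated, so a bare IRI with no description would fail.

```shex
PREFIX schema: <http://schema.org/>
PREFIX my: <http://my.example/#>

# Hypothetical shape: only the two open constraints are shown here.
my:ProteinSketch {
  # Every isContainedIn object must be typed as a BioChemEntity.
  schema:isContainedIn @my:BioChemEntityRef + ;

  # At least one additionalProperty must describe the transcribed gene ...
  schema:additionalProperty @my:TranscribedFromGene + ;
  # ... and any number of other IRI or blank-node values may follow.
  schema:additionalProperty NONLITERAL *
}

# Hypothetical helper shape. EXTRA a tolerates further rdf:type values
# on the same node (for example schema:StructuredValue).
my:BioChemEntityRef EXTRA a {
  a [schema:BioChemEntity]
}

# Copied from the schema in [1].
my:TranscribedFromGene {
  schema:additionalType IRI+ ;
  schema:name ["gene"] ;
  schema:value @my:Gene
}

my:Gene {
  schema:preferredLabel ["Gene"] ;
  schema:additionalType IRI+
}
```

In this form the two additionalProperty constraints are conjoined with ";" rather than alternated with "|", so the mandatory gene property and the optional remaining properties are counted separately.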
Regards,

[1] http://rawgit.com/shexSpec/shex.js/master/doc/shex-simple.html?schema=PREFIX rdf%3A <http%3A%2F%2Fwww.w3.org%2F1999%2F02%2F22-rdf-syntax-ns%23>%0APREFIX schema%3A <http%3A%2F%2Fschema.org%2F>%0APREFIX xsd%3A <http%3A%2F%2Fwww.w3.org%2F2001%2FXMLSchema%23>%0APREFIX my%3A <http%3A%2F%2Fmy.example%2F%23>%0A%0Aschema%3ABioChemEntity {%0A schema%3ApreferredLabel ["Protein"] %3B%0A schema%3AadditionalType IRI%2B %3B%0A%0A (%0A schema%3Aidentifier xsd%3Astring |%0A schema%3Aidentifier IRI |%0A schema%3Aidentifier schema%3APropertyValue |%0A schema%3Aidentifier schema%3AText |%0A schema%3Aidentifier schema%3AURL%0A )%2B %3B%0A (%0A schema%3Aname xsd%3Astring |%0A schema%3Aname schema%3AText%0A )%2B %3B%0A%0A (%0A schema%3AisContainedIn IRI |%0A schema%3AisContainedIn BNODE%0A )%2B %3B%0A%0A (%0A schema%3AadditionalProperty %40my%3ATranscribedFromGene |%0A schema%3AadditionalProperty IRI |%0A schema%3AadditionalProperty BNODE%0A )%2B%0A}%0A%0Amy%3ATranscribedFromGene {%0A schema%3AadditionalType IRI%2B %3B%0A schema%3Aname ["gene"] %3B%0A schema%3Avalue %40my%3AGene%0A}%0A%0Amy%3AGene {%0A schema%3ApreferredLabel ["Gene"] %3B%0A schema%3AadditionalType IRI%2B %3B%0A}&data=%40prefix rdf%3A <http%3A%2F%2Fwww.w3.org%2F1999%2F02%2F22-rdf-syntax-ns%23> .%0A%40prefix schema%3A <http%3A%2F%2Fschema.org%2F> .%0A%40prefix xsd%3A <http%3A%2F%2Fwww.w3.org%2F2001%2FXMLSchema%23> .%0A%0A<http%3A%2F%2Fwww.uniprot.org%2Funiprot%2FP00519>%0A a schema%3ABioChemEntity %3B%0A schema%3ApreferredLabel "Protein" %3B%0A schema%3AadditionalType <http%3A%2F%2Fsemanticscience.org%2Fresource%2FSIO_010043> %3B%0A%0A schema%3Aidentifier "P00519" %3B%0A schema%3Aname "ABL1" %3B%0A%0A schema%3AisContainedIn <http%3A%2F%2Fwww.identifiers.org%2Ftaxon%3A9606> %3B%0A%0A schema%3AadditionalProperty [%0A a schema%3APropertyValue %3B%0A schema%3AadditionalType <http%3A%2F%2Fsemanticscience.org%2Fresource%2FSIO_010081> %3B%0A schema%3Aname "gene" %3B%0A schema%3Avalue [%0A a schema%3AStructuredValue%2C schema%3ABioChemEntity %3B%0A schema%3ApreferredLabel "Gene" %3B%0A schema%3AadditionalType <http%3A%2F%2Fsemanticscience.org%2Fresource%2FSIO_010035> %3B%0A schema%3Aidentifier "ABL1" %3B%0A schema%3Aname "ABL1"%0A ]%0A ]%2C [%0A a schema%3APropertyValue %3B%0A schema%3AadditionalType <http%3A%2F%2Fsemanticscience.org%2Fresource%2FSIO_000983> %3B%0A schema%3Aname "disease association" %3B%0A schema%3Avalue [%0A a schema%3AStructuredValue%2C schema%3AMedicalCondition %3B%0A schema%3AadditionalType <http%3A%2F%2Fsemanticscience.org%2Fresource%2FSIO_010299> %3B%0A schema%3Acode [%0A a schema%3AMedicalCode %3B%0A schema%3Acode "608232" %3B%0A schema%3AcodingSystem "OMIM"%0A ] %3B%0A schema%3Aname "Leukemia%2C chronic myeloid (CML)" %3B%0A schema%3AsameAs <http%3A%2F%2Fwww.uniprot.org%2Fdiseases%2FDI-03735>%0A ]%0A ]%0A.%0A%0A<http%3A%2F%2Fwww.identifiers.org%2Ftaxon%3A9606>%0A a schema%3ABioChemEntity %3B%0A schema%3Aidentifier "9606" %3B%0A schema%3Aname "Homo sapiens" %3B%0A schema%3AsameAs <http%3A%2F%2Fpurl.uniprot.org%2Ftaxonomy%2F9606> %3B%0A schema%3Aurl <http%3A%2F%2Fpurl.bioontology.org%2Fontology%2FNCBITAXON%2F9606> .&shape-map=<http%3A%2F%2Fwww.uniprot.org%2Funiprot%2FP00519>%40schema%3ABioChemEntity&interface=human&regexpEngine=threaded-val-nerr
Attachments
- text/plain attachment: schema.shex
- text/plain attachment: ProteinEntity.ttl
Received on Monday, 23 October 2017 15:52:46 UTC