- From: Michel Dumontier <michel.dumontier@gmail.com>
- Date: Tue, 4 Jun 2013 15:40:16 +0200
- To: Alasdair J G Gray <Alasdair.Gray@manchester.ac.uk>
- Cc: Jerven Bolleman <me@jerven.eu>, Joachim Baran <joachim.baran@gmail.com>, N Juty <juty@ebi.ac.uk>, "public-semweb-lifesci@w3.org" <public-semweb-lifesci@w3.org>
- Message-ID: <CALcEXf6u7xSFsa=stCA7+7Wh3K8B=yEJh1Gz+ix7Gxo6P2hf+w@mail.gmail.com>
On Tue, Jun 4, 2013 at 3:39 PM, Alasdair J G Gray <Alasdair.Gray@manchester.ac.uk> wrote:

>
> On 4 Jun 2013, at 14:20, Jerven Bolleman <me@jerven.eu> wrote:
>
> On Tue, Jun 4, 2013 at 3:08 PM, Michel Dumontier <michel.dumontier@gmail.com> wrote:
>
>> The point here is simple. if you provide a URI uniprot:1.2.3.4, i would
>> like to know that this is incorrect.
>>
>> m.
>>
> Yes, but the model needs to be good enough to tell you that. The model
> discussed yesterday with
> data item identifier regex pattern is not strong enough to do so. The void
> uriRegexPattern might be good enough.
>
> :x a void:Dataset ;
>     void:uriRegexPattern "ec:[1-6].\d.\d.\d" , "uniprot:P\d{5}" .
>
> But I am thinking that we can have stronger validation patterns if we
> think a bit more.
> e.g. can we think of something that can prevent.
>
> uniprot:P12345 a up:Sequence .
> sequence:P12345 a up:Protein .
>
> Of course, the prefixes here are syntactic shortcuts for the full URI, so
> you would be able to distinguish these if you encode the complete URI and
> not just the identifier part.
> (Not meaning to sound like a broken record ;) )
>
> And is a dataset description the right place for this validation data?
>
> So for the Bio2RDF/Identifiers.org use case yes. However, it may be most
> appropriate for a service such as Identifiers.org to extend dataset
> descriptions provided by publishers with this sort of information.
>

if original data publishers provided this information, we wouldn't have to curate it. (sorry nick!)

m.

> Alasdair
>
> Regards,
> Jerven
>
>>
>> On Tue, Jun 4, 2013 at 3:01 PM, Joachim Baran <joachim.baran@gmail.com> wrote:
>>
>>>
>>> On 4 June 2013 08:56, Jerven Bolleman <me@jerven.eu> wrote:
>>>
>>>> uniprot:P12345 a up:Protein ;
>>>>
>>>>> up:enzyme ec:1.2.3.4 .
>>>>>> ec:1.2.3.4 a up:Enzyme .
>>>>>> What if my data is
>>>>>>
>>>>>
>>>> uniprot:1.2.3.4 a up:Protein ;
>>>> up:enzyme ec:P12345 .
>>>> ec:P12345 a up:Enzyme .
>>>>
>>> I do not understand the new example. You just switched the identifiers?
>>>
>>>> What if I don't have a regular expression for one of the sets?
>>>>
>>> I suggest it implies the set of all URIs, i.e. the regexp: .*
>>>
>>>> Or two very similar ones?
>>>> e.g. mgi and pubmed?
>>>>
>>> Take the union regexp.
>>>
>>> Joachim
>>>
>>
>> --
>> Michel Dumontier
>> Associate Professor of Bioinformatics, Carleton University
>> Chair, W3C Semantic Web for Health Care and the Life Sciences Interest
>> Group
>> http://dumontierlab.com
>>
>
> --
> Jerven Bolleman
> me@jerven.eu
>
>
> Dr Alasdair J G Gray
> Research Associate
> Alasdair.Gray@manchester.ac.uk
> +44 161 275 0145
>
> http://www.cs.man.ac.uk/~graya/
>
> Please consider the environment before printing this email.
>

--
Michel Dumontier
Associate Professor of Bioinformatics, Carleton University
Chair, W3C Semantic Web for Health Care and the Life Sciences Interest Group
http://dumontierlab.com
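
For illustration only, the pattern check discussed in the thread could be sketched roughly as follows in Python. This is a minimal sketch, not anything proposed in the thread: the `PATTERNS` table and the `matching_datasets` and `is_valid` helpers are hypothetical names, and the dots in the EC pattern are escaped here (unlike in the quoted example) on the assumption that only a literal "." was intended.

```python
import re

# Hypothetical per-dataset patterns, modelled on the void:uriRegexPattern
# example quoted in the thread. Assumption: dots are escaped so that "."
# matches only a literal dot.
PATTERNS = {
    "ec": re.compile(r"ec:[1-6]\.\d\.\d\.\d"),
    "uniprot": re.compile(r"uniprot:P\d{5}"),
}


def matching_datasets(identifier):
    """Return the sorted names of datasets whose pattern matches the whole identifier."""
    return sorted(
        name for name, pattern in PATTERNS.items() if pattern.fullmatch(identifier)
    )


def is_valid(identifier):
    """An identifier counts as valid if at least one dataset pattern matches it."""
    return bool(matching_datasets(identifier))
```

Under these assumptions, `is_valid("uniprot:1.2.3.4")` and `is_valid("ec:P12345")` are both rejected, which covers the swapped-identifier case. A check like this says nothing about the asserted type, so it would not flag `uniprot:P12345 a up:Sequence`.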
Received on Tuesday, 4 June 2013 13:41:08 UTC