Re: Worry to many Datasets => spam Was [Re: {Disarmed} Re: DataRecord and Dataset Search]

Thank you, L.J. I opened an issue. Feel free to modify it if necessary
so that it makes more sense to Bioschemas.


~~~~
Karen Yook

Curator
WormBase Caltech
Tel: 415.306.4150
e-mail: kyook@caltech.edu
e-mail: karen@wormbase.org
skype name: wbkaren

On Mon, Oct 1, 2018 at 1:50 AM LJ Garcia Castro <ljgarcia@ebi.ac.uk> wrote:
>
> Dear Karen,
>
> Currently, Bioschemas recommends BioChemEntity as the range for
> mainEntity. If StructuredValue works better for you, could you please
> open an issue at https://github.com/BioSchemas/specifications/issues ?
> Someone from Bioschemas will pick it up so this change is evaluated and
> later incorporated into the specifications. Please mention Alasdair and
> me in the issue so we get notified. GitHub issues help us keep track
> of actions that come out of discussions on the mailing list.
>
> Kind regards,
>
>
> On 28/09/2018 19:57, Karen Yook wrote:
> > Hi Jerven,
> >
> > Thank you, Jerven, for suggesting the
> > "subtype to schema:StructuredValue e.g. bioschema:BioChemConcept"
> > and for raising the potential problem with the sole use of 'Dataset'
> > in the current proposed tag set. As you point out for UniProt, that is
> > one of the problems that would also affect Alliance pages. In addition,
> > 'Dataset' is just not a good description of our pages; rather, they are
> > living compilations of curation created from many 'datasets', which
> > range from large-scale datasets to single-bioentity studies.
> >
> > Alasdair, if you need more specific examples of how 'Dataset' would be
> > less than ideal for us, let me know.  However, for now, I am happy
> > with what Jerven has proposed.  I will discuss this internally with
> > the Alliance to see if there are more specific things we need to
> > address.
> >
> > Best,
> > Karen
> >
> >
> >
> >
> > On Fri, Sep 28, 2018 at 1:37 AM Jerven Bolleman
> > <jerven.bolleman@sib.swiss> wrote:
> >> Hi Alasdair, All,
> >>
> >> Now that Google Dataset Search exists, I have a new worry about
> >> overusing Dataset.
> >>
> >> Take www.uniprot.org as an example. It has a bit more than a billion
> >> webpages. Marking them all up with Dataset for what was a DataRecord
> >> before would mean we would have a bit over 3.5 billion Datasets.
> >> Google has no problem with dealing with the volume, but I am worried
> >> that their antispam logic/relevance would drown out the 7 or so Datasets
> >> that I would like to see highly ranked in their toolbox search.
> >>
> >> Considering that most of this work is SEO related, I would vote to mark
> >> up just 1 page with DataCatalog/Dataset on www.uniprot.org and not on
> >> the other pages.
> >>
> >> A more specific concept would be quite nice. May I suggest using a
> >> subtype to schema:StructuredValue e.g. bioschema:BioChemConcept.
> >> For example the schema:mainEntity on
> >> "https://wormbase.org/species/c_elegans/gene/WBGene00012939" would be of
> >> type schema:StructuredValue.
> >>
> >> In (hand-typed) JSON-LD, roughly this:
> >>
> >> {
> >>     "@context" : "http://schema.org",
> >>     "@id" : "https://wormbase.org/species/c_elegans/gene/WBGene00012939",
> >>     "@type" : "WebPage",
> >>     "identifier" : "WBGene00012939",
> >>     "mainEntity" : {
> >>         "@type" : "StructuredValue",
> >>         "name" : "subs-4",
> >>         "hasPart" : {
> >>             "@type" : "PropertyValue",
> >>             "propertyID" : "Sequence",
> >>             "value" : "Y47D3B.1"
> >>         }
> >>     }
> >> }
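
A minimal sketch of how the markup in the example above could be
generated per gene page, assuming Python and only the standard library.
The function names gene_page_jsonld and as_script_tag are made up for
illustration and are not part of any Bioschemas or WormBase tooling. The
sketch spells the page type "WebPage", which is the capitalisation
schema.org uses, and it copies hasPart from the example without checking
whether schema.org permits that property on StructuredValue.

```python
import json


def gene_page_jsonld(page_url, gene_id, gene_name, sequence_name):
    """Build JSON-LD for a gene page whose mainEntity is a StructuredValue."""
    return {
        "@context": "http://schema.org",
        "@id": page_url,
        "@type": "WebPage",
        "identifier": gene_id,
        "mainEntity": {
            "@type": "StructuredValue",
            "name": gene_name,
            # hasPart is taken from the hand-typed example; it is part of
            # the proposal, not confirmed schema.org usage.
            "hasPart": {
                "@type": "PropertyValue",
                "propertyID": "Sequence",
                # Strip stray whitespace so the identifier stays clean.
                "value": sequence_name.strip(),
            },
        },
    }


def as_script_tag(doc):
    """Wrap a JSON-LD document in the script element used to embed it in HTML."""
    body = json.dumps(doc, indent=2)
    return '<script type="application/ld+json">\n' + body + "\n</script>"


markup = gene_page_jsonld(
    "https://wormbase.org/species/c_elegans/gene/WBGene00012939",
    "WBGene00012939",
    "subs-4",
    "Y47D3B.1 ",
)
print(as_script_tag(markup))
```

Only the page itself is typed here, and nothing is marked up as a
Dataset, which is the point of the suggestion.
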
> >>
> >>
> >> Regards,
> >> Jerven
> >>
> >>
> >> On 09/28/2018 09:37 AM, Gray, Alasdair J G wrote:
> >>> Hi Karen,
> >>>
> >>>> On 27 Sep 2018, at 22:38, Karen Yook <karen@wormbase.org
> >>>> <mailto:karen@wormbase.org>> wrote:
> >>>>
> >>>> I just need to weigh in here as a voice in the Alliance of Genome
> >>>> Resources before anything gets finalized with respect to DataRecord
> >>>> or DataSet. While we are not tied to 'DataRecord' per se, we will
> >>>> need something other than just 'DataSet' to tag our pages.
> >>> Can you elaborate on what you mean by, “we will need something other
> >>> than just ‘DataSet’ to tag our pages”?
> >>>
> >>>> We also believe that specific distinctions via sub-types seem to be
> >>>> the way preferred by both schemas.org <http://schemas.org/> and
> >>>> Google. We will try to come up with a more specific proposal by or
> >>>> at the Biohackathon in Paris in a couple of weeks.
> >>> We would like to get these issues resolved before the hackathon so that
> >>> we can have stable core profiles for use in marking up resources.
> >>>
> >>> Thanks
> >>>
> >>> Alasdair
> >>>
> >>> --
> >>> Alasdair J G Gray
> >>> Associate Professor in Computer Science,
> >>> School of Mathematical and Computer Sciences
> >>> Heriot-Watt University, Edinburgh, UK.
> >>>
> >>> Email: A.J.G.Gray@hw.ac.uk <mailto:A.J.G.Gray@hw.ac.uk>
> >>> Web: http://www.macs.hw.ac.uk/~ajg33
> >>> ORCID: http://orcid.org/0000-0002-5711-4872
> >>> Office: Earl Mountbatten Building 1.39
> >>> Twitter: @gray_alasdair
> >>>
> >>
>

Received on Monday, 1 October 2018 15:54:12 UTC