- From: Justin Clark-Casey <justinccdev@gmail.com>
- Date: Wed, 24 May 2017 09:55:53 +0100
- To: "Gray, Alasdair J G" <A.J.G.Gray@hw.ac.uk>
- Cc: "public-bioschemas@w3.org" <public-bioschemas@w3.org>, "Rafael C. Jimenez" <rafael.jimenez@elixir-europe.org>
- Message-ID: <CAME9NR_7S+9pnOB+decyCQOOubrVkVxiPo0DEDCPv2oxfjf_WQ@mail.gmail.com>
Sure, I'd be happy to.

On 24 May 2017 at 09:33, Gray, Alasdair J G <A.J.G.Gray@hw.ac.uk> wrote:

> Hi Justin
>
> Thanks for doing this and pushing forward everyone's understanding.
>
> I know this is really short notice but could you present the highlights
> of your work this afternoon?
>
> Alasdair J G Gray
> Fellow of the Higher Education Academy
> Assistant Professor in Computer Science
> Heriot-Watt University, Edinburgh
>
> www.macs.hw.ac.uk/~ajg33
>
> ------------------------------
> *From:* Justin Clark-Casey <jc955@cam.ac.uk>
> *Sent:* Tuesday, May 16, 2017 3:54:15 PM
> *To:* public-bioschemas@w3.org
> *Subject:* Very rough prototype implementation of DataCatalog/Dataset
> schema.org markup in InterMine
>
> Hi all. In advance of the Bioschemas meeting next week, I've hacked up a
> very rough implementation of schema.org markup in InterMine [1].
> Specifically, this is in an installation of InterMine called Synbiomine
> [2], a data warehouse for synthetic biology that I've been working on.
> This compiles information from many sources (EBI, NCBI, etc.) into
> integrated biological object reports (genes, proteins, parts, etc.).
>
> In lieu of 'proper' Bioschemas structures, I've put in DataCatalog and
> Dataset. In fact, I'm abusing Dataset to represent integrated objects
> (e.g. protein Q816S6_BACCR) but I wanted to experiment with linking
> structures (in this case DataCatalog and Dataset). The front page embeds
> the DataCatalog and individual report pages (e.g. [3]) embed Dataset.
> You can see the Google Structured Data Testing Tool (GSDTT) analysis of
> the front page at [4] and of a particular report page at [5].
>
> My top 5 immediate observations:
>
> * Embedding JSON-LD itself is not hard. More challenging is interpreting
> which schema.org properties to use and how to use them (e.g.
> CreativeWork.about or Thing.description?).
>
> * Being able to link DataCatalog and Dataset (via dataset and
> includedInDataCatalog attributes) feels like a big win to embed
> standardized structure in a website. In my case, however, I have 2m+
> 'datasets' and this may cause issues embedding in a single DataCatalog
> structure (in my implementation I've artificially limited this to 500).
> This may be due to my abuse of Dataset but the same problem could crop
> up in other contexts.
>
> * Also in linking DataCatalog and Dataset, I am just embedding the
> Dataset url in the DataCatalog, for instance, and assuming software will
> navigate to the Dataset and extract more information from that page.
>
> * The GSDTT is essential for checking the markup and having some
> implementation for Bioschemas specifications will be very useful.
>
> * The GSDTT for some reason does not show multiple entries for the same
> property (e.g. shows only one citation in [5] even though there are
> many). I presume this is just a GSDTT limitation.
>
> Overall, imo, it feels really nice to embed structured bio information
> directly in the website and this could be really valuable if all the
> markup is consistent. Tooling here like GSDTT may be a big help.
>
> [1] http://intermine.org/
> [2] http://beta.synbiomine.org/synbiomine/begin.do
> [3] http://beta.synbiomine.org/synbiomine/report.do?id=112968868
> [4] https://search.google.com/structured-data/testing-tool#url=http%3A%2F%2Fbeta.synbiomine.org%2Fsynbiomine%2Fbegin.do
> [5] https://search.google.com/structured-data/testing-tool#url=http%3A%2F%2Fbeta.synbiomine.org%2Fsynbiomine%2Freport.do%3Fid%3D112968868
>
> Regards,
>
> --
> Justin Clark-Casey, Synbiomine/InterMine Developer
> http://synbiomine.org
> http://twitter.com/justincc
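
For readers following the quoted message, the DataCatalog/Dataset linking it describes might look roughly like the sketch below. This is not the InterMine implementation: the base URL, function names and catalog name are hypothetical, and only the schema.org terms (DataCatalog, Dataset, dataset, includedInDataCatalog, citation) and the cap of 500 come from the message. The catalog embeds only each Dataset's url, on the assumption that a consumer will follow it to the report page for the full record.

```python
import json

# Hypothetical base URL, for illustration only (not a real mine).
BASE_URL = "http://example.org/mine"
CATALOG_URL = BASE_URL + "/begin.do"
MAX_DATASETS = 500  # the artificial cap mentioned in the message


def dataset_url(report_id):
    """Build the (assumed) report page URL for one integrated object."""
    return "{}/report.do?id={}".format(BASE_URL, report_id)


def build_catalog(report_ids, limit=MAX_DATASETS):
    """DataCatalog for the front page, listing at most `limit` datasets by URL."""
    return {
        "@context": "http://schema.org",
        "@type": "DataCatalog",
        "name": "Example mine",  # placeholder name
        "url": CATALOG_URL,
        "dataset": [
            {"@type": "Dataset", "url": dataset_url(rid)}
            for rid in report_ids[:limit]
        ],
    }


def build_dataset(report_id, name, citations):
    """Dataset for a report page, linked back to the catalog."""
    return {
        "@context": "http://schema.org",
        "@type": "Dataset",
        "name": name,
        "url": dataset_url(report_id),
        "includedInDataCatalog": {"@type": "DataCatalog", "url": CATALOG_URL},
        # A JSON array carries several values for the same property.
        "citation": list(citations),
    }


def to_script_tag(doc):
    """Wrap a JSON-LD document in the script element used for embedding."""
    # Escape "</" so a string value cannot close the script element early;
    # "\/" is a valid JSON escape for "/".
    body = json.dumps(doc, indent=2).replace("</", "<\\/")
    return '<script type="application/ld+json">\n' + body + "\n</script>"


if __name__ == "__main__":
    print(to_script_tag(build_dataset(1, "Q816S6_BACCR", ["ref A", "ref B"])))
```

The front page would embed the result of build_catalog and each report page the result of build_dataset, so the two structures point at each other by URL.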
Received on Wednesday, 24 May 2017 08:57:46 UTC