- From: Lin MD, Simon <LINMD.SIMON@mcrf.mfldclin.edu>
- Date: Thu, 31 May 2012 19:30:29 +0000
- To: Matthias Samwald <matthias.samwald@meduniwien.ac.at>, Aaron Brown <abbrown@google.com>, "public-semweb-lifesci@w3.org" <public-semweb-lifesci@w3.org>
- CC: Dan Brickley <danbri@danbri.org>, Konstantin Pentchev <konstantin.pentchev@ontotext.com>, Allan Hanbury <hanbury@ifs.tuwien.ac.at>
- Message-ID: <A5BD550FC766564987DA2A59779890300D4FD2B6@MCL-EXMB02.mfldclin.org>
That is great, Matthias!

I am not sure whether the following has already been done. If not, it would be of great interest to make it available in RDF. This FDA list will link drugs to genes of interest in pharmacogenomics:

http://www.fda.gov/Drugs/ScienceResearch/ResearchAreas/Pharmacogenetics/ucm083378.htm

Best regards,
Simon

==================================================
Simon Lin, MD
Director, Biomedical Informatics Research Center
Marshfield Clinic Research Foundation
1000 N Oak Ave, Marshfield, WI 54449
Office 715-221-7299
Lin.Simon@mcrf.mfldclin.edu
www.marshfieldclinic.org/birc

For scheduling assistance, please contact Crystal Gumz, Administrative Secretary
gumz.crystal@mcrf.mfldclin.edu
715-221-6403

From: Matthias Samwald [mailto:matthias.samwald@meduniwien.ac.at]
Sent: Thursday, May 31, 2012 9:28 AM
To: Aaron Brown; public-semweb-lifesci@w3.org
Cc: Dan Brickley; Konstantin Pentchev; Allan Hanbury
Subject: Re: RDF Schema / LODD mapping -- Re: New proposal: health & medical extensions to schema.org

Dear all,

I have finished a first conversion of some key datasets from the "Linked Open Drug Data" collection to schema.org with medical extensions. So far, I have converted the datasets from Drugbank [1] and Dailymed [2]. I can work on mapping other datasets such as RxNorm, DBpedia and ClinicalTrials.gov as well, if this pilot leads to promising results.

The RDF of the conversion is available at
http://samwald.info/res/medical-schema-org/pharmaceutical-information-according-to-schema-org.ttl

Beware that this file is quite large (33 MB). I have published it uncompressed so that it is more transparent to web crawlers.

How this was done

To create this file, I extracted the RDF triples from the RDFa file provided by Aaron (outcome available at [3]). I had to fix a minor bug to make that happen correctly (there was whitespace in some of the labels and URIs).
Then I manually created a mapping file between the schema.org extensions and the entities and properties used in the Drugbank and Dailymed datasets. This mapping is based on RDF Schema and Simple SPARQL Rules, and is available at [4] -- please have a look. I loaded all these files together with the LODD datasets into a triplestore with RDFS reasoning and executed the SPARQL Rules, yielding the final pharmaceutical-information-according-to-schema-org.ttl file.

Where to go from here

It would be great to evaluate how Google and other search engines (such as Khresmoi [5]) can use structured information based on schema.org to improve access to medical / pharmaceutical information. To do this, we could set up web sites based on these datasets with embedded Microdata (or RDFa lite?) statements. Then we could compare the usability of schema.org-aware search engines with standard search engines (e.g., a normal Google Custom Search Engine). I think this could provide a very impressive example of what schema.org markup enables (and probably a nice scientific article).

@ Aaron: What do you suggest as the next steps for setting up such a test scenario? Are there any prototypical search tools from Google on the horizon that we could use?

@ Aaron: If you want to get some more detailed feedback from me about the schema.org extensions and some modelling choices, we should probably get in contact via Skype.

@ All: Do you have any suggestions for automatically publishing RDF datasets as HTML-with-Microdata or HTML-with-RDFa? Or do we need to write a script from scratch?
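As one possible starting point for the "script from scratch" option, here is a minimal Python sketch that turns triples into HTML-with-Microdata using only the standard library. It is an illustration under stated assumptions, not the method used for the conversion above: the function name triples_to_microdata and the example.org URIs are hypothetical, triples are assumed to be plain (subject, predicate, object) string tuples that already use schema.org predicates, and any object that looks like an HTTP URI is treated as a link rather than a literal.

```python
from collections import defaultdict
from html import escape

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
SCHEMA = "http://schema.org/"


def triples_to_microdata(triples):
    """Render (subject, predicate, object) string triples as HTML Microdata.

    Simplifications (assumptions of this sketch): only predicates in the
    schema.org namespace are emitted, only one rdf:type per subject is kept
    (the last one seen), and subjects without a type are skipped.
    """
    types = {}
    props = defaultdict(list)
    for s, p, o in triples:
        if p == RDF_TYPE:
            types[s] = o
        elif p.startswith(SCHEMA):
            # Microdata property names are the local part of the schema.org URI.
            props[s].append((p[len(SCHEMA):], o))

    blocks = []
    for s in sorted(types):
        lines = ['<div itemscope itemtype="%s" itemid="%s">'
                 % (escape(types[s]), escape(s))]
        for name, value in props[s]:
            if value.startswith(("http://", "https://")):
                # URI-valued property: expose it as a link element.
                lines.append('  <link itemprop="%s" href="%s">'
                             % (escape(name), escape(value)))
            else:
                # Literal-valued property: expose it as visible text.
                lines.append('  <span itemprop="%s">%s</span>'
                             % (escape(name), escape(value)))
        lines.append("</div>")
        blocks.append("\n".join(lines))
    return "\n".join(blocks)


if __name__ == "__main__":
    # Hypothetical example data; the subject URI is not from any real dataset.
    example = [
        ("http://example.org/drug/1", RDF_TYPE, SCHEMA + "Drug"),
        ("http://example.org/drug/1", SCHEMA + "name", "Aspirin"),
        ("http://example.org/drug/1", SCHEMA + "url", "http://example.org/aspirin"),
    ]
    print(triples_to_microdata(example))
```

A real script would read the triples from the Turtle file with an RDF library and wrap the output in a full HTML page per entity; the sketch only shows the mapping from triples to itemscope/itemtype/itemprop attributes.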
[1] http://drugbank.ca/
[2] http://dailymed.nlm.nih.gov/dailymed/
[3] http://samwald.info/res/medical-schema-org/schema_org_rdfa.ttl
[4] http://samwald.info/res/medical-schema-org/schema_org_2_LODD_mapping.ttl
[5] http://khresmoi.eu/

Best,
Matthias

From: Matthias Samwald <matthias.samwald@meduniwien.ac.at>
Sent: Monday, May 21, 2012 1:35 PM
To: Aaron Brown <abbrown@google.com>
Cc: Dan Brickley <danbri@danbri.org>; public-semweb-lifesci@w3.org
Subject: RDF Schema / LODD mapping -- Re: New proposal: health & medical extensions to schema.org

Dear Aaron,

I think it might be an interesting exercise to publish some of the "Linked Open Drug Data" [1] datasets as microdata that adheres to the proposed extensions. These datasets were published in RDF format by members of the W3C Health Care and Life Science Interest Group.

Mapping these datasets to your proposed schema.org extensions would be much easier if we had an RDF Schema of those extensions (which is available for the official schema.org via [2] and [3]). Could you make an RDF schema of your extensions available?

[1] http://www.w3.org/wiki/HCLSIG/LODD/Data
[2] http://schema.org/docs/schemaorg.owl
[3] http://schema.rdfs.org/all.ttl

Cheers,
Matthias Samwald

From: Michel Dumontier <michel.dumontier@gmail.com>
Sent: Wednesday, May 16, 2012 4:05 PM
To: w3c semweb hcls <public-semweb-lifesci@w3.org>
Cc: Aaron Brown <abbrown@google.com>; Dan Brickley <danbri@danbri.org>
Subject: New proposal: health & medical extensions to schema.org

Hi all,

Aaron Brown (@google) and others have been working on a health/medical extension to schema.org -> http://schemaorg-medicalext.appspot.com/. It's also linked on the W3 wiki at http://www.w3.org/wiki/WebSchemas/MedicalHealthProposal, along with other proposals - http://www.w3.org/wiki/WebSchemas.
Have a look at the medical/health proposal and tell us what you think - I'd love to hear from those that are active in creating or consuming web page content (SciDisc, atags, Mark Wilkinson's Personal Health Lens, etc).

Reserve Friday June 1 @ 11am (Terminology task force slot) for a special meeting to discuss the proposal, and we'll craft some feedback for the public mailing list at public-vocabs@w3.org.

Cheers!

m.

---------- Forwarded message ----------
From: Dan Brickley <danbri@danbri.org>
Date: Tue, May 15, 2012 at 10:49 AM
Subject: Fwd: New proposal: health & medical extensions to schema.org
To: eric@w3.org, team-hcls-chairs@w3.org, Aaron Brown <abbrown@google.com>
Cc: ivan@w3.org

Eric, HCLS folk, Ivan,

I want to introduce you to Aaron Brown, and pass along his msg below introducing some work on health/medical markup for use in the public Web, part of the schema.org project, which is a collaboration amongst several search engines to improve structured data usage within HTML.

Aaron has been busy with a pretty substantial medical/health vocabulary, and yesterday circulated a first public version for feedback/comments. I wanted to ask your advice on how best we might connect this with the various activities of the HCLS W3C group. The message below is public (see http://lists.w3.org/Archives/Public/public-vocabs/2012May/0057.html ), so we could just pass it along to the public HCLS list http://lists.w3.org/Archives/Public/public-semweb-lifesci/ but if you've any thoughts on how best to interact with HCLS that would be really useful.

The emphasis with the vocabulary Aaron's working on is on in-page HTML markup rather than full/deep ontology engineering, though there are obviously points of connection to such activities.
I'll leave Aaron to discuss the details (see his note below or ask in this thread).

Thanks for any advice,

cheers,

Dan

ps. for a bit more background - The public-vocabs@w3.org list is the main feedback/discussion forum for the schema.org initiative. Within W3C it is the 'Web Schemas' taskforce of the Semantic Web group, which I chair. I also btw have an @google affiliation for my schema.org work, though I don't formally represent Google at W3C. Basically the Web Schemas group serves as a liaison point between schema.org as an external entity, the W3C community, and other groups producing metadata vocabularies. More details c/o http://www.w3.org/wiki/WebSchemas ...

---------- Forwarded message ----------
From: Aaron Brown <abbrown@google.com>
Date: 14 May 2012 22:56
Subject: New proposal: health & medical extensions to schema.org
To: public-vocabs@w3.org

Hi all,

As I've alluded to before on this list (http://lists.w3.org/Archives/Public/public-vocabs/2012Feb/0053.html), over the past 6 months, a few of us at Google and other institutions have been working on a set of schema.org extensions to cover the health and medical domain. After several internal iterations and a lot of feedback from initial reviewers (including the US NCBI; physicians at Harvard, Stanford, and Duke; the major search engines; and a few health web sites), we think we have a solid draft and would like to open it for public feedback as a step toward incorporating it into schema.org.

The proposed health/medical schema can be found at http://schemaorg-medicalext.appspot.com/ which includes an introduction as well as a snapshot of the type hierarchy and several markup examples. It's also linked on the w3 wiki at http://www.w3.org/wiki/WebSchemas/MedicalHealthProposal.
As you'll see, this is a substantial piece of work, so we'd welcome feedback and detailed review comments on the specifics (please follow up to this email).

For those interested in more background on the approach: our goal is to create schema that webmasters and content publishers can use to mark up health and medical content on the web, with a particular focus on markup that will help patients, physicians, and generally health-interested consumers find relevant health information via search. The scope of coverage for the schema is broad, and is intended to cover both consumer- and professionally-targeted health and medical web content (of course, any particular piece of online health/medical content is likely to use only a subset of the schema). We've worked with physicians, consumer web sites, and government health organizations to get input into the key topics and properties to model and to refine the schema structure and type/property documentation.

Note that it is explicitly not our goal to replace the many very good and comprehensive medical ontologies, meta-thesauri, or controlled vocabularies that have been created over the years; our focus has been instead on creating complementary, lightweight markup that surfaces the existence of and relationships between entities in health/medical web pages. When other ontologies and/or controlled vocabularies are available, our proposed schema can link to and take advantage of them, e.g. via the code property of MedicalEntity. It is also not an initial goal to support automated reasoning, medical records coding, or genomic tagging, as these would require substantially more detailed (and hence high barrier-to-entry) modeling and markup; they could be considered for future extensions.

We look forward to your feedback!

Thanks,
Aaron Brown (Google)

--
Aaron Brown | Senior Product Manager | Google, Inc.
| New York, NY
Received on Thursday, 31 May 2012 19:31:40 UTC