- From: Matthias Samwald <matthias.samwald@meduniwien.ac.at>
- Date: Thu, 31 May 2012 16:27:44 +0200
- To: "Aaron Brown" <abbrown@google.com>, <public-semweb-lifesci@w3.org>
- Cc: "Dan Brickley" <danbri@danbri.org>, "Konstantin Pentchev" <konstantin.pentchev@ontotext.com>, "Allan Hanbury" <hanbury@ifs.tuwien.ac.at>
- Message-ID: <74B09E60986A412588C4323ADE18602B@zetsu>
Dear all,

I have finished a first conversion of some key datasets from the "Linked Open Drug Data" collection to schema.org with medical extensions. At the moment, I converted the datasets from Drugbank [1] and Dailymed [2]. I can work on mapping other datasets such as RxNorm, DBpedia and ClinicalTrials.gov as well, if this pilot leads to promising results.

The RDF of the conversion is available at
http://samwald.info/res/medical-schema-org/pharmaceutical-information-according-to-schema-org.ttl

Beware that this file is quite large (33 MB). I have published it uncompressed so that it is more transparent to web crawlers.

How this was done

To create this file, I extracted the RDF triples from the RDFa file provided by Aaron (outcome available at [3]). I had to fix a minor bug to make that happen correctly (there was whitespace in some of the labels and URIs). Then I manually created a mapping file between the schema.org extensions and the entities and properties used in the Drugbank and Dailymed datasets. This mapping is based on RDF Schema and Simple SPARQL Rules, and is available at [4] -- please have a look. I loaded all these files together with the LODD datasets into a triplestore with RDFS reasoning and executed the SPARQL Rules, yielding the final pharmaceutical-information-according-to-schema-org.ttl file.

Where to go from here

It would be great to evaluate how Google and other search engines (such as Khresmoi [5]) can use structured information based on schema.org to improve access to medical / pharmaceutical information. To do this, we could set up web sites based on these datasets with embedded Microdata (or RDFa lite?) statements. Then we could compare the usability of schema.org-aware search engines with standard search engines (e.g., a normal Google Custom Search Engine). I think this could provide a very impressive example of what schema.org markup enables (and probably a nice scientific article).
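As a rough illustration of the two ideas above (mapping source vocabularies onto schema.org, then publishing the result as pages with embedded Microdata), here is a minimal Python sketch. It is not the triplestore and SPARQL Rules setup described in this message: it applies only a single pass of rdfs:subClassOf and rdfs:subPropertyOf inference over in-memory triples, and it treats every property value as plain text. The `http://example.org/lodd/` namespace, the `drug-1` resource, the `genericName` property and both function names are hypothetical and do not come from the actual mapping file.

```python
from html import escape

SCHEMA = "http://schema.org/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
SUBCLASS = "http://www.w3.org/2000/01/rdf-schema#subClassOf"
SUBPROP = "http://www.w3.org/2000/01/rdf-schema#subPropertyOf"
SRC = "http://example.org/lodd/"  # hypothetical source namespace


def apply_mapping(data, mapping):
    """Add schema.org statements implied by subClassOf/subPropertyOf axioms.

    Single pass only: chains of axioms (A subClassOf B subClassOf C)
    are not followed, unlike a full RDFS reasoner.
    """
    super_classes, super_props = {}, {}
    for s, p, o in mapping:
        if p == SUBCLASS:
            super_classes.setdefault(s, set()).add(o)
        elif p == SUBPROP:
            super_props.setdefault(s, set()).add(o)

    inferred = set(data)
    for s, p, o in data:
        # Restate each statement with every mapped super-property.
        for sp in super_props.get(p, ()):
            inferred.add((s, sp, o))
        # Restate each type assertion with every mapped super-class.
        if p == RDF_TYPE:
            for sc in super_classes.get(o, ()):
                inferred.add((s, RDF_TYPE, sc))
    return inferred


def to_microdata(subject, triples):
    """Render the schema.org statements about one subject as Microdata.

    Simplification: all values are emitted as text in <span> elements;
    URI-valued properties would need <link> or <a> elements instead.
    Assumes the subject has at least one schema.org type.
    """
    types = sorted(o for s, p, o in triples
                   if s == subject and p == RDF_TYPE and o.startswith(SCHEMA))
    props = sorted((p[len(SCHEMA):], o) for s, p, o in triples
                   if s == subject and p.startswith(SCHEMA))
    lines = ['<div itemscope itemtype="%s" itemid="%s">'
             % (escape(" ".join(types)), escape(subject))]
    for name, value in props:
        lines.append('  <span itemprop="%s">%s</span>'
                     % (escape(name), escape(value)))
    lines.append("</div>")
    return "\n".join(lines)


# Hypothetical source data and mapping axioms, for illustration only.
data = {
    (SRC + "drug-1", RDF_TYPE, SRC + "Drug"),
    (SRC + "drug-1", SRC + "genericName", "Example drug"),
}
mapping = {
    (SRC + "Drug", SUBCLASS, SCHEMA + "Drug"),
    (SRC + "genericName", SUBPROP, SCHEMA + "name"),
}

html = to_microdata(SRC + "drug-1", apply_mapping(data, mapping))
print(html)
```

A real script would read the Turtle files with an RDF library and write one HTML page per resource; the sketch only shows the shape of the transformation.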

@ Aaron: What do you suggest as the next steps for setting up such a test scenario? Are there any prototypical search tools from Google on the horizon that we could use?

@ Aaron: If you want to get some more detailed feedback from me about the schema.org extensions and some modelling choices, we should probably get in contact via Skype.

@ All: Do you have any suggestions for automatically publishing RDF datasets as HTML-with-Microdata or HTML-with-RDFa? Or do we need to write a script from scratch?

[1] http://drugbank.ca/
[2] http://dailymed.nlm.nih.gov/dailymed/
[3] http://samwald.info/res/medical-schema-org/schema_org_rdfa.ttl
[4] http://samwald.info/res/medical-schema-org/schema_org_2_LODD_mapping.ttl
[5] http://khresmoi.eu/

Best,
Matthias

From: Matthias Samwald
Sent: Monday, May 21, 2012 1:35 PM
To: Aaron Brown
Cc: Dan Brickley ; public-semweb-lifesci@w3.org
Subject: RDF Schema / LODD mapping -- Re: New proposal: health & medical extensions to schema.org

Dear Aaron,

I think it might be an interesting exercise to publish some of the "Linked Open Drug Data" [1] datasets as microdata that adheres to the proposed extensions. These datasets were published in RDF format by members of the W3C Health Care and Life Science Interest Group.

Mapping these datasets to your proposed schema.org extensions would be much easier if we had an RDF Schema of those extensions (which is available for the official schema.org via [2] and [3]). Could you make an RDF schema of your extensions available?

[1] http://www.w3.org/wiki/HCLSIG/LODD/Data
[2] http://schema.org/docs/schemaorg.owl
[3] http://schema.rdfs.org/all.ttl

Cheers,
Matthias Samwald

From: Michel Dumontier
Sent: Wednesday, May 16, 2012 4:05 PM
To: w3c semweb hcls
Cc: Aaron Brown ; Dan Brickley
Subject: New proposal: health & medical extensions to schema.org

Hi all,

Aaron Brown (@google) and others have been working on a health/medical extension to schema.org -> http://schemaorg-medicalext.appspot.com/.
It's also linked on the W3 wiki at http://www.w3.org/wiki/WebSchemas/MedicalHealthProposal, along with other proposals - http://www.w3.org/wiki/WebSchemas.

Have a look at the medical/health proposal and tell us what you think - I'd love to hear from those that are active in creating or consuming web page content (SciDisc, atags, Mark Wilkinson's Personal Health Lens, etc).

Reserve Friday June 1 @ 11am (Terminology task force slot) for a special meeting to discuss the proposal and we'll craft some feedback for the public mailing list at public-vocabs@w3.org.

Cheers!

m.

---------- Forwarded message ----------
From: Dan Brickley <danbri@danbri.org>
Date: Tue, May 15, 2012 at 10:49 AM
Subject: Fwd: New proposal: health & medical extensions to schema.org
To: eric@w3.org, team-hcls-chairs@w3.org, Aaron Brown <abbrown@google.com>
Cc: ivan@w3.org

Eric, HCLS folk, Ivan,

I want to introduce you to Aaron Brown, and pass along his msg below introducing some work on health/medical markup for use in the public Web, part of the schema.org project which is a collaboration amongst several search engines to improve structured data usage within HTML.

Aaron has been busy with a pretty substantial medical/health vocabulary, and yesterday circulated a first public version for feedback/comments. I wanted to ask your advice on how best we might connect this with the various activities of the HCLS W3C group. The message below is public (see http://lists.w3.org/Archives/Public/public-vocabs/2012May/0057.html ), so we could just pass it along to the public HCLS list http://lists.w3.org/Archives/Public/public-semweb-lifesci/ but if you've any thoughts on how best to interact with HCLS that would be really useful.

The emphasis with the vocabulary Aaron's working on is on in-page HTML markup rather than full/deep ontology engineering, though there are obviously points of connection to such activities. I'll leave Aaron to discuss the details (see his note below or ask in this thread).

Thanks for any advice,

cheers,

Dan

ps. for a bit more background - The public-vocabs@w3.org list is the main feedback/discussion forum for the schema.org initiative. Within W3C it is the 'Web Schemas' taskforce of the Semantic Web group, which I chair. I also btw have an @google affiliation for my schema.org work, though I don't formally represent Google at W3C. Basically the Web Schemas group serves as a liaison point between schema.org as an external entity, the W3C community, and other groups producing metadata vocabularies. More details c/o http://www.w3.org/wiki/WebSchemas ...

---------- Forwarded message ----------
From: Aaron Brown <abbrown@google.com>
Date: 14 May 2012 22:56
Subject: New proposal: health & medical extensions to schema.org
To: public-vocabs@w3.org

Hi all,

As I’ve alluded to before on this list (http://lists.w3.org/Archives/Public/public-vocabs/2012Feb/0053.html), over the past 6 months, a few of us at Google and other institutions have been working on a set of schema.org extensions to cover the health and medical domain. After several internal iterations and a lot of feedback from initial reviewers (including the US NCBI; physicians at Harvard, Stanford, and Duke; the major search engines; and a few health web sites), we think we have a solid draft and would like to open it for public feedback as a step toward incorporating it into schema.org.

The proposed health/medical schema can be found at http://schemaorg-medicalext.appspot.com/ which includes an introduction as well as a snapshot of the type hierarchy and several markup examples. It's also linked on the w3 wiki at http://www.w3.org/wiki/WebSchemas/MedicalHealthProposal. As you'll see this is a substantial piece of work, so we’d welcome feedback and detailed review comments on the specifics (please follow up to this email).

For those interested in more background on the approach: our goal is to create schema that webmasters and content publishers can use to mark up health and medical content on the web, with a particular focus on markup that will help patients, physicians, and generally health-interested consumers find relevant health information via search. The scope of coverage for the schema is broad, and is intended to cover both consumer- and professionally-targeted health and medical web content (of course, any particular piece of online health/medical content is likely to use only a subset of the schema). We’ve worked with physicians, consumer web sites, and government health organizations to get input into the key topics and properties to model and to refine the schema structure and type/property documentation.

Note that it is explicitly not our goal to replace the many very good and comprehensive medical ontologies, meta-thesauri, or controlled vocabularies that have been created over the years; our focus has been instead on creating complementary, lightweight markup that surfaces the existence of and relationships between entities in health/medical web pages. When other ontologies and/or controlled vocabularies are available, our proposed schema can link to and take advantage of them, e.g. via the code property of MedicalEntity. It is also not an initial goal to support automated reasoning, medical records coding, or genomic tagging, as these would require substantially more detailed (and hence high barrier-to-entry) modeling and markup; they could be considered for future extensions.

We look forward to your feedback!

Thanks,
Aaron Brown (Google)

--
Aaron Brown | Senior Product Manager | Google, Inc. | New York, NY

Received on Thursday, 31 May 2012 14:28:24 UTC