
Invited talk: SINA: Semantic Interpretation of User Queries for Question Answering on Interlinked Data

From: Michel Dumontier <michel.dumontier@gmail.com>
Date: Mon, 20 Jan 2014 05:25:09 -0800
Message-ID: <CALcEXf6h+Z0=k_UC-sSXPttCGOojdYF_sc3m4r=7pwoeWscaBw@mail.gmail.com>
To: w3c semweb hcls <public-semweb-lifesci@w3.org>
Cc: Saeedeh Shekarpour <sa.shekarpour@gmail.com>
Hi everybody,
  Please join us on Tuesday at 11am EDT for a talk by Saeedeh Shekarpour.

Dial-In #:     +1.617.761.6200 (Cambridge, MA)
VoIP address:  sip:zakim@voip.w3.org
Access Code: 4257 ("HCLS")
IRC Channel:   irc.w3.org port 6665 channel #HCLS

*Title:* SINA: Semantic Interpretation of User Queries for Question
Answering on Interlinked Data

*Abstract*: The Data Web contains a wealth of knowledge on a large number
of domains. Question answering over interlinked data sources is
challenging due to two inherent characteristics. First, different datasets
employ heterogeneous schemas, and each one may contain only part of the
answer to a given question. Second, constructing a federated formal query
across different datasets requires exploiting links between the datasets
on both the schema and instance levels. We present a question answering
system that transforms user-supplied queries (i.e., natural language
sentences or keywords) into conjunctive SPARQL queries over a set of
interlinked data sources. The contributions of this work are as follows:
1. A novel approach for determining the most suitable resources for a
user-supplied query from different datasets (disambiguation). We employ a
hidden Markov model whose parameters were bootstrapped with different
distribution functions.
2. A novel method for constructing federated formal queries using the
disambiguated resources and leveraging the linking structure of the
underlying datasets. This approach essentially relies on a combination of
domain and range inference as well as a link traversal method for
constructing a connected graph, which ultimately renders a corresponding
SPARQL query.

The results of our evaluation with three life-science datasets and 25
benchmark queries demonstrate the effectiveness of our approach.
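
For readers unfamiliar with hidden Markov models, the disambiguation idea
in item 1 can be pictured roughly as follows. This is only a minimal
sketch of standard Viterbi decoding, not SINA's actual implementation:
the resource names (ex:...), the keywords, and all probabilities are
made-up assumptions, and the bootstrapping of parameters described in the
abstract is not shown. Hidden states stand for candidate resources, and
observations stand for the keywords of the user query.

```python
def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return (probability, path) of the most likely state sequence."""
    # best[t][s] = (probability of the best path ending in s at step t,
    #               predecessor state on that path)
    best = [{s: (start_p[s] * emit_p[s].get(observations[0], 0.0), None)
             for s in states}]
    for obs in observations[1:]:
        prev = best[-1]
        column = {}
        for s in states:
            # Pick the predecessor r that maximises the path probability.
            p, back = max(
                ((prev[r][0] * trans_p[r].get(s, 0.0), r) for r in states),
                key=lambda pair: pair[0],
            )
            column[s] = (p * emit_p[s].get(obs, 0.0), back)
        best.append(column)

    # Backtrack from the most probable final state.
    last = max(states, key=lambda s: best[-1][s][0])
    prob = best[-1][last][0]
    path = [last]
    for t in range(len(best) - 1, 0, -1):
        path.append(best[t][path[-1]][1])
    path.reverse()
    return prob, path


# Hypothetical candidate resources and invented probabilities.
states = ["ex:sideEffect", "ex:Aspirin_drug", "ex:Aspirin_film"]
start_p = {"ex:sideEffect": 0.4, "ex:Aspirin_drug": 0.3, "ex:Aspirin_film": 0.3}
# Transitions favour resources that are linked in the (imagined) data graph.
trans_p = {
    "ex:sideEffect": {"ex:sideEffect": 0.1, "ex:Aspirin_drug": 0.8, "ex:Aspirin_film": 0.1},
    "ex:Aspirin_drug": {"ex:sideEffect": 0.8, "ex:Aspirin_drug": 0.1, "ex:Aspirin_film": 0.1},
    "ex:Aspirin_film": {"ex:sideEffect": 0.1, "ex:Aspirin_drug": 0.1, "ex:Aspirin_film": 0.8},
}
# Emissions: how well a keyword matches a resource label.
emit_p = {
    "ex:sideEffect": {"side effect": 0.9},
    "ex:Aspirin_drug": {"aspirin": 0.5},
    "ex:Aspirin_film": {"aspirin": 0.5},
}

keywords = ["side effect", "aspirin"]
prob, path = viterbi(keywords, states, start_p, trans_p, emit_p)
print(path, prob)
```

In this toy setup both "aspirin" candidates match the keyword equally
well, so the transition probabilities are what tip the choice towards the
resource that is connected to the side-effect property.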

*Biography*: Saeedeh Shekarpour is a PhD student at the Institute for
Applied Computer Science at the University of Bonn and the AKSW research
group, Institute of Computer Science (IfI), Leipzig University, Leipzig,
Germany, under the supervision of Dr. Soren Auer. She has spent the three
and a half years of her PhD in the field of “Question Answering on
Interlinked Data”. Her research interests include question answering,
semantic search, the Semantic Web, and information retrieval. During her
PhD, she has worked with the AKSW research group [2] (a leading group in
the Semantic Web). In addition to gaining experience in the Semantic Web
field while working with this group, she initiated a project called SINA
(a Semantic Search Engine over Interlinked Data).

Michel Dumontier
Associate Professor of Medicine (Biomedical Informatics), Stanford
Chair, W3C Semantic Web for Health Care and the Life Sciences Interest Group
Received on Monday, 20 January 2014 13:25:57 UTC
