Re: Given a university's name, retrieve URL for university's home page.

The Affiliation field (for the text name) and the domain of the lead author's email address should give you a little "uncertainty" with which to resolve against DBpedia. Their rules are very fussy and do not allow as much "uncertainty" as you would like, but it is a start. The regex to chop up addresses (it is written for URI references in general, so an email address needs a little adapting) is here: http://www.ietf.org/rfc/rfc2396.txt (page 28).
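
As a rough illustration of the email-domain idea, the sketch below pulls the domain out of an address and lists it together with its parent domains as candidates to match against a university's web site. It does not use the RFC 2396 regex; it is plain string splitting, and the function names and the sample address are hypothetical.

```python
def email_domain(address):
    """Return the lower-cased domain part of an email address."""
    local, sep, domain = address.strip().rpartition("@")
    if not sep or not local or not domain:
        raise ValueError("not an email address: %r" % address)
    return domain.lower()


def candidate_domains(address):
    """List the full domain and each shorter parent domain (at least two labels).

    Assumption: no public-suffix list is consulted, so registry-level
    entries such as "ac.uk" can appear and must be filtered by the caller.
    """
    labels = email_domain(address).split(".")
    return [".".join(labels[start:]) for start in range(len(labels) - 1)]


if __name__ == "__main__":
    # Hypothetical address, used only to show the shape of the result.
    print(candidate_domains("a.student@eng.cam.ac.uk"))
```

Each candidate still has to be checked against some other source (for example the homepage recorded in DBpedia) before it is trusted.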
--Gannon

________________________________
 From: Sam Kuper <sam.kuper@uclmail.net>
To: Gannon Dick <gannon_dick@yahoo.com> 
Cc: public-lod <public-lod@w3.org> 
Sent: Monday, May 13, 2013 11:25 PM
Subject: Re: Given a university's name, retrieve URL for university's home page.
 

>>  From: Sam Kuper <sam.kuper@uclmail.net>
>> To: public-lod <public-lod@w3.org>
>> Sent: Monday, May 13, 2013 1:39 PM
>> Subject: Given a university's name, retrieve URL for university's home
>> page.
>>
>> I wish to solve the following problem: given a string that represents
>> one of perhaps several common orthographic representations of a
>> university's name (e.g. "Cambridge University" might be given, instead
>> of "University of Cambridge"), retrieve the URL of that university's
>> home page on the WWW.
>>
>> My first attempt at a solution is a two-step process. It is to query
>> the Wikipedia API in order to obtain, with any luck, the title for the
>> university's article in Wikipedia
>> [and then the]
>> second step is to use that title to submit a SPARQL query to
>> DBpedia in the hope of obtaining the university's website's URL[...]
>>
> On 13/05/2013, Gannon Dick <gannon_dick@yahoo.com> wrote:
> The problem is already solved in fine detail, but the parameter names may be
> a little difficult to relate to LOD usage.
>
> http://www.ncbi.nlm.nih.gov/books/NBK25497/
>
> Good luck :-)
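
The two-step process quoted above (Wikipedia API, then DBpedia) could be sketched roughly as follows. The code only builds the two request URLs and sends nothing. It assumes the MediaWiki "opensearch" action for the title lookup and assumes that DBpedia records the site under foaf:homepage, which may not hold for every university; the function names are hypothetical.

```python
from urllib.parse import urlencode

WIKIPEDIA_API = "https://en.wikipedia.org/w/api.php"
DBPEDIA_SPARQL = "https://dbpedia.org/sparql"


def wikipedia_search_url(name):
    """Step 1: URL asking Wikipedia for the best-matching article title."""
    params = {"action": "opensearch", "search": name, "limit": 1, "format": "json"}
    return WIKIPEDIA_API + "?" + urlencode(params)


def dbpedia_resource(title):
    """Map an article title to a DBpedia resource IRI.

    Assumption: the title needs nothing more than spaces replaced by
    underscores; titles with other special characters need escaping.
    """
    return "http://dbpedia.org/resource/" + title.replace(" ", "_")


def homepage_query(title):
    """Step 2: SPARQL query for the homepage (assumed to be foaf:homepage)."""
    return (
        "PREFIX foaf: <http://xmlns.com/foaf/0.1/>\n"
        "SELECT ?homepage WHERE { <%s> foaf:homepage ?homepage }"
        % dbpedia_resource(title)
    )


def dbpedia_query_url(title):
    """URL that submits the query to the DBpedia endpoint, asking for JSON."""
    params = {"query": homepage_query(title),
              "format": "application/sparql-results+json"}
    return DBPEDIA_SPARQL + "?" + urlencode(params)


if __name__ == "__main__":
    print(wikipedia_search_url("Cambridge University"))
    print(dbpedia_query_url("University of Cambridge"))
```

Whether step 1 returns the intended article for every spelling of a university's name is exactly the uncertain part, so the title should be checked before step 2 is run.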

Hi Gannon, and thanks for the suggestion.

I've now had time to skim through the lists of searchable fields
available for almost all of the Entrez databases, using the list of
databases here:
http://eutils.ncbi.nlm.nih.gov/entrez/eutils/einfo.fcgi and the
"advanced" search page for each one, which - where available - lists
that database's searchable fields.

Unfortunately, I'm still rather unsure how the NCBI represents a
detailed solution to the problem I outlined. Perhaps I'm just being a
bit slow, but are you proposing I use a search on the "Affiliation"
field in Entrez to identify the university, and then cross-reference
this to a URL field in Entrez for that university? If so, then I
haven't yet found a practical way of doing this. In Entrez, an
"Affiliation" seems (forgive the OOP speak, which might not be
entirely appropriate here, but serves my meaning) to be an attribute
of an author/team rather than an object with a "URL" attribute.
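
For concreteness, a search on the "Affiliation" field of the kind described above might be built like this. It is only a sketch: it assumes the E-utilities ESearch endpoint and the PubMed database, the function name is hypothetical, and the search returns record identifiers rather than anything resembling a URL for the university.

```python
from urllib.parse import urlencode

ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def affiliation_search_url(university, db="pubmed", retmax=5):
    """Build an ESearch URL restricted to the Affiliation field.

    Assumption: the chosen database supports an [Affiliation] field tag,
    as PubMed does; other Entrez databases may not.
    """
    term = '"%s"[Affiliation]' % university
    return ESEARCH + "?" + urlencode({"db": db, "term": term, "retmax": retmax})


if __name__ == "__main__":
    print(affiliation_search_url("University of Cambridge"))
```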

If you're certain that the Entrez dataset *can* be used to solve the
problem I outlined, then please could you provide a demo, or at least
a bit more detail concerning the *way* it would be used in such a
solution, e.g. which field(s) of which database(s) would you propose I
query?

Thanks again,

Sam

Received on Tuesday, 14 May 2013 13:48:45 UTC