Re: Fwd: Re: Question about MARCXML to Models transformation

In message <>, Antoine Isaac <> writes
>>> But I'm really wondering why 1 would not be possible for quite
>>> easily identifiable entities like places and persons. With some
>>> basic tools that use existing linked data sources like Geonames, you
>>> could easily get something like a "did you mean
>>>" question (with a better
>>> interface, of course) that a cataloger can answer by yes or no, when
>>> "Paris" is filled in as place of publication.
>> This "added" data is really the equivalent of the coded values in
>> MARC, in my mind. Where possible, systems need to create short-cuts so
>> that catalogers do not have to fill in both the text value and the
>> coded value (most of what is coded in MARC is redundant with data in
>> the textual fields). We need to make it so that catalogers have to do
>> *less* not *more* if we wish to get them on board. That's only fair.
>Good points. Perhaps that could be also something interesting to 
>mention in the report, in recommendations on how to change (if 
>possible) the way library data could be created or processed. So as to 
>make sure that the original work of librarians has maximum impact in a 
>more open environment...

Similar issues arise in a museum context. One aspect of the problem when 
we are trying to convert string data to URLs is that we have a different 
sort of context from the running text that tools such as DBpedia 
Spotlight can annotate using NLP techniques.  However, this should, in 
principle, make life easier, since the data is of a known type.

Following the recent Culture Grid Hack Day [1] I've written a simple CGI 
for place names [2] which will attempt to disambiguate strings such as:

Paris, France

so that:,%20France

returns the XML:

<result q="Paris, France" q1="Paris" q2="France" country="FR" 
=FR&username=demo" hits="5" geonameId="2988507" 
o" hit1="true" hit2="true" 

However, if you just give it "Paris" there is no guarantee it will be 
able to help you. (In this case it does, but it shouldn't!)
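
As a rough illustration of the kind of processing involved (this is not 
the CGI's actual code, which is not shown here), the Python sketch below 
splits a string at the comma into q1 and q2, looks q2 up as a country, 
and builds a country-restricted search query. The function names 
split_place and build_query and the tiny COUNTRY_CODES table are 
hypothetical, and the use of the GeoNames search service with the "demo" 
username is an assumption based on the result fragment above.

```python
from urllib.parse import urlencode

# Hypothetical lookup table; a real tool would need a complete country list.
COUNTRY_CODES = {"france": "FR", "united kingdom": "GB", "united states": "US"}

# Assumed endpoint: the GeoNames search web service.
GEONAMES_SEARCH = "http://api.geonames.org/search"


def split_place(q):
    """Split 'Paris, France' into ('Paris', 'France').

    Returns (q, None) when the string has no comma-separated qualifier.
    """
    q1, sep, q2 = q.partition(",")
    q1 = q1.strip()
    q2 = q2.strip() if sep else ""
    return q1, (q2 or None)


def build_query(q, username="demo"):
    """Build a search URL, restricted by country when q2 is recognised."""
    q1, q2 = split_place(q)
    country = COUNTRY_CODES.get(q2.lower()) if q2 else None

    params = {"q": q1}
    if country:
        # Only narrow the search when the qualifier maps to a known country.
        params["country"] = country
    params["username"] = username

    return {
        "q": q,
        "q1": q1,
        "q2": q2,
        "country": country,
        "url": GEONAMES_SEARCH + "?" + urlencode(params),
    }


if __name__ == "__main__":
    for text in ("Paris, France", "Paris"):
        print(build_query(text))
```

With only "Paris" as input, no country filter can be applied, so the 
search is unrestricted and any match that comes back is a matter of 
ranking rather than real disambiguation.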

Lightweight URL-ifier tools like this (I have in mind one for dates and 
date ranges) may enable cataloguers to include URLs for concepts at 
relatively low cost.



Richard Light

Received on Wednesday, 9 March 2011 13:05:55 UTC