Re: Fwd: Re: Question about MARCXML to Models transformation from Karen Coyle on 2011-03-09 (public-lld@w3.org from March 2011)

From: Karen Coyle <kcoyle@kcoyle.net>
Date: Wed, 09 Mar 2011 06:42:34 -0800
To: Richard Light <richard@light.demon.co.uk>
Cc: Antoine Isaac <aisaac@few.vu.nl>, public-lld <public-lld@w3.org>
Message-ID: <20110309064234.101571g10sooyk0q@kcoyle.net>

Great stuff, Richard, thanks. A few more of these and maybe we can  
develop a very rudimentary demo of what cataloging might look like at  
some time in the future.

kc

Quoting Richard Light <richard@light.demon.co.uk>:

> In message <4D776FDB.2060500@few.vu.nl>, Antoine Isaac  
> <aisaac@few.vu.nl> writes
>>>
>>>> But I'm really wondering why 1 would not be possible for quite
>>>> easily identifiable entities like places and persons. With some
>>>> basic tools that use existing linked data sources like Geonames, you
>>>> could easily get something like a "did you mean
>>>> http://sws.geonames.org/2988507/?" question (with a better
>>>> interface, of course) that a cataloger can answer by yes or no, when
>>>> "Paris" is filled in as place of publication.
>>>
>>> This "added" data is really the equivalent of the coded values in
>>> MARC, in my mind. Where possible, systems need to create short-cuts so
>>> that catalogers do not have to fill in both the text value and the
>>> coded value (most of what is coded in MARC is redundant with data in
>>> the textual fields). We need to make it so that catalogers have to do
>>> *less* not *more* if we wish to get them on board. That's only fair.
>>
>> Good points. Perhaps that could be also something interesting to  
>> mention in the report, in recommendations on how to change (if  
>> possible) the way library data could be created or processed. So as  
>> to make sure that the original work of librarians has maximum  
>> impact in a more open environment...
>
> Similar issues arise in a museum context. One aspect of the problem
> when we are trying to convert string data to URLs is that we have a
> different sort of context from the running text which tools such as
> DBpedia Spotlight can annotate using NLP techniques. However, this
> should, in principle, make life easier, since the data is of a known type.
>
> Following the recent Culture Grid Hack Day [1] I've written a simple  
> CGI for place names [2] which will attempt to disambiguate strings  
> such as:
>
> Paris, France
>
> so that:
>
> http://light.demon.co.uk/scripts/getPlaceURL.exe?q=Paris,%20France
>
> returns the XML:
>
> <result q="Paris, France" q1="Paris" q2="France" country="FR"
> url="http://api.geonames.org/search?style=short&name_equals=Paris&country=FR&username=demo"
> hits="5" geonameId="2988507"
> hierUrl="http://api.geonames.org/hierarchy?geonameId=2988507&username=demo"
> hit1="true" hit2="true"
> certainty="100">http://www.geonames.org/2988507/</result>
>
> However, if you just give it "Paris" there is no guarantee it will  
> be able to help you. (It does, but it shouldn't!)
>
> Lightweight URL-ifier tools like this (I have in mind one for dates  
> and date ranges) may enable cataloguers to include URLs for concepts  
> at relatively low cost.
>
> Richard
>
> [1] http://www.culturegridhackday.org.uk/
> [2] http://light.demon.co.uk/wordpress/?p=54
>
> -- 
> Richard Light
>
>
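
For anyone wanting to try the idea, here is a rough sketch in Python of the first step only: splitting a string such as "Paris, France" into a place name and a qualifier, and building a GeoNames search URL of the shape shown in the result above. It is not Richard's CGI. The query parameters (style, name_equals, country, username) are copied from the url attribute in his example; the function names and the tiny COUNTRY_CODES table are hypothetical and only there for illustration.

```python
from urllib.parse import urlencode

# Hypothetical lookup table for illustration only; a real tool would
# need a complete list of country names and ISO codes.
COUNTRY_CODES = {"france": "FR", "united kingdom": "GB", "united states": "US"}


def split_place_query(q):
    """Split 'Paris, France' into ('Paris', 'France').

    The second element is None when no qualifier follows the comma.
    """
    parts = [p.strip() for p in q.split(",", 1)]
    name = parts[0]
    qualifier = parts[1] if len(parts) > 1 and parts[1] else None
    return name, qualifier


def build_search_url(q, username="demo"):
    """Build a GeoNames search URL shaped like the one in the example result."""
    name, qualifier = split_place_query(q)
    params = {"style": "short", "name_equals": name}
    # Only restrict by country when the qualifier is one we recognise.
    code = COUNTRY_CODES.get(qualifier.lower()) if qualifier else None
    if code:
        params["country"] = code
    params["username"] = username
    return "http://api.geonames.org/search?" + urlencode(params)


if __name__ == "__main__":
    print(build_search_url("Paris, France"))
```

With a bare "Paris" the sketch simply leaves out the country restriction, which is exactly the case where the result cannot be trusted without a cataloguer confirming it.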



-- 
Karen Coyle
kcoyle@kcoyle.net http://kcoyle.net
ph: 1-510-540-7596
m: 1-510-435-8234
skype: kcoylenet

Received on Wednesday, 9 March 2011 14:43:11 UTC