A secret ingredient for Beautiful Soup ...

Brand mentioned "Beautiful Soup" on the LOD Call last Friday.  It seemed like a very practical idea.  Since I can't Python my way out of a paper bag (I do Perl, in moderation), I had a related suggestion ...

A few weeks ago Erik Wilde wrote: "XML is much less than most people think it is or want it to be: it is a syntax for ordered trees, and that's it. end of story. period."  I agree.  Furthermore, I think that, although a lot of effort is expended to purge localisms from datasets, a little context slippage should be a recoverable error.

With about 10k rows (not a very big database) and some metadata magic, it's possible to resolve small geographical labeling errors before they become Big Confusion(tm). The semantic magic part is that you don't have to know anything about the fine structure of a domain (Country) to put a dataset in context.  The database set-up, with a couple of examples, is here:
http://www.rustprivacy.org/2010/spookville.pdf

--Gannon

Received on Monday, 20 September 2010 07:46:24 UTC