- From: Sean B. Palmer <sean@mysterylights.com>
- Date: Thu, 13 Feb 2003 15:34:13 -0000
- To: <www-archive@w3.org>
Disasterously slow. I was going to use this perhaps as a bit of base code for a Busy Spunge/Blogspace/Memesh style application--one of those things that I've been wanting to do for years--but on second thoughts I don't even like the approach. I found that "grep" is many times faster, even though it searches by line and not by file. And the main problem that I'm having at the moment is not with the datamodel, per se, but the interface: I've no idea how to get the sort of GUI that I want at yet keep it portable and so forth. And I'd rather work on a line-mode style interface... http://www.w3.org/DesignIssues/Conversations ...since we've already good information management tools that use that. Hmm... that reminds me of the little `u note` hack that I set up not so long ago:- #!/usr/bin/sh echo $* \(`u timenow`\) >> notes (u timenow just outputs a YYYYMMDD-HHmmSS string). I've been thinking about JavaScript and SVG: it's fairly well supported, and one can do great things with the combination if Jim Ley's stuff is anything to go by. But I haven't really got time to spend on this: I should just collect all my snippets of notes that I've made over the months, and dump them into www-archive. Anyway, here's findword.py. GPL 2: share and enjoy (ha). <findword.py>: [[ #!/usr/bin/python """ findword.py, by Sean B. Palmer find words, much like a search engine """ import sys, re # { 'word': [list of items] } scanWords = re.compile(r'[A-Za-z0-9]+').findall def build(filenames): mappings = {} for fn in filenames: s = open(fn).read() for word in scanWords(s): if mappings.has_key(word): mappings[word][fn] = 0 else: mappings[word] = {fn:0} return mappings def common(mappings, words): assert len(words) > 0 lengths = {} for word in words: length = len(mappings[word].keys()) if not lengths.has_key(length): lengths[length] = [word] else: lengths[length].append(word) lens = lengths.keys() lens.sort() swords = [] for x in lens: swords.extend(lengths[x]) candidates = {} first, rest = swords[0], swords[1:] for item in mappings[first].keys(): candidates[item] = 0 while 1: if len(rest) == 0: break first, rest = rest[0], rest[1:] for item in candidates.keys(): if not mappings[first].has_key(item): del candidates[item] return candidates.keys() def test(): x = { 'm': {'x':0, 'y':0, 'z':0, 'p':0}, 'n': {'p':0, 'q':0, 'r':0, 'x':0}, 'o': {'y':0, 'p':0, 's':0, 't':0, 'w':0, 'z':0} } print common(x, ['m', 'n']) print common(x, ['m', 'n', 'o']) print common(x, ['m']) print common(x, ['n', 'o']) def main(argv): if argv[0].endswith('make'): print `build(argv[1:])` elif argv[0].endswith('find'): mappings = eval(open(argv[1]).read()) words = argv[2:] print common(mappings, words) else: print __doc__ if __name__=="__main__": main(sys.argv[1:]) ]] -- Sean B. Palmer, <http://purl.org/net/sbp/> "phenomicity by the bucketful" - http://miscoranda.com/
Received on Thursday, 13 February 2003 10:34:19 UTC