One word anagrams by lookup

TrustyTony

As promised, here is a second implementation of one-word anagrams. It prepares a lookup table for all available words if one has not been generated yet, and afterwards uses it for fast lookup.

The implementation of the lookup table generation is quite unoptimized, but it only runs once per vocabulary.
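To make the idea concrete, here is a minimal sketch of the same lookup table built in a single pass. It is not the code from the snippet: the names `signature` and `build_table` and the tiny `sample` vocabulary are assumptions for illustration, and it is written so that it also runs on Python 3.

```python
from collections import defaultdict

# Characters ignored when computing a word's letter signature
# (the same set as `takeout` in the snippet).
TAKEOUT = " \t'-+\n\r"


def signature(word):
    """Return the sorted letters of `word`, ignoring TAKEOUT characters."""
    return ''.join(sorted(ch for ch in word if ch not in TAKEOUT))


def build_table(words):
    """Map each letter signature to the list of words sharing it."""
    table = defaultdict(list)
    for word in words:
        key = signature(word)
        if key:  # skip entries with no usable letters
            table[key].append(word)
    return table


# Tiny stand-in vocabulary (an assumption; the snippet reads list.txt).
sample = ["dew", "wed", "we'd", "item", "time", "emit", "mite"]
table = build_table(sample)
print(table[signature("meti")])
```

Every anagram of a word has the same signature, so a lookup is one dictionary access however large the vocabulary is.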

If you want to change the dictionary, add a dictionary selection routine in place of the fixed name used here, or just rename or delete the old dictionary and the anawords.txt and analist.txt files. Then put a copy of your dictionary in the same directory as this program under the name list.txt.
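One possible shape for such a selection routine is sketched below. All three helper names (`choose_dictionary`, `cache_name`, `cache_is_fresh`) and the `.anawords.txt` suffix are hypothetical. The snippet itself uses the fixed names list.txt, analist.txt and anawords.txt.

```python
import os


def choose_dictionary(argv, default='list.txt'):
    """Return the word list path given on the command line, else `default`."""
    return argv[1] if len(argv) > 1 else default


def cache_name(dictionary, suffix='.anawords.txt'):
    """Derive a per-dictionary cache name, e.g. list.txt -> list.anawords.txt."""
    base, _ext = os.path.splitext(dictionary)
    return base + suffix


def cache_is_fresh(dictionary, cache):
    """True if `cache` exists and is at least as new as `dictionary`."""
    return (os.path.isfile(cache)
            and os.path.getmtime(cache) >= os.path.getmtime(dictionary))


# Example with an explicit argument list instead of sys.argv.
dictionary = choose_dictionary(['anagram.py', 'english.txt'])
print('dictionary: %s, cache: %s' % (dictionary, cache_name(dictionary)))
```

With one cache file per dictionary and a freshness check, switching word lists would no longer require deleting the old files by hand.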

First run:

Dict prepared in 0.726 s
Dict saved for future
Preparations took 0.889 s
Output of later runs:
Saved dict loaded
Preparations took 0.132 s
To stop: enter empty line
Give word: meti
['emit', 'item', 'mite', 'time']
0 ms
Give word: ewd
['dew', 'wed', "we'd"]
0 ms
Give word: team
['mate', 'meat', 'tame', 'team']
0 ms
Give word: mocupret
['computer']
0 ms
Give word: nocpurte
Word is not in vocabulary
0 ms
Give word:
## my solution for all anagrams 
from time import clock
import os,sys

dictionary = 'list.txt'

takeout=' \t\'-+\n\r' ## deleted letters from words
## choosing first argument for translate
if sys.version_info >= (2, 6):
    table=None #python 2.6 or higher
else:
    print 'Old python'
    ## identity table of all 256 characters for older translate
    table=''
    for i in range(256): table+=chr(i)


def letters(a):
    let=''.join(sorted(a))
    let = let.translate(table,takeout)
    return let

def writeout(inp,out):
    words= [w.rstrip() for w in open(inp)]
    words=[letters(a)+' '+a for a in words]
    words.sort(key=len)
    open(out,'w').write('\n'.join(words))

def getanawords(aw='anawords.txt',al='analist.txt'):
    t=clock() ## local start time, so the function does not depend on a global t
    if not os.path.isfile(aw) and not os.path.isfile(al): writeout(inp=dictionary,out=al)
    
    anawords=dict()
    
    if not os.path.isfile(aw):
        ## each line of al is '<sorted letters> <word>'
        for l,w in [j.split() for j in open(al)]:
            if l in anawords: anawords[l].append(w)
            else: anawords[l]=[w]
        print "Dict prepared in %1.3f s" % (clock()-t)

        f=open(aw,'w')
        for i in sorted(anawords,key=len):
            f.write(i+' '+' '.join(anawords[i])+'\n')
        f.close() ## make sure the table is flushed to disk
        print "Dict saved for future"
    else:
        for i in open(aw):
            i=i.rstrip().split()
            anawords[i[0]]=i[1:]
        print "Saved dict loaded"
    return anawords
 
if __name__=="__main__":
    ## listing all words
    t=clock()

    anawords=getanawords()
    print "Preparations took %1.3f s" % (clock()-t)
               
    print "To stop: enter empty line"
    while True:
        i=raw_input('Give word: ')
        t=clock()
        i=letters(i)
        if i :
            if i in anawords: print anawords[i]
            else: print 'Word is not in vocabulary'
            print '%i ms' % ((clock()-t)*1000)
        else: break
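
The snippet above is Python 2. If you want to try it on Python 3, a few things differ: `str.translate` takes only a table, `time.clock` was removed in Python 3.8, and `raw_input` is called `input`. Below is a minimal sketch of the `letters` function and the timing under those assumptions. The name `delete_table` is mine, not from the snippet.

```python
from time import perf_counter

takeout = " \t'-+\n\r"  # same deleted characters as in the snippet
# The third argument of str.maketrans lists the characters to delete.
delete_table = str.maketrans('', '', takeout)


def letters(word):
    """Sorted letters of `word` with the `takeout` characters removed."""
    return ''.join(sorted(word.translate(delete_table)))


start = perf_counter()
key = letters("we'd")
elapsed_ms = (perf_counter() - start) * 1000
print('%s (%.3f ms)' % (key, elapsed_ms))
```

Deleting before sorting gives the same key as the original order (sort, then delete), because removing characters does not change the relative order of the ones that remain.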
TrustyTony

Posted to the Anagrams task on Rosetta Code:
http://rosettacode.org/wiki/Anagrams#.7B.7Bheader.7CPython.7D.7D

I also improved the previous version based on the Haskell algorithm:

import urllib, itertools
from time import clock
words = urllib.urlopen('http://www.puzzlers.org/pub/wordlists/unixdict.txt').read().split()
t=clock()
anagrams = [list(g) for k,g in itertools.groupby(sorted(words, key=sorted), key=sorted)]
print('List preparation time: %.3f ms' % ((clock()-t)*1000))
t=clock()
topten= sorted(anagrams, key=len)[-10:]
count = len(topten[-1]) ## last in sorted list is longest
print '\n'.join([', '.join(ana) for ana in topten if len(ana) == count])
print('Longest finding time: %.3f ms' % ((clock()-t)*1000))
raw_input('Ready') ## input() would try to evaluate the typed text in Python 2
"""List preparation time: 459.000 ms
abel, able, bale, bela, elba
caret, carte, cater, crate, trace
angel, angle, galen, glean, lange
alger, glare, lager, large, regal
elan, lane, lean, lena, neal
evil, levi, live, veil, vile
Longest finding time: 18.745 ms
Ready
"""
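For anyone who wants to see the grouping step without downloading the word list, here is a small sketch on an inline list. The `words` list is a made-up stand-in for unixdict.txt, and I use `max` to find the largest group size instead of the sort-and-slice in the post, so treat it as a variation, not the posted code.

```python
from itertools import groupby

# Small stand-in word list (an assumption; the post downloads unixdict.txt).
words = ["able", "bale", "evil", "live", "veil", "abel", "vile", "cat"]

# groupby only merges adjacent items, so sort by the same key first.
by_letters = sorted(words, key=sorted)
groups = [list(g) for _key, g in groupby(by_letters, key=sorted)]

# Keep only the groups that share the largest size.
longest = max(len(g) for g in groups)
largest = [g for g in groups if len(g) == longest]
for group in largest:
    print(', '.join(group))
```

Without the initial sort, "abel" would land in a separate group from "able" and "bale", because they are not next to each other in the input.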