An approach to implementing K-Nearest Neighbor (KNN) using Euclidean distance.
Sample Usage:
mywork = Words_Works()
lit = 'literature.txt'
mywork.add_category(lit, 'Literature') # adding files as category
comp = 'computers.txt'
mywork.add_category(comp, 'Computers')
phy = 'physics.txt'
mywork.add_category(phy, 'Physics')
# saving categories dictionary to file
mywork.save_categories() # can be loaded calling load_categories()
print mywork.categories # prints categories dictionary
print
txts = ('sample1.txt', 'sample2.txt') # tuple of files to add
for text in txts:
    mywork.add_text(text) # adding files
print mywork.all_texts # prints each file's word occurrence counts
print
mywork.knn_calc() # perform calculation
print mywork.knn_results # print overall results
print
mywork.knn() # print per-file results
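The usage above does not show how the distance itself is computed. As a rough illustration only, here is a minimal sketch of nearest-neighbour classification over word counts with Euclidean distance. The helper names (word_counts, euclidean_distance, nearest_category) and the sample strings are hypothetical and are not taken from Words_Works. The sketch also simplifies to k = 1 with a single word-count vector per category, which may differ from what knn_calc() actually does.

```python
import math
from collections import Counter


def word_counts(text):
    """Count lowercase, whitespace-separated words (hypothetical helper)."""
    return Counter(text.lower().split())


def euclidean_distance(counts_a, counts_b):
    """Euclidean distance between two word-count mappings.

    Words missing from one mapping are treated as having a count of 0.
    """
    words = set(counts_a) | set(counts_b)
    return math.sqrt(sum((counts_a.get(w, 0) - counts_b.get(w, 0)) ** 2
                         for w in words))


def nearest_category(text_counts, categories):
    """Return the name of the category closest to text_counts (k = 1)."""
    return min(categories,
               key=lambda name: euclidean_distance(text_counts, categories[name]))


# Assumed toy data: one word-count vector per category.
categories = {
    'Literature': word_counts('novel poem novel author'),
    'Computers': word_counts('code compiler code memory'),
}
sample = word_counts('the compiler turns code into machine code')
print(nearest_category(sample, categories))  # prints Computers
```

Raw counts favour long documents, so a real implementation would likely normalise the counts (for example to relative frequencies) before measuring distance.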
Cheers and Happy coding!