Hi,

I have a problem where I need the data in a saved file to be separated by new lines, one entry per line. My code at present:

import re
import nltk
 
#subset
filename = 'subsetQuran.txt'
 
# create list of lower case words
word_list = re.split(r'\s+', open(filename).read().lower())
print 'Words in text:', len(word_list)
 
# create dictionary of word:frequency pairs
freq_dic = {}
# punctuation and numbers to be removed
punctuation = re.compile(r'[-.?!,":;()|0-9]') 
for word in word_list:
    # remove punctuation marks
    word = punctuation.sub("", word)
    # form dictionary
    try:
        freq_dic[word] += 1
    except KeyError:
        # first time this word is seen
        freq_dic[word] = 1
 
 
print '-'*30
 
print "sorted by highest frequency first:"
# create list of (val, key) tuple pairs
freq_list2 = [(val, key) for key, val in freq_dic.items()]
# sort by val or frequency
freq_list2.sort(reverse=True)
# display result
for freq, word in freq_list2:
    print word, freq
f = open("wordfreq.txt", "w")
f.write( str(freq_list2) )
f.close()

As it is, it saves the data to the text file like [(182, 'the'), (128, 'and'), (123, 'will')].
But I need it to save as:
the, 182
and, 128
will, 123

Any help is appreciated.
Thanks in advance.

To print in a formatted style, use the % formatting operator like this:
print "Item %d: %s" % (i, strings)

print('\n'.join('%s, %s' % (word, freq) for freq, word in [(182, 'the'), (128, 'and'), (123, 'will')]))

The newer str.format style is also quite useful:

print('\n'.join('{1}, {0}'.format(*t) for t in [(182, 'the'), (128, 'and'), (123, 'will')]))

nice one Jay :)
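
The replies above only print the formatted lines, so here is a rough sketch of how the same formatting could be written to the file instead. It assumes freq_list2 holds (frequency, word) tuples as in the question. The helper names format_freq_lines and save_freq_list are made up for illustration, and the sample data stands in for the real list. It should run under both Python 2 and Python 3.

```python
# Sample data in the same (frequency, word) shape as freq_list2 in the question.
freq_list2 = [(182, 'the'), (128, 'and'), (123, 'will')]


def format_freq_lines(pairs):
    """Return one 'word, frequency' string per (frequency, word) pair."""
    # Hypothetical helper: swaps the tuple order so the word comes first.
    return ['%s, %d' % (word, freq) for freq, word in pairs]


def save_freq_list(pairs, path):
    """Write the pairs to path, one 'word, frequency' entry per line."""
    # Hypothetical helper: the with-block closes the file automatically.
    with open(path, 'w') as f:
        for line in format_freq_lines(pairs):
            f.write(line + '\n')


save_freq_list(freq_list2, 'wordfreq.txt')
```

In the original script this would replace the f.write( str(freq_list2) ) call, which writes the repr of the whole list on a single line.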
