How do I do it?
I tried sorting the keys:
keys = dictionary.keys()
keys.sort()
return map(dictionary.get, keys)
But it didn't work.
Here is an example of my code that uses this approach.
Every tutorial I find online says this is how I am supposed to sort a dictionary, but it doesn't work. :'(
import string

def read_book():
    f = open("alice_in_wonderland.txt", "r")
    word_list = []
    for l in f.readlines()[10:3340]:
        book_line = l.strip().translate(None, string.punctuation)
        word_freq = {}
        for w in book_line.split(" "):
            if w != "":
                word_list.append(w.lower())
        for w in word_list:
            word_freq[w] = word_freq.get(w, 0) + 1
        keys = dictionary.keys()
        keys.sort()
        return map(dictionary.get, keys)
Your logic is not good. Write out the algorithm first:
def read_book():
    create word frequency dictionary (only once, not in a loop)
    open the file
    for each line in the file:
        split the line into words
        for each word in the line:
            increment the word's frequency (we don't need a word list)
    sort the frequency dict items (again only once, not in a loop)
The implementation must follow the same logic and the same indentation scheme.
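As a rough illustration of that outline, here is a minimal sketch. It assumes Python 3 (the code in this thread is Python 2, where `translate(None, string.punctuation)` works), and `word_frequencies` is a hypothetical helper that takes any iterable of lines rather than opening the book file. Sorting by count is also an assumption, since the outline only says to sort the items.

```python
import string

def word_frequencies(lines):
    """Count words in an iterable of text lines (hypothetical helper)."""
    word_freq = {}  # created once, before any loop
    # Python 3 translation table that deletes punctuation characters.
    strip_punct = str.maketrans("", "", string.punctuation)
    for line in lines:
        # split() with no argument drops empty strings, so no "if w" test is needed
        for w in line.translate(strip_punct).lower().split():
            word_freq[w] = word_freq.get(w, 0) + 1
    # sorted once, after every line has been counted: most frequent first
    return sorted(word_freq.items(), key=lambda item: item[1], reverse=True)

sample = ["The cat sat.", "The dog sat, too!"]
print(word_frequencies(sample)[:2])  # [('the', 2), ('sat', 2)]
```

The indentation mirrors the outline: the dictionary and the sort sit outside the loops, and only the counting happens inside them.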
Following your instructions to the best of my abilities, here is my new code.
I just get a bunch of random numbers:
import string

def read_book():
    word_freq = {}
    f = open("alice_in_wonderland.txt", "r")
    for l in f.readlines()[10:3340]:
        book_line = l.strip().translate(None, string.punctuation)
        for w in book_line.split(" "):
            if w != "":
                word_freq[w] = word_freq.get(w, 0) + 1
    keys = word_freq.keys()
    keys.sort()
    return map(word_freq.get, keys)
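The "random numbers" are probably the counts themselves, listed in alphabetical order of the words they belong to, because mapping `get` over the sorted keys throws the words away. A small sketch with made-up counts (written for Python 3, not taken from the book) shows the effect:

```python
word_freq = {"the": 3, "alice": 2, "rabbit": 1}  # made-up counts for illustration

keys = sorted(word_freq)                    # ['alice', 'rabbit', 'the']
counts_only = [word_freq[k] for k in keys]  # same idea as map(word_freq.get, keys)
print(counts_only)                          # [2, 1, 3]

# Keeping the word next to its count makes the result readable
pairs = [(k, word_freq[k]) for k in keys]
print(pairs)                                # [('alice', 2), ('rabbit', 1), ('the', 3)]
```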
Here we sort with reverse frequency, and show ten most common:
import string

def read_book():
    word_freq = {}
    f = open("alice.txt", "r")
    for l in f.readlines()[10:3340]:
        book_line = l.strip().translate(None, string.punctuation).lower()
        for w in book_line.split(" "):
            if w != "":
                word_freq[w] = word_freq.get(w, 0) + 1
    return sorted(word_freq.items(), reverse=True, key=lambda x: x[1])

print read_book()[:10]
"""Output:
[('the', 1591), ('and', 827), ('to', 713), ('a', 624), ('she', 529), ('it', 526), ('of', 492), ('said', 462), ('i', 400), ('alice', 385)]
"""
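As an alternative to the hand-rolled dictionary above (not what was posted, just another option), the standard library's `collections.Counter` can do the counting and the "most common" step. This is a sketch assuming Python 3; `most_common_words` is a hypothetical name and it takes a list of lines instead of reading the file.

```python
from collections import Counter
import string

def most_common_words(lines, n=10):
    """Return the n most frequent lowercased words (hypothetical helper)."""
    strip_punct = str.maketrans("", "", string.punctuation)
    counts = Counter()
    for line in lines:
        # update() adds one to the count of each word in the list
        counts.update(line.translate(strip_punct).lower().split())
    return counts.most_common(n)  # list of (word, count), highest count first

print(most_common_words(["Alice was here.", "Alice said: here!", "Alice!"], 2))
# [('alice', 3), ('here', 2)]
```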
Thanks Tony.
This is my code using a modified version of your return sorted() statement.
import string

def read_book():
    word_freq = {}
    f = open("alice_in_wonderland.txt", "r")
    for l in f.readlines()[10:3340]:
        book_line = l.strip().translate(None, string.punctuation)
        for w in book_line.split(" "):
            if w != "":
                w.lower()
                word_freq[w] = word_freq.get(w, 0) + 1
    return sorted(word_freq.items(), key=lambda x: x[1])[-20:-1]
It works... sort of. Mine says "and" is the most frequently occurring word (774 times).
That's impossible since we are using the same book, but I think I know why.
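One thing that may contribute, judging only from the code posted: a slice that stops at `-1` leaves out the last item, and in an ascending sort the last item is the most frequent word. A tiny sketch with made-up data (Python 3):

```python
counts = [("a", 1), ("b", 2), ("c", 3), ("d", 4)]  # already sorted ascending by count

print(counts[-3:-1])     # [('b', 2), ('c', 3)]            stop index -1 excludes the last item
print(counts[-3:])       # [('b', 2), ('c', 3), ('d', 4)]  last three, ascending
print(counts[::-1][:3])  # [('d', 4), ('c', 3), ('b', 2)]  top three, descending
```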
Are you sure you do not need the .lower() at line 7? It does change the result.
Without lower:
[('the', 1476), ('and', 757), ('to', 709), ('a', 607), ('she', 490), ('it', 482), ('of', 477), ('said', 456), ('I', 400), ('Alice', 385)]
I put it in line 9 because that's right before I use w in word_freq. It made sense to me.
I think it worked.
Here is my final code:
import string

def read_book():
    word_freq = {}
    f = open("alice_in_wonderland.txt", "r")
    for l in f.readlines()[10:3340]:
        book_line = l.strip().translate(None, string.punctuation)
        for w in book_line.split(" "):
            if w != "":
                w.lower()
                word_freq[w] = word_freq.get(w, 0) + 1
    return sorted(word_freq.items(), key=lambda x: x[1])[::-1][0:20]
Oh, yes, you have w.lower() at line 10, but it does nothing because the result is not saved anywhere, so you still have a case-sensitive count.
I would also write line 9 as
if w:
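A short sketch of both points, using a made-up word: strings are immutable, so `lower()` returns a new string instead of changing `w`, and an empty string is falsy, so `if w:` does the same job as `if w != "":` here.

```python
w = "Alice"
w.lower()        # returns "alice", but the result is thrown away
print(w)         # Alice

w = w.lower()    # rebinding the name keeps the lowercased copy
print(w)         # alice

# Empty strings are falsy, non-empty strings are truthy
print(bool(""), bool("alice"))  # False True
```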
Ahh yes you are right.
I put .lower() in line 10:
For some reason I get more than you.
import string

def read_book():
    word_freq = {}
    f = open("alice_in_wonderland.txt", "r")
    for l in f.readlines()[10:3340]:
        book_line = l.strip().translate(None, string.punctuation)
        for w in book_line.split(" "):
            if w != "":
                word_freq[w.lower()] = word_freq.get(w.lower(), 0) + 1
    return sorted(word_freq.items(), key=lambda x: x[1])[::-1][0:20]
Output:
[('the', 1629), ('and', 844), ('to', 721), ('a', 627), ('she', 537), ('it', 526), ('of', 508), ('said', 462), ('i', 399), ('alice', 385), ('in', 365), ('you', 360), ('was', 357), ('that', 276), ('as', 262), ('her', 248), ('at', 209), ('on', 193), ('with', 180), ('all', 179)]
Your file must be missing some of the license text. If I remove all of it from mine and count every word in the whole file, I get the same frequencies as you.
You can't sort a dictionary itself, because it is in no particular order, but you can read from it in sorted order.
numbers = {3: "three", 1: "one", 2: "two"}  # named "numbers" to avoid shadowing the built-in dict
keys = [i for i in numbers]
keys.sort()
for i in keys:
    print(numbers[i])
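The same idea can be written with `sorted()`, which returns a new list and leaves the dictionary alone. A small sketch (Python 3, with its own made-up dictionary) that orders the items by key and then by value:

```python
numbers = {3: "three", 1: "one", 2: "two"}

# items() yields (key, value) pairs; tuples compare by their first element first
by_key = sorted(numbers.items())
print(by_key)    # [(1, 'one'), (2, 'two'), (3, 'three')]

# a key function picks what to compare: here the value, alphabetically
by_value = sorted(numbers.items(), key=lambda item: item[1])
print(by_value)  # [(1, 'one'), (3, 'three'), (2, 'two')]
```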
This thread is off the hook. Is this about sorting a dict, or about finding and printing some words and their locations and then ordering them? ;)
So is this solved?